[Solved] How to get quick summary in data.table with a look-back window?


I think I understand your request. You seem to care about the order in which the observations appear, even when, for instance, the second observation's Time is earlier than the first observation's Time. That doesn't make much sense to me, but here is a quite efficient data.table solution that achieves it. It does a non-equi self-join on ID, Col, both Time columns (Time and Time minus two days) and the row index (which is simply the order of appearance), summing Count for each match. Afterwards, it uses dcast to convert from long to wide (like in your previous question). Note that the result is ordered by ID and then by date, but I've kept the rowindx variable, so you can restore the original order with setorder. I'll leave the ratio calculation to you, as it is very basic (hint: don't use loops, it is a fully vectorized one-liner).
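To make the matching rule concrete, here is a naive row-by-row sketch of what is being computed. It is meant only as a reference for checking results, not as the efficient approach; the helper name `window_sum` and the vector `naive` are made up for illustration, and the data is the same toy table, built inline.

```r
library(data.table)

# Toy data, built inline so the sketch runs on its own
DT <- data.table(
  ID    = c("A", "A", "A", "B", "B"),
  Time  = as.IDate(c("2017-06-05", "2017-06-02", "2017-06-03",
                     "2017-06-02", "2017-06-01")),
  Col   = c("M", "M", "M", "K", "M"),
  Count = c(1L, 1L, 1L, 1L, 4L)
)

# Hypothetical helper: for row k, sum Count over rows that
#  - appear at or before row k,
#  - share its ID and Col,
#  - have Time within [Time_k - 2 days, Time_k]
window_sum <- function(k) {
  earlier <- seq_len(nrow(DT)) <= k
  same    <- DT$ID == DT$ID[k] & DT$Col == DT$Col[k]
  in_win  <- DT$Time <= DT$Time[k] & DT$Time >= DT$Time[k] - 2L
  sum(DT$Count[earlier & same & in_win])
}

naive <- vapply(seq_len(nrow(DT)), window_sum, integer(1))
naive
# [1] 1 1 2 1 4
```

This is quadratic in the number of rows, so it is only useful for verifying a small sample against the join.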

library(data.table) #v1.10.4+

## Read the data
DT <- fread("ID   Time        Col   Count 
A    2017-06-05   M      1
A    2017-06-02   M      1
A    2017-06-03   M      1
B    2017-06-02   K      1
B    2017-06-01   M      4")

## Prepare the variables we need for the join
DT[, Time := as.IDate(Time)]
DT[, Time_2D := Time - 2L]
DT[, rowindx := .I]

## Non-equi self-join, summing `Count` over the matches of each row (by = .EACHI)
DT2 <- DT[DT, 
          sum(Count), 
          on = .(ID, Col, rowindx <= rowindx, Time <= Time, Time >= Time_2D),
          by = .EACHI]

## Fix column names: the join result has two columns named `Time` (a known issue);
## make.unique renames the second one to `Time.1`
setnames(DT2, make.unique(names(DT2)))

## Long to wide (You can reorder back using `rowindx` and `setorder` function)
dcast(DT2, ID + Time + Time.1 + rowindx ~ Col, value.var = "V1", fill = 0)
#    ID       Time     Time.1 rowindx K M
# 1:  A 2017-06-02 2017-05-31       2 0 1
# 2:  A 2017-06-03 2017-06-01       3 0 2
# 3:  A 2017-06-05 2017-06-03       1 0 1
# 4:  B 2017-06-01 2017-05-30       5 0 4
# 5:  B 2017-06-02 2017-05-31       4 1 0
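Restoring the original row order, as mentioned above, could look like the following sketch. It repeats the pipeline so that it runs on its own; the name `wide` for the stored dcast result is my own choice, and I am assuming the join still returns two `Time` columns as it does in the output shown.

```r
library(data.table)

# Same toy data, built inline
DT <- data.table(
  ID    = c("A", "A", "A", "B", "B"),
  Time  = as.IDate(c("2017-06-05", "2017-06-02", "2017-06-03",
                     "2017-06-02", "2017-06-01")),
  Col   = c("M", "M", "M", "K", "M"),
  Count = c(1L, 1L, 1L, 1L, 4L)
)
DT[, Time_2D := Time - 2L]
DT[, rowindx := .I]

# Non-equi self-join and column-name fix, as in the answer
DT2 <- DT[DT,
          sum(Count),
          on = .(ID, Col, rowindx <= rowindx, Time <= Time, Time >= Time_2D),
          by = .EACHI]
setnames(DT2, make.unique(names(DT2)))

# Store the wide result this time instead of just printing it
wide <- dcast(DT2, ID + Time + Time.1 + rowindx ~ Col,
              value.var = "V1", fill = 0)

# Reorder in place, back to the order of appearance
setorder(wide, rowindx)
wide
```

`setorder` sorts by reference, so no reassignment is needed.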
