您可以使用
rle()
关于每行的总和。
x <- structure(
c(10L, 10L, 11L, 12L, 10L, 10L, 10L, 11L, 11L, 12L, 13L, 11L, 11L, 11L),
.Dim = c(7L, 2L), .Dimnames = list(NULL, c("Bid", "Ask")),
index = structure(1:7, tzone = "", tclass = c("POSIXct", "POSIXt")),
.indexCLASS = c("POSIXct", "POSIXt"), .indexTZ = "",
tclass = c("POSIXct", "POSIXt"), tzone = "", class = c("xts", "zoo"))
r <- rle(rowSums(x))
如果你想要每组中的最后一个观察结果,你可以使用
cumsum(r$lengths)
R> x[cumsum(r$lengths),]
Bid Ask
1969-12-31 18:00:02 10 11
1969-12-31 18:00:03 11 12
1969-12-31 18:00:04 12 13
1969-12-31 18:00:07 10 11
由于您希望对每组进行第一次观察,因此需要预先准备
r$lengths
1
(您总是需要第一次观察),然后删除
r$长度
. 然后打电话
cumsum()
关于结果。
R> x[cumsum(c(1, head(r$lengths, -1))),]
Bid Ask
1969-12-31 18:00:01 10 11
1969-12-31 18:00:03 11 12
1969-12-31 18:00:04 12 13
1969-12-31 18:00:05 10 11
很好地抓住了
rowSums()
. 稳健的解决方案是
diff()
bids和ASK并选择其中一行不为零的行。
d <- diff(x) != 0 # rows with price changes
d[1,] <- TRUE # always select first observation
g <- cumsum(d$Bid | d$Ask) # groups of repeats
r <- rle(as.numeric(g)) # run length encoding on groups
# now use the solution above
x[cumsum(c(1, head(r$lengths, -1))),]