
Inserting NA in time gaps

  •  1
  • Vint  · 6 years ago

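    I have a time series data set that I am trying to plot, but the series has large gaps in the data. When I plot it, R draws a straight line across those gaps, and I would prefer that the final plot not draw anything across them. The only workaround I know of is to manually insert a row of NA into the data set inside each gap. To do this, I wrote a function that loops through the data frame. The function works, but it is very slow.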

    # Insert an NA row into long gaps so that no line is drawn across them
    PlotSpace <- function(DF) {
      NROW <- nrow(DF)
      for (t in seq_len(NROW)) {
        # Hours between this row and the next one (NA on the last row)
        TimeDiff <- difftime(DF$TimeStamp[t + 1], DF$TimeStamp[t], units = "hours")
        DF[t, "TimeDiff"] <- TimeDiff
    
        if (!is.na(TimeDiff) && TimeDiff > 5) {
          NewTimeStamp <- DF$TimeStamp[t] + 1
          NewProfStamp <- DF$ProfStamp[t] + 1
          print(NewTimeStamp)
          DF <- rbind(DF, NA)  # append an all-NA row, then fill in its stamps
          DF[nrow(DF), "TimeStamp"] <- NewTimeStamp
          DF[nrow(DF), "ProfStamp"] <- NewProfStamp
        }
      }
    
      # Sort so that the appended rows land inside their gaps
      DF <- DF[order(DF$TimeStamp), ]
      DF <- DF[-1, ]
      DF$TimeStamp <- as.POSIXct(DF$TimeStamp)
    
      return(DF)
    }
    

    Is there a more efficient way to do this in R?

    Sample data:

     TimeStamp<-c("2015-05-01 10:00:00","2015-05-01 10:05:00","2015-05-01 10:10:00","2015-05-01 10:15:00",
    "2015-05-01 10:20:00","2015-05-01 15:00:00","2015-05-01 15:05:00","2015-05-01 15:10:00"
    ,"2015-05-01 15:20:00","2015-05-01 15:30:00","2015-05-01 15:35:00")
    
    Data<-c(1,2,3,4,5,3,7,8,9,2,11)
    
    DF<-data.frame(TimeStamp, Data)
    
    DF$TimeStamp<-as.POSIXct(DF$TimeStamp)
    
    plot(DF$TimeStamp, DF$Data, type='l')
    

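    As you can see, the plot above draws a line straight across the multi-hour data gap (10:20 to 15:00, about 4.7 hours). I would like to insert an NA wherever the gap between consecutive timestamps is greater than 2 hours, i.e.: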

                 TimeStamp Data
    1  2015-05-01 10:00:00    1
    2  2015-05-01 10:05:00    2
    3  2015-05-01 10:10:00    3
    4  2015-05-01 10:15:00    4
    5  2015-05-01 10:20:00    5
       2015-05-01 10:21:00    NA
    6  2015-05-01 15:00:00    3
    7  2015-05-01 15:05:00    7
    8  2015-05-01 15:10:00    8
    9  2015-05-01 15:20:00    9
    10 2015-05-01 15:30:00    2
    11 2015-05-01 15:35:00   11
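Before inserting anything, it can help to confirm where the long gaps actually are. The following is a minimal sketch, not part of the original post: it rebuilds the sample timestamps (the `tz = "UTC"` argument and the names `gap_hours` and `long_gaps` are assumptions added for illustration) and measures each gap in hours with explicit units.

```r
# Sample timestamps from the question; tz = "UTC" is an assumption added
# here so the result does not depend on the local time zone
TimeStamp <- as.POSIXct(c(
  "2015-05-01 10:00:00", "2015-05-01 10:05:00", "2015-05-01 10:10:00",
  "2015-05-01 10:15:00", "2015-05-01 10:20:00", "2015-05-01 15:00:00",
  "2015-05-01 15:05:00", "2015-05-01 15:10:00", "2015-05-01 15:20:00",
  "2015-05-01 15:30:00", "2015-05-01 15:35:00"
), tz = "UTC")

# Gap between each observation and the next, in hours.
# Passing units explicitly avoids difftime's automatic unit selection.
gap_hours <- as.numeric(
  difftime(TimeStamp[-1], TimeStamp[-length(TimeStamp)], units = "hours")
)

# Positions of observations followed by a gap longer than 2 hours
long_gaps <- which(gap_hours > 2)
print(long_gaps)
print(gap_hours[long_gaps])
```

With this sample, the only observation followed by a gap of more than 2 hours is row 5 (10:20), which is where the NA row in the desired output sits.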
    
    1 answer  |  6 years ago
  •   IceCreamToucan    6 years ago
    library(data.table)
    setDT(DF)
    
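    # Flag each row that is followed by a gap of more than 2 hours (the last row never is).
    # as.numeric() gives seconds, so the test does not depend on difftime's auto-chosen units.
    DF[, gap := c(diff(as.numeric(TimeStamp)) > 2*60*60, FALSE)]
    # If there's a gap, add a new row one minute later; fill = TRUE leaves Data as NA
    DF[, if(gap) rbind(.SD, .(TimeStamp = TimeStamp + 60), fill = TRUE)
         else .SD
       , by = 1:nrow(DF)
       ][, -'gap']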
    #               TimeStamp Data
    #  1: 2015-05-01 10:00:00    1
    #  2: 2015-05-01 10:05:00    2
    #  3: 2015-05-01 10:10:00    3
    #  4: 2015-05-01 10:15:00    4
    #  5: 2015-05-01 10:20:00    5
    #  6: 2015-05-01 10:21:00   NA
    #  7: 2015-05-01 15:00:00    3
    #  8: 2015-05-01 15:05:00    7
    #  9: 2015-05-01 15:10:00    8
    # 10: 2015-05-01 15:20:00    9
    # 11: 2015-05-01 15:30:00    2
    # 12: 2015-05-01 15:35:00   11
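For readers who would rather avoid an extra package, the same idea can be sketched in base R without a row-by-row loop. This is only an illustration and not from the thread: `insert_gap_rows`, its parameters `time_col` and `max_gap_secs`, the result name `DF_gaps`, and `tz = "UTC"` are hypothetical names and choices. It assumes the data frame is already sorted by timestamp.

```r
# Hypothetical helper (not from the original thread): add one all-NA row,
# stamped 60 seconds after the last observation, inside every gap longer
# than max_gap_secs. Assumes df is sorted by the timestamp column.
insert_gap_rows <- function(df, time_col = "TimeStamp",
                            max_gap_secs = 2 * 60 * 60) {
  times <- df[[time_col]]
  # Seconds to the next observation; as.numeric() sidesteps difftime units
  gap_after <- which(diff(as.numeric(times)) > max_gap_secs)
  if (length(gap_after) == 0) return(df)

  # Build all filler rows at once: indexing rows with NA yields rows whose
  # columns are NA but keep their original types
  filler <- df[rep(NA_integer_, length(gap_after)), , drop = FALSE]
  filler[[time_col]] <- times[gap_after] + 60

  # Append the fillers and sort them into place
  out <- rbind(df, filler)
  out <- out[order(out[[time_col]]), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Sample data from the question (tz = "UTC" is an assumption added here)
DF <- data.frame(
  TimeStamp = as.POSIXct(c(
    "2015-05-01 10:00:00", "2015-05-01 10:05:00", "2015-05-01 10:10:00",
    "2015-05-01 10:15:00", "2015-05-01 10:20:00", "2015-05-01 15:00:00",
    "2015-05-01 15:05:00", "2015-05-01 15:10:00", "2015-05-01 15:20:00",
    "2015-05-01 15:30:00", "2015-05-01 15:35:00"
  ), tz = "UTC"),
  Data = c(1, 2, 3, 4, 5, 3, 7, 8, 9, 2, 11)
)

DF_gaps <- insert_gap_rows(DF)
print(DF_gaps)
```

Plotting the result with `plot(DF_gaps$TimeStamp, DF_gaps$Data, type = "l")` should then leave the gap empty, since base graphics break a line at NA values. Building every filler row in one step and calling `rbind` once avoids regrowing the data frame on each iteration, which is likely the main cost of the loop version.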