Truncating milliseconds from a DatetimeIndex

  • IMCoins  · 6 years ago

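    When I use pandas.date_range(), I sometimes get timestamps with a long fractional-second part (milliseconds and below) that I do not want to keep.

    Suppose I do this: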

    >>> import pandas as pd
    >>> dr = pd.date_range('2011-01-01', '2011-01-03', periods=15)
    >>> dr
    DatetimeIndex([          '2011-01-01 00:00:00',
                   '2011-01-01 03:25:42.857142784',
                   '2011-01-01 06:51:25.714285824',
                   '2011-01-01 10:17:08.571428608',
                   '2011-01-01 13:42:51.428571392',
                   '2011-01-01 17:08:34.285714176',
                   '2011-01-01 20:34:17.142857216',
                             '2011-01-02 00:00:00',
                   '2011-01-02 03:25:42.857142784',
                   '2011-01-02 06:51:25.714285824',
                   '2011-01-02 10:17:08.571428608',
                   '2011-01-02 13:42:51.428571392',
                   '2011-01-02 17:08:34.285714176',
                   '2011-01-02 20:34:17.142857216',
                             '2011-01-03 00:00:00'],
                  dtype='datetime64[ns]', freq=None)
    

    To drop the fractional seconds, I ended up doing this:

    >>> t = []
    >>> for item in dr:
    ...     idx = str(item).find('.')
    ...     if idx != -1:
    ...         item = str(item)[:idx]
    ...     t.append(pd.to_datetime(item))
    ...
    >>> t
    [Timestamp('2011-01-01 00:00:00'), 
     Timestamp('2011-01-01 03:25:42'), 
     Timestamp('2011-01-01 06:51:25'), 
     Timestamp('2011-01-01 10:17:08'), 
     Timestamp('2011-01-01 13:42:51'), 
     Timestamp('2011-01-01 17:08:34'), 
     Timestamp('2011-01-01 20:34:17'), 
     Timestamp('2011-01-02 00:00:00'), 
     Timestamp('2011-01-02 03:25:42'), 
     Timestamp('2011-01-02 06:51:25'), 
     Timestamp('2011-01-02 10:17:08'), 
     Timestamp('2011-01-02 13:42:51'), 
     Timestamp('2011-01-02 17:08:34'), 
     Timestamp('2011-01-02 20:34:17'), 
     Timestamp('2011-01-03 00:00:00')]
    

    Is there a better way? I have already tried the following.

    1. dr = [ pd.to_datetime(item, format='%Y-%m-%d %H:%M:%S') for item in dr ]

    But it has no effect.

    2. (pd.date_range('2011-01-01', '2011-01-03', periods=15)).astype('datetime64[s]')

    But it reports that it cannot cast.

    3. dr = (dr.to_series()).apply(lambda x: x.replace(microsecond=0))

    But this does not solve my problem either, because the nanoseconds are left behind:

    2018-04-17 15:07:04.777777664 gives --> 2018-04-17 15:07:04.000000664
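
    The leftover 664 in the third attempt is the nanosecond field, which pandas stores separately from microseconds. As a minimal sketch (not part of the original post), zeroing both fields with Timestamp.replace removes the whole fractional part for a single timestamp; the variable names here are illustrative only.

```python
import pandas as pd

ts = pd.Timestamp('2018-04-17 15:07:04.777777664')

# Zeroing only the microsecond field leaves the trailing 664 nanoseconds.
partial = ts.replace(microsecond=0)

# Zeroing both fields removes everything below one second.
clean = ts.replace(microsecond=0, nanosecond=0)

print(partial.nanosecond)  # nanoseconds that survived the first replace
print(clean)
```

    This works element by element, so on a whole index it is likely to be slower than a vectorized method.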
    
    1 answer  |  last activity 5 years ago
  •   jezrael    6 years ago

    I believe you need DatetimeIndex.floor:

    print(dr.floor('s'))
    DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 03:25:42',
                   '2011-01-01 06:51:25', '2011-01-01 10:17:08',
                   '2011-01-01 13:42:51', '2011-01-01 17:08:34',
                   '2011-01-01 20:34:17', '2011-01-02 00:00:00',
                   '2011-01-02 03:25:42', '2011-01-02 06:51:25',
                   '2011-01-02 10:17:08', '2011-01-02 13:42:51',
                   '2011-01-02 17:08:34', '2011-01-02 20:34:17',
                   '2011-01-03 00:00:00'],
                  dtype='datetime64[ns]', freq=None)
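
    A short sketch of how floor behaves, using two sample values from the question rather than the full index. It assumes the lowercase 's' alias for seconds (newer pandas versions deprecate the uppercase 'S'), and it contrasts floor with round, which moves to the nearest second instead of truncating. The Series variant through the .dt accessor is included in case the timestamps live in a DataFrame column.

```python
import pandas as pd

# Two sample timestamps from the question, both with nanosecond precision.
idx = pd.DatetimeIndex(['2011-01-01 03:25:42.857142784',
                        '2018-04-17 15:07:04.777777664'])

# floor() truncates each value down to the start of its second.
floored = idx.floor('s')

# round() goes to the nearest second, so .857 and .777 both round up.
rounded = idx.round('s')

# For a Series (for example a DataFrame column), use the .dt accessor.
ser_floored = pd.Series(idx).dt.floor('s')

print(floored)
print(rounded)
```

    floor is usually what "truncate" means here, since round can push a value into the next second.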