How do I prepare my data to avoid "cannot infer frequency"?

gluonts time-series numpy pandas python

serlingpa · 1 year ago

The following code from pandas/tseries/frequencies.py is causing my code to crash:

if not self.is_monotonic or not self.index._is_unique:
    return None

delta = self.deltas[0]
ppd = periods_per_day(self._creso)
if delta and _is_multiple(delta, ppd):
    return self._infer_daily_rule()

# Business hourly, maybe. 17: one day / 65: one weekend
if self.hour_deltas in ([1, 17], [1, 65], [1, 17, 65]):
    return "BH"

# Possibly intraday frequency.  Here we use the
# original .asi8 values as the modified values
# will not work around DST transitions.  See #8772
if not self.is_unique_asi8:
    return None

The first check, self.index._is_unique, passes. The second check, not self.is_unique_asi8, triggers, and the function returns None.
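
To illustrate what that second check means: as far as I can tell, is_unique_asi8 is only true when every gap between consecutive timestamps is identical. Below is a minimal sketch, using made-up epoch values rather than real Kraken data, showing pd.infer_freq returning a frequency for evenly spaced timestamps and None for irregular ones:

```python
import pandas as pd

# Evenly spaced timestamps: every gap is exactly one minute.
regular = pd.date_range("2023-01-01", periods=5, freq="1min", tz="UTC")

# Irregular, trade-like timestamps (made-up epoch seconds): gaps of 1s, 4s and 2s.
irregular = pd.to_datetime(
    [1672531200, 1672531201, 1672531205, 1672531207], unit="s", utc=True
)

regular_freq = pd.infer_freq(regular)
irregular_freq = pd.infer_freq(irregular)

# The alias printed for one minute depends on the pandas version.
print(regular_freq)
# No single frequency fits uneven gaps, so nothing can be inferred.
print(irregular_freq)
```

Note that the irregular timestamps here are all unique and sorted, so uniqueness alone does not make a frequency inferable.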

I have looked at this issue and the corresponding PR, but…

My code, in its current form, looks like this:

db = Database()
df, last_trade_time = db.fetch_trades()

# Convert the time column to a datetime object with the unit of seconds
df['time'] = pd.to_datetime(df['time'], unit='s')

# Localize the timestamps to UTC
df['time'] = df['time'].dt.tz_localize('UTC')

# Ensure uniqueness by adding the index as nanoseconds
df['time'] = df['time'] + pd.to_timedelta(df.index, unit='ns')

# Set DataFrame index
df.set_index('time', inplace=True)

dataset = PandasDataset(df, target="price")

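The time values are in seconds, with sub-nanosecond precision (they come from Kraken).

How should I prepare my data? I only have about a month of Python experience...

I also asked this question in a different form here.

1 answer · 1 year ago

Phoenix · 1 year ago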

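The code does not seem to be able to detect the frequency of your data, so you can set the frequency explicitly instead of relying on inference. Since your data is in seconds, specify the frequency as "S" (seconds) or "L" (milliseconds), depending on its precision. Note that DataFrame.set_index() has no freq parameter, so pass freq to PandasDataset instead. GluonTS may still reject an index whose timestamps are not evenly spaced. Try:

import pandas as pd
from gluonts.dataset.pandas import PandasDataset

db = Database()
df, last_trade_time = db.fetch_trades()

# Convert the time column to a datetime object with the unit of seconds
df['time'] = pd.to_datetime(df['time'], unit='s')

# Localize the timestamps to UTC
df['time'] = df['time'].dt.tz_localize('UTC')

# Set the DataFrame index (set_index itself does not accept a freq argument)
df.set_index('time', inplace=True)

# Pass the frequency explicitly instead of letting it be inferred
dataset = PandasDataset(df, target="price", freq="S")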
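
If the trades arrive at irregular times, declaring a frequency may not be enough, because the index itself still has uneven gaps. One possible approach, sketched here under the assumption that one-second bars holding the last traded price are acceptable for your model, is to resample onto a regular grid first. The sample values are made up and stand in for the output of db.fetch_trades():

```python
import pandas as pd

# Made-up trades standing in for db.fetch_trades(); times are epoch seconds.
df = pd.DataFrame(
    {
        "time": [1672531200.10, 1672531200.70, 1672531201.30, 1672531204.90],
        "price": [100.0, 101.0, 102.0, 103.0],
    }
)

# Parse the epoch seconds directly as UTC timestamps.
df["time"] = pd.to_datetime(df["time"], unit="s", utc=True)

# Bucket trades into one-second bins, keep the last price in each bin,
# and carry the previous price forward through seconds with no trades.
regular = (
    df.set_index("time")["price"]
    .resample("1s")
    .last()
    .ffill()
    .to_frame()
)

# The index is now evenly spaced, so a frequency can be inferred.
print(pd.infer_freq(regular.index))
print(regular)
```

The resulting frame has a regular index, so it could then be passed to PandasDataset(regular, target="price") as in the question. Whether last-price bars and forward-filling suit your forecasting task is a modelling choice, not something pandas requires.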
