google的BigQuery存储了时间序列交易历史记录。
# Transaction history scheme
exchange_id INTEGER REQUIRED
from_id INTEGER REQUIRED
to_id INTEGER REQUIRED
price FLOAT REQUIRED
size FLOAT REQUIRED
ts TIMESTAMP REQUIRED
is_sell BOOLEAN NULLABLE
_PARTITIONTIME TIMESTAMP NULLABLE
exchange\u id
-发生交易的平台
from\u id
-基本符号
to\U id
-报价符号
价格
-交易价格,交易价格
大小
-交易数量
我需要聚合
OHLC
超过30秒时间间隔的数据按分组
exchange_id, from_id, to_id
. 如何在BigQuery中执行此操作?
# Required OHLC aggregated data scheme
ts TIMESTAMP REQUIRED
exchange_id INTEGER REQUIRED
from_id INTEGER REQUIRED
to_id INTEGER REQUIRED
open FLOAT REQUIRED
high FLOAT REQUIRED
low FLOAT REQUIRED
close FLOAT REQUIRED
volume FLOAT REQUIRED
_PARTITIONTIME TIMESTAMP NULLABLE
打开
-区间首价
高的
-最高价格。。
低的
-最低价格。。
关
-最后价格。。
体积
-当前时间间隔内所有交易规模的总和
最有希望的想法是:
SELECT
TIMESTAMP_SECONDS(
UNIX_SECONDS(ts) -
60 * 1000000
) AS time,
exchange_id,
from_id,
to_id,
MIN(price) as low,
MAX(price) as high,
SUM(size) as volume
FROM
`table`
GROUP BY
time, exchange_id, from_id, to_id
ORDER BY
time
还有这个:
SELECT
exchange_id,from_id,to_id,
MAX(price) OVER (PARTITION BY exchange_id,from_id,to_id ORDER BY ts RANGE BETWEEN 60 * 1000000 PRECEDING AND CURRENT ROW) as high,
MIN(price) OVER (PARTITION BY exchange_id,from_id,to_id ORDER BY ts RANGE BETWEEN 60 * 1000000 PRECEDING AND CURRENT ROW) as low,
SUM(size) OVER (PARTITION BY exchange_id,from_id,to_id ORDER BY ts RANGE BETWEEN 60 * 1000000 PRECEDING AND CURRENT ROW) as volume,
FROM [table];
# returns:
1 1 4445 3808 9.0E-8 9.0E-8 300000.0
2 1 4445 3808 9.0E-8 9.0E-8 300000.0
3 1 4445 3808 9.0E-8 9.0E-8 300000.0
...
14 1 4445 3808 9.0E-8 9.0E-8 865939.3721800799
15 1 4445 3808 9.0E-8 9.0E-8 865939.3721800799
16 1 4445 3808 9.0E-8 9.0E-8 865939.3721800799
但这些都不管用。似乎我遗漏了BigQuery中滑动窗口的一些重要内容。