看起来一种方法是将“原始”数据作为字符串获取,然后使用
JSON functions
在消费者物化视角下。
WITH '{"ts": 1598033988, "deviceId": "cf060111-dbe6-4aa8-a2d0-d5aa17f45663", "location": [39.920515, 32.853708], "stats": { "temp": 71.2, "total_memory": 32, "used_memory": 21.2 }}' AS raw
SELECT
JSONExtractUInt(raw, 'ts') AS ts,
JSONExtractString(raw, 'deviceId') AS deviceId,
arrayMap(x -> toFloat32(x), JSONExtractArrayRaw(raw, 'location')) AS location,
JSONExtract(raw, 'stats', 'Tuple(temp Float64, total_memory Float64, used_memory Float64)') AS stats,
stats.1 AS temp,
stats.2 AS total_memory,
stats.3 AS used_memory;
备注:对于带浮点数的数字,应使用类型
浮点数64
不
浮子32
(见相关
CH Issue 13962
).
使用标准数据类型需要更改JSON的模式:
-
代表
统计数据
作为
Tuple
CREATE TABLE test_tuple_field
(
ts Int64,
deviceId String,
location Array(Float32),
stats Tuple(Float32, Float32, Float32)
) ENGINE = MergeTree()
ORDER BY ts;
INSERT INTO test_tuple_field FORMAT JSONEachRow
{ "ts": 1598033988, "deviceId": "cf060111-dbe6-4aa8-a2d0-d5aa17f45663", "location": [39.920515, 32.853708], "stats": [71.2, 32, 21.2]};
-
代表
统计数据
作为
Nested Structure
CREATE TABLE test_nested_field
(
ts Int64,
deviceId String,
location Array(Float32),
stats Nested (temp Float32, total_memory Float32, used_memory Float32)
) ENGINE = MergeTree()
ORDER BY ts;
SET input_format_import_nested_json=1;
INSERT INTO test_nested_field FORMAT JSONEachRow
{ "ts": 1598033988, "deviceId": "cf060111-dbe6-4aa8-a2d0-d5aa17f45663", "location": [39.920515, 32.853708], "stats": { "temp": [71.2], "total_memory": [32], "used_memory": [21.2] }};
查看相关答案
ClickHouse JSON parse exception: Cannot parse input: expected ',' before
.