
Using BigQuery's approximate quantiles efficiently

  •  0
  • Guig  · Tech community  · 6 years ago

    Right now, if I want to get the deciles of some numeric value, I write:

    SELECT
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(10)] as p10,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(20)] as p20,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(30)] as p30,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(40)] as p40,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(50)] as p50,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(60)] as p60,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(70)] as p70,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(80)] as p80,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(90)] as p90,
        APPROX_QUANTILES(value, 100)[SAFE_ORDINAL(100)] as p100
    FROM table
    

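    I want to make sure this does not do 10 times the work of a single quantile computation, and I'd like to know whether there is a more compact way to write it.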

    1 answer  |  as of 6 years ago
        1
  •  2
  •   Elliott Brossard    6 years ago

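    If you run the query and check the execution plan, you will see that BigQuery computes the quantiles only once, then extracts the individual array elements in a second step. You don't need to worry about deduplicating the APPROX_QUANTILES aggregation yourself.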

    Sample execution plan
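
On the "more compact" part of the question, one possible approach is to compute the array once in a subquery and index into it. This is only a sketch: `my_dataset.my_table` is a placeholder table name, and it assumes ten buckets are precise enough for the use case. Note that APPROX_QUANTILES(value, n) returns n + 1 elements with the approximate minimum first, so the sketch indexes with OFFSET; under the same rule, SAFE_ORDINAL(10) on the 100-bucket array in the question selects the element at offset 9 rather than the 10th percentile.

```sql
-- Sketch only: `my_dataset.my_table` and the column `value` are placeholders.
WITH q AS (
  -- 10 buckets yield 11 elements: approximate min, 10%, 20%, ..., 90%, max.
  SELECT APPROX_QUANTILES(value, 10) AS deciles
  FROM `my_dataset.my_table`
)
SELECT
  deciles[OFFSET(1)]  AS p10,
  deciles[OFFSET(2)]  AS p20,
  deciles[OFFSET(3)]  AS p30,
  deciles[OFFSET(4)]  AS p40,
  deciles[OFFSET(5)]  AS p50,
  deciles[OFFSET(6)]  AS p60,
  deciles[OFFSET(7)]  AS p70,
  deciles[OFFSET(8)]  AS p80,
  deciles[OFFSET(9)]  AS p90,
  deciles[OFFSET(10)] AS p100  -- last element is the approximate maximum
FROM q;

-- Alternative: one row per decile instead of one column per decile.
SELECT
  idx * 10 AS percentile,
  decile_value
FROM (
  SELECT APPROX_QUANTILES(value, 10) AS deciles
  FROM `my_dataset.my_table`
),
UNNEST(deciles) AS decile_value WITH OFFSET AS idx
ORDER BY idx;
```

The row-per-decile form is shorter and easier to chart, while the column form matches the output shape of the original query.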
