代码之家  ›  专栏  ›  技术社区  ›  user3142655

BigQuery:标量子查询生成多个自定义维度

  •  0
  • user3142655  · 技术社区  · 7 年前

    我试图在我的一个并集中获得一个自定义维度,但我遇到了由多个元素生成的标量子查询的问题。我相信问题就在这一准则之内。我正在尝试迁移到标准SQL,所以请用标准SQL给出答案。

    SELECT
          d.value
    FROM
          UNNEST(hits) AS hits,
          UNNEST(hits.customDimensions) AS d
    WHERE
          d.index = 65) AS viewID,
    

    查询的整体示例

    #standardSQL
    SELECT
      date,
      channelGrouping,
      viewID,
      SUM(Revenue) Revenue,
      SUM(Shipping) Shipping,
      SUM(bounces) bounces,
      SUM(transactions) transactions,
      COUNT(date) sessions
    FROM (
      SELECT
        date,
        channelGrouping,
        'XXXXXXXXX' AS viewID,
        totals.totaltransactionrevenue / 1e6 Revenue,
        (
        SELECT
          SUM(hits.transaction.transactionshipping) / 1e6
        FROM
          UNNEST(hits) hits) Shipping,
        totals.bounces bounces,
        totals.transactions transactions
      FROM
        `XXXXXXXXX.ga_sessions_*`
      WHERE
        _TABLE_SUFFIX BETWEEN '20170625'
        AND '20170703'
      UNION ALL
      SELECT
        date,
        channelGrouping,
        'XXXXXXXXX' AS viewID,
        totals.totaltransactionrevenue / 1e6 Revenue,
        (
        SELECT
          SUM(hits.transaction.transactionshipping) / 1e6
        FROM
          UNNEST(hits) hits) Shipping,
        totals.bounces bounces,
        totals.transactions transactions
      FROM
        `XXXXXXXXX.ga_sessions_*`
      WHERE
        _TABLE_SUFFIX BETWEEN '20170625'
        AND '20170703'
      UNION ALL
      SELECT
        date,
        channelGrouping,
        (
        SELECT
          d.value
        FROM
          UNNEST(hits) AS hits,
          UNNEST(hits.customDimensions) AS d
        WHERE
          d.index = 65) AS viewID,
        totals.totaltransactionrevenue / 1e6 Revenue,
        (
        SELECT
          SUM(hits.transaction.transactionshipping) / 1e6
        FROM
          UNNEST(hits) hits) Shipping,
        totals.bounces bounces,
        totals.transactions transactions
      FROM
        `XXXXXXXXX.ga_sessions_*`
      WHERE
        _TABLE_SUFFIX BETWEEN '20170625'
        AND '20170703'
      UNION ALL
      SELECT
        date,
        channelGrouping,
        'XXXXXXXXX' AS viewID,
        totals.totaltransactionrevenue / 1e6 Revenue,
        (
        SELECT
          SUM(hits.transaction.transactionshipping) / 1e6
        FROM
          UNNEST(hits) hits) Shipping,
        totals.bounces bounces,
        totals.transactions transactions
      FROM
        `XXXXXXXXX.ga_sessions_*`
      WHERE
        _TABLE_SUFFIX BETWEEN '20170625'
        AND '20170703'
      UNION ALL
      SELECT
        date,
        channelGrouping,
        'XXXXXXXXX' AS viewID,
        totals.totaltransactionrevenue / 1e6 Revenue,
        (
        SELECT
          SUM(hits.transaction.transactionshipping) / 1e6
        FROM
          UNNEST(hits) hits) Shipping,
        totals.bounces bounces,
        totals.transactions transactions
      FROM
        `XXXXXXXXX.ga_sessions_*`
      WHERE
        _TABLE_SUFFIX BETWEEN '20170625'
        AND '20170703' )
    GROUP BY
      date,
      channelGrouping,
      viewID
    
    2 回复  |  直到 7 年前
        1
  •  3
  •   Willian Fuks    7 年前

    您可以在BigQuery中模拟一些数据,以更好地了解这里发生了什么。

    例如,该数据模拟 hits 中的架构 ga_sessions :

    WITH data AS(
      select ARRAY<STRUCT<hitNumber INT64, customDimensions ARRAY<STRUCT<index INT64, value STRING>> >> [STRUCT(1 as hitNumber, [STRUCT(1 as index, 'val1' as value), STRUCT(2 as index, 'val2' as value), STRUCT(3 as index, 'val3' as value)] as customDimensions), STRUCT(2 as hitNumber, [STRUCT(1 as index, 'val1' as value)] as customDimensions)] hits
    )
    
    select * from data
    

    index = 1 ,您将得到相同的错误,因为在两个不同的位置,索引为1。

    ARRAY 像这样:

    SELECT
      array(select custd.value from unnest(hits) hits, unnest(hits.customDimensions) custd where index = 1)
    FROM data
    

    你会看到结果:

    enter image description here

    因此,在查询中,您必须适应将此值作为 或者,如果对于所有值 index=65 值相同,您可以执行以下操作:

    SELECT
      (select custd.value from unnest(hits) hits, unnest(hits.customDimensions) custd where index = 1 limit 1)
    FROM data
    

    这将在标量子查询中只产生一个结果。

        2
  •  2
  •   Elliott Brossard    7 年前

    问题是,一些或所有点击都有一个索引为65的自定义维度。有几种不同的方法来解决这个问题。可以使用数组子查询获取该索引的所有值:

    ARRAY(
      SELECT
        d.value
      FROM
        UNNEST(hits) AS hits,
        UNNEST(hits.customDimensions) AS d
      WHERE
        d.index = 65) AS viewIDs,
    

    这将为您提供跨点击的所有视图ID,但您还需要在联合的第一个查询中为viewID使用一个数组。另一种选择是仅从第一次点击中获取视图ID:

    (
      SELECT
        d.value
      FROM
        UNNEST(hits[SAFE_OFFSET(0)].customDimensions) AS d
      WHERE
        d.index = 65) AS viewID
    

    (
      SELECT
        d.value
      FROM
        UNNEST(hits) AS hits,
        UNNEST(hits.customDimensions) AS d
      WHERE
        d.index = 65
      LIMIT 1) AS viewID,