代码之家  ›  专栏  ›  技术社区  ›  sapi

如何编写引用聚合原始表的子查询?

  •  0
  • sapi  · 技术社区  · 4 年前

    我有两张表格追踪比赛细节和每支球队在比赛中的表现。模式基本上如下所示:

    Game
    - id
    - date [TIMESTAMPTZ]
    - team_a_id
    - team_b_id
    
    TeamStats
    - game_id
    - team_id
    - stat_a [INTEGER]
    - stat_b [INTEGER]
    

    对于每一场比赛,我想总结一下每一支球队在之前所有比赛中的表现。所需的输出如下所示,在给定游戏id之前的所有游戏中取平均列:

    - game_id
    - team_a_avg_stat_a
    - team_a_avg_stat_b
    - team_b_avg_stat_a
    - team_b_avg_stat_b
    

    我原以为我需要一个类似于以下内容的查询,将游戏表加入到一个查询中,该查询在给定的时间段内平均给定球队的统计数据:

    -- Example for team_a, would repeat with another join for team_b
    SELECT g.id, ats.avg_stat_a as team_a_avg_stat_a, ats.avg_stat_b as team_a_avg_stat_b
    FROM game g
        INNER JOIN LATERAL (
            SELECT game_id, AVG(stat_a) AS avg_stat_a, AVG(stat_b) as avg_stat_b
            FROM teamstats its
                INNER JOIN game ig
                    ON its.game_id = ig.id
            WHERE ig.date < g.date AND its.team_id = g.team_a_id
            GROUP BY its.game_id
            ) ats
            ON ats.game_id = g.id;
    

    然而,当我尝试上面的查询时,我得到的结果为零。我本以为游戏桌上的每一排都会有一个结果。

    我最初的尝试实际上没有横向连接——但当我尝试横向连接时,我收到了一条错误消息,它将我带到了相关子查询的路径上:

    /* ERROR:  invalid reference to FROM-clause entry for table "g"
    LINE 8:   WHERE ig.date < g.date AND its.team_id = g.team_a_id
                              ^
    HINT:  There is an entry for table "g", but it cannot be referenced from this part of the query. */
    

    我错过了什么?


    此外,有一个限制我一开始就忘了提到——我希望能够将平均值限制为仅考虑比赛日期的特定时间段(比如90天)内的日期。

    2 回复  |  直到 4 年前
        1
  •  2
  •   GMB    4 年前

    我认为你可以使用窗口功能,但你需要一个行框,这样就只考虑以前的游戏了:

    select g.id, 
        avg(ta.stats_a) over(
            partition by tsa.team_id 
            order by g.date rows between unbounded preceding and 1 preceding
        ) team_a_avg_stat_a,
        avg(ta.stats_b) over(
            partition by tsa.team_id 
            order by g.date rows between unbounded preceding and 1 preceding
        ) team_a_avg_stat_b,
        avg(tb.stats_a) over(
            partition by tsb.team_id 
            order by g.date rows between unbounded preceding and 1 preceding
        ) team_b_avg_stat_a,
        avg(ta.stats_b) over(
            partition by tsb.team_id 
            order by g.date rows between unbounded preceding and 1 preceding
        ) team_b_avg_stat_b
    from game g
    inner join teamstats tsa 
        on  tsa.game_id = g.game_id
        and tsa.team_id = g.team_a_id
    inner join teamstats tsb
        on  tsb.game_id = g.game_id
        and tsb.team_id = g.team_b_id
    
        2
  •  1
  •   Gordon Linoff    4 年前

    嗯。我认为窗口功能可以满足您的需求:

    select g.*,
           avg(ts_a.stat_a) over (partition by ts_a.team_id order by g.date) as avg_a_a,
           avg(ts_a.stat_b) over (partition by ts_a.team_id order by g.date) as avg_a_b,
           avg(ts_b.stat_a) over (partition by ts_a.team_id order by g.date) as avg_b_a,
           avg(ts_b.stat_b) over (partition by ts_a.team_id order by g.date) as avg_b_b
    from game g join
         teamstats ts_a
         on ts_a.game_id = g.id and ts_a.team_id =  g.team_a_id join
         teamstats ts_b
         on ts_b.game_id = g.id and ts_b.team_id =  g.team_b_id