代码之家  ›  专栏  ›  技术社区  ›  Ricardo

像SQL Server这样的带有join-in-from子句的select子句中如何执行postgresql子查询?

  •  65
  • Ricardo  · 技术社区  · 14 年前

    我尝试在PostgreSQL上编写以下查询:

    select name, author_id, count(1), 
        (select count(1)
        from names as n2
        where n2.id = n1.id
            and t2.author_id = t1.author_id
        )               
    from names as n1
    group by name, author_id
    

    这当然可以在Microsoft SQL Server上工作,但在PosteGSQL上根本不起作用。我读了一下它的文档,似乎可以重写为:

    select name, author_id, count(1), total                     
    from names as n1, (select count(1) as total
        from names as n2
        where n2.id = n1.id
            and n2.author_id = t1.author_id
        ) as total
    group by name, author_id
    

    但在postegressql上返回以下错误:“从中的子查询不能引用同一查询级别的其他关系”。所以我被卡住了。有人知道我怎样才能做到吗?

    谢谢

    5 回复  |  直到 5 年前
        1
  •  93
  •   Bob Jarvis - Слава Україні    14 年前

    我不确定我完全理解你的意图,但也许下面的内容接近你想要的:

    select n1.name, n1.author_id, count_1, total_count
      from (select id, name, author_id, count(1) as count_1
              from names
              group by id, name, author_id) n1
    inner join (select id, author_id, count(1) as total_count
                  from names
                  group by id, author_id) n2
      on (n2.id = n1.id and n2.author_id = n1.author_id)
    

    不幸的是,这增加了按ID、名称和作者ID对第一个子查询进行分组的要求,我认为这是不需要的。不过,我不确定如何解决这个问题,因为您需要有ID来加入第二个子查询。也许其他人会想出更好的解决方案。

    分享和享受。

        2
  •  12
  •   Ricardo    14 年前

    我只是在这里回答我需要的基于Bob Jarvis答案的最终SQL的格式化版本,如我在上面的评论中所发布的:

    select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
      from (select id, name, author_id, count(1) as count_1
              from names
              group by id, name, author_id) n1
    inner join (select author_id, count(1) as total_count
                  from names
                  group by author_id) n2
      on (n2.author_id = n1.author_id)
    
        3
  •  5
  •   dmikam    6 年前

    我知道这是旧的,但是自从 Postgresql 9.3 有一个选项可以使用关键字“lateral”在联接内部使用相关的子查询,因此来自问题的查询如下所示:

    SELECT 
        name, author_id, count(*), t.total
    FROM
        names as n1
        INNER JOIN LATERAL (
            SELECT 
                count(*) as total
            FROM 
                names as n2
            WHERE 
                n2.id = n1.id
                AND n2.author_id = n1.author_id
        ) as t ON 1=1
    GROUP BY 
        n1.name, n1.author_id
    
        4
  •  5
  •   deFreitas    5 年前

    补足 @ Bob Jarvis 德米坎 回答:如果不使用横向,Postgres就不能执行一个好的计划,在模拟下,在两种情况下,查询数据的结果是相同的,但是成本是非常不同的

    表结构

    CREATE TABLE ITEMS (
        N INTEGER NOT NULL,
        S TEXT NOT NULL
    );
    
    INSERT INTO ITEMS
      SELECT
        (random()*1000000)::integer AS n,
        md5(random()::text) AS s
      FROM
        generate_series(1,1000000);
    
    CREATE INDEX N_INDEX ON ITEMS(N);
    

    表演 JOIN 具有 GROUP BY 在没有的子查询中 LATERAL

    EXPLAIN 
    SELECT 
        I.*
    FROM ITEMS I
    INNER JOIN (
        SELECT 
            COUNT(1), n
        FROM ITEMS
        GROUP BY N
    ) I2 ON I2.N = I.N
    WHERE I.N IN (243477, 997947);
    

    结果

    Merge Join  (cost=0.87..637500.40 rows=23 width=37)
      Merge Cond: (i.n = items.n)
      ->  Index Scan using n_index on items i  (cost=0.43..101.28 rows=23 width=37)
            Index Cond: (n = ANY ('{243477,997947}'::integer[]))
      ->  GroupAggregate  (cost=0.43..626631.11 rows=861418 width=12)
            Group Key: items.n
            ->  Index Only Scan using n_index on items  (cost=0.43..593016.93 rows=10000000 width=4)
    

    使用 侧面的

    EXPLAIN 
    SELECT 
        I.*
    FROM ITEMS I
    INNER JOIN LATERAL (
        SELECT 
            COUNT(1), n
        FROM ITEMS
        WHERE N = I.N
        GROUP BY N
    ) I2 ON 1=1 --I2.N = I.N
    WHERE I.N IN (243477, 997947);
    

    结果

    Nested Loop  (cost=9.49..1319.97 rows=276 width=37)
      ->  Bitmap Heap Scan on items i  (cost=9.06..100.20 rows=23 width=37)
            Recheck Cond: (n = ANY ('{243477,997947}'::integer[]))
            ->  Bitmap Index Scan on n_index  (cost=0.00..9.05 rows=23 width=0)
                  Index Cond: (n = ANY ('{243477,997947}'::integer[]))
      ->  GroupAggregate  (cost=0.43..52.79 rows=12 width=12)
            Group Key: items.n
            ->  Index Only Scan using n_index on items  (cost=0.43..52.64 rows=12 width=4)
                  Index Cond: (n = i.n)
    

    我的Postgres版本是 PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1)

        5
  •  0
  •   sanyassh Khushboo Tahir    5 年前
    select n1.name, n1.author_id, cast(count_1 as numeric)/total_count
      from (select id, name, author_id, count(1) as count_1
              from names
              group by id, name, author_id) n1
    inner join (select distinct(author_id), count(1) as total_count
                  from names) n2
      on (n2.author_id = n1.author_id)
    Where true
    

    习惯于 distinct 如果更多的内部联接,因为更多的联接组性能很慢