代码之家  ›  专栏  ›  技术社区  ›  Oto Shavadze

不同值的查询执行方式不同

  •  3
  • Oto Shavadze  · 技术社区  · 6 年前

    1)

    create table test_users(
       id serial primary key
    );
    
    insert into test_users
    select * from generate_series(1,750000);
    

    2)

    create table test_posts(
      id bigserial primary key,
      user_id INT
    );
    
    CREATE index idx ON test_posts(user_id, id);
    
    INSERT INTO test_posts (user_id) SELECT 2 FROM generate_series (1,30038);
    INSERT INTO test_posts (user_id) SELECT 3 FROM generate_series (1,31036);
    INSERT INTO test_posts (user_id) SELECT 4 FROM generate_series (1,24032);
    INSERT INTO test_posts (user_id) SELECT 5 FROM generate_series (1,93094);
    INSERT INTO test_posts (user_id) SELECT 6 FROM generate_series (1,31033);
    INSERT INTO test_posts (user_id) SELECT 7 FROM generate_series (1,62063);
    INSERT INTO test_posts (user_id) SELECT 724309 FROM generate_series (1,2);
    INSERT INTO test_posts (user_id) SELECT 724306 FROM generate_series (1,1);
    

    id 每个用户,从 test_posts

    SELECT user_id, count(*) , max(id) as maxid
    FROM test_posts 
    GROUP BY user_id
    order by count(*)
    

    所以, user_id = 724306 271299 .

    user_id = 5 在我的例子中,maxid是: 178200

    现在我运行以下查询:

    SELECT test_posts.id
    FROM test_posts 
    JOIN test_users
    ON test_posts.user_id = test_users.id 
    WHERE test_posts.id = 271299  
    

    explain analyze 显示:

    "Merge Join  (cost=8.88..9.06 rows=1 width=8) (actual time=146.561..146.563 rows=1 loops=1)"
    "  Merge Cond: (test_users.id = test_posts.user_id)"
    "  ->  Index Only Scan using test_users_pkey on test_users  (cost=0.42..22808.42 rows=750000 width=4) (actual time=0.012..110.091 rows=724307 loops=1)"
    "        Heap Fetches: 724307"
    "  ->  Sort  (cost=8.45..8.46 rows=1 width=12) (actual time=0.012..0.013 rows=1 loops=1)"
    "        Sort Key: test_posts.user_id"
    "        Sort Method: quicksort  Memory: 25kB"
    "        ->  Index Scan using test_posts_pkey on test_posts  (cost=0.42..8.44 rows=1 width=12) (actual time=0.007..0.007 rows=1 loops=1)"
    "              Index Cond: (id = 271299)"
    "Planning time: 0.214 ms"
    "Execution time: 146.588 ms"
    

    现在运行相同的查询,只需使用178200:

    SELECT test_posts.id
    FROM test_posts 
    JOIN test_users
    ON test_posts.user_id = test_users.id 
    WHERE test_posts.id = 178200  
    

    解释分析

    "Merge Join  (cost=8.88..9.06 rows=1 width=8) (actual time=0.040..0.042 rows=1 loops=1)"
    "  Merge Cond: (test_users.id = test_posts.user_id)"
    "  ->  Index Only Scan using test_users_pkey on test_users  (cost=0.42..22808.42 rows=750000 width=4) (actual time=0.016..0.018 rows=6 loops=1)"
    "        Heap Fetches: 6"
    "  ->  Sort  (cost=8.45..8.46 rows=1 width=12) (actual time=0.017..0.018 rows=1 loops=1)"
    "        Sort Key: test_posts.user_id"
    "        Sort Method: quicksort  Memory: 25kB"
    "        ->  Index Scan using test_posts_pkey on test_posts  (cost=0.42..8.44 rows=1 width=12) (actual time=0.011..0.012 rows=1 loops=1)"
    "              Index Cond: (id = 178200)"
    "Planning time: 0.249 ms"
    "Execution time: 0.072 ms"
    

    是什么导致执行时间上的差异?

    1 回复  |  直到 6 年前
        1
  •  1
  •   wildplasser    6 年前

    下面的技巧将使第一个查询再次变快。


    EXPLAIN (ANALYZE, BUFFERS)
    SELECT p.id
    FROM test_posts p
    LEFT JOIN test_users u
    ON p.user_id = u.id AND u.id IS NOT NULL
    WHERE p.id = 271299
            ;
    

    10.3也确认了。