我有一个类似这样的查询
SELECT DISTINCT
COALESCE(fa.id, fb.id) AS id,
COALESCE(fa.d_id, fb.d_id) AS d_id,
COALESCE(fa.name, fb.name) AS name,
COALESCE(fa.disabled, fb.disabled) AS disabled,
COALESCE(fa.deleted, fb.deleted) AS deleted
FROM (
SELECT * from table WHERE name LIKE '%'
AND d_id IS NULL AND deleted = false
) fa
FULL JOIN (
SELECT * from table WHERE name LIKE '%'
AND d_id = 1 AND deleted = false
) fb ON fa.name = fb.name
ORDER BY name;
哪里
id
name
d_id
是用户的id。
d_id
,这意味着它是由系统自动生成的。如果它有
d_id
,则表示它是用户生成的。
我这里的问题是运行查询需要太长时间(在我的本地psql shell上大约30000到40000ms,在live上大约15000)。我已经运行了解释分析并得到了这个
Unique (cost=8240.78..8272.13 rows=2090 width=42) (actual time=27591.662..28742.062 rows=418018 loops=1)
-> Sort (cost=8240.78..8246.01 rows=2090 width=42) (actual time=27591.659..28504.606 rows=418018 loops=1)
Sort Key: (COALESCE(table.name, table_1.name)), (COALESCE(table.id, table_1.id)), (COALESCE(table.d_id, table_1.d_id)), (COALESCE(table.disabled, table_1.disabled)), (COALESCE(table.deleted, table_1.deleted))
Sort Method: external merge Disk: 13680kB
-> Hash Full Join (cost=8.45..8125.53 rows=2090 width=42) (actual time=11.037..1479.053 rows=418018 loops=1)
Hash Cond: (table.name = table_1.name)
-> Seq Scan on table (cost=0.00..8109.23 rows=2090 width=27) (actual time=0.048..799.822 rows=418018 loops=1)
Filter: ((d_id IS NULL) AND (NOT deleted) AND (name ~~ '%'::citext))
-> Hash (cost=8.44..8.44 rows=1 width=27) (actual time=10.970..10.970 rows=0 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 8kB
-> Index Scan using table__d_id__name__idx on table table_1 (cost=0.42..8.44 rows=1 width=27) (actual time=10.970..10.970 rows=0 loops=1)
Index Cond: (d_id = 1)
Filter: ((NOT deleted) AND (name ~~ '%'::citext))
虽然我不能完全理解它,但我可以告诉你,它花费太长时间的大部分原因是(
ORDER BY
)功能。
Indexes:
"table_pkey" PRIMARY KEY, btree (id)
"table__d_id__name__idx" UNIQUE, btree (d_id, name)
"table__name__idx" gist (name gist_trgm_ops)
"table__id__idx" btree (id)
我尝试过使用不同的索引、重构查询和处理代码,但仍然需要同样长的时间。我尝试删除除主键索引之外的所有索引,查询不知何故加快了约23000ms。
此外,在应用程序中,用户可以选择一个字母,该字母将返回以该字母开头的所有结果,查询如下所示
WHERE name LIKE 'a%'
. 尽管也有数以万计的结果,但指定起始字母会将加载时间大幅减少到约1000-2000ms。
我的目标是使查询负载在5000到10000ms之间。任何帮助或建议都将不胜感激!