代码之家  ›  专栏  ›  技术社区  ›  Lei Chi

Greenplum查询成本与查询分析不匹配

  •  0
  • Lei Chi  · 技术社区  · 6 年前

    Greenplum查询的实际成本在哪里? 我有一个简单的SQL查询,它在SQL Server上返回80毫秒,在Greenplum上返回1500毫秒!!!!

    “explain analyze”显示“总执行时间:55ms”,但这不是真的,我尝试使用odbc dirver或psql客户端连接到(greenplum)数据库,查询成本稳定在1500ms。

    每个表上的数据不超过10 K, 分区,无歪斜 (无索引)。所有节点(主/段)CPU/IO/MEM/Metwork平均值低于10%。

    问题:

    • 真正的费用在哪
    • 在哪能找到它
    • 我找不到钥匙了吗

    SQL:

        SELECT CoreHR_EmploymentRecord.ChangeTypeOID AS CoreHR_EmploymentRecord_ChangeTypeOID,
           MAX(CoreHR_EmploymentRecord.ChangeTypeAlias) AS CoreHR_EmploymentRecord_ChangeTypeAlias,
           COUNT(DISTINCT CoreHR_EmployeeInformation.UserID) AS CoreHR_EmployeeInformation_Count_UserID
    FROM CoreHR_EmployeeInformation WITH (NOLOCK)
        LEFT JOIN CoreHR_EmploymentRecord WITH (NOLOCK)
            ON CoreHR_EmployeeInformation.TenantId = CoreHR_EmploymentRecord.TenantId
               AND CoreHR_EmployeeInformation.UserID = CoreHR_EmploymentRecord.UserID
               AND CoreHR_EmploymentRecord.TenantId = 106996
    WHERE CoreHR_EmployeeInformation.TenantId = 106996
          AND (CoreHR_EmploymentRecord.IsCurrentRecord = 1)
          AND (CoreHR_EmploymentRecord.StartDate
          BETWEEN '2018-05-24 00:00:00' AND '2018-06-22 23:59:59.998'
              )
          AND CoreHR_EmployeeInformation.ApprovalStatus IN
              (
                  SELECT * FROM dbo.f_SplitToVarchar(4, ',')
              )
    GROUP BY CoreHR_EmploymentRecord.ChangeTypeOID
    ORDER BY CoreHR_EmploymentRecord.ChangeTypeOID ASC;
    

    执行计划(无索引):

    "Gather Motion 3:1  (slice4; segments: 3)  (cost=0.00..865.47 rows=18 width=18)"
    "  Merge Key: record.changetypeoid"
    "  Rows out:  8 rows at destination with 13 ms to end."
    "  ->  GroupAggregate  (cost=0.00..865.47 rows=6 width=18)"
    "        Group By: record.changetypeoid"
    "        Rows out:  Avg 2.7 rows x 3 workers.  Max 4 rows (seg0) with 11 ms to end."
    "        Executor memory:  8K bytes avg, 8K bytes max (seg0)."
    "        ->  Sort  (cost=0.00..865.47 rows=6 width=18)"
    "              Sort Key: record.changetypeoid"
    "              Sort Method:  quicksort  Max Memory: 33KB  Avg Memory: 33KB (3 segments)"
    "              Rows out:  Avg 6.7 rows x 3 workers.  Max 12 rows (seg0) with 11 ms to end."
    "              Executor memory:  58K bytes avg, 58K bytes max (seg0)."
    "              Work_mem used:  58K bytes avg, 58K bytes max (seg0). Workfile: (0 spilling)"
    "              ->  Redistribute Motion 3:3  (slice3; segments: 3)  (cost=0.00..865.47 rows=6 width=18)"
    "                    Hash Key: record.changetypeoid"
    "                    Rows out:  Avg 6.7 rows x 3 workers at destination.  Max 12 rows (seg0) with 10 ms to end."
    "                    ->  Result  (cost=0.00..865.47 rows=6 width=18)"
    "                          Rows out:  Avg 6.7 rows x 3 workers.  Max 7 rows (seg1) with 7.992 ms to first row, 8.138 ms to end."
    "                          ->  GroupAggregate  (cost=0.00..865.47 rows=6 width=18)"
    "                                Group By: record.changetypeoid"
    "                                Rows out:  Avg 6.7 rows x 3 workers.  Max 7 rows (seg1) with 7.990 ms to first row, 8.131 ms to end."
    "                                Executor memory:  248K bytes avg, 258K bytes max (seg1)."
    "                                Work_mem used:  58K bytes avg, 58K bytes max (seg0)."
    "                                ->  Sort  (cost=0.00..865.47 rows=41 width=18)"
    "                                      Sort Key: record.changetypeoid"
    "                                      Sort Method:  quicksort  Max Memory: 49KB  Avg Memory: 38KB (3 segments)"
    "                                      Rows out:  Avg 48.7 rows x 3 workers.  Max 63 rows (seg2) with 7.598 ms to first row, 7.603 ms to end."
    "                                      Executor memory:  58K bytes avg, 58K bytes max (seg0)."
    "                                      Work_mem used:  58K bytes avg, 58K bytes max (seg0). Workfile: (0 spilling)"
    "                                      ->  Redistribute Motion 3:3  (slice2; segments: 3)  (cost=0.00..865.45 rows=41 width=18)"
    "                                            Hash Key: info.userid"
    "                                            Rows out:  Avg 48.7 rows x 3 workers at destination.  Max 63 rows (seg2) with 7.178 ms to first row, 7.193 ms to end."
    "                                            ->  Hash Join  (cost=0.00..865.44 rows=41 width=18)"
    "                                                  Hash Cond: info.tenantid = record.tenantid AND info.userid = record.userid"
    "                                                  Rows out:  Avg 48.7 rows x 3 workers.  Max 50 rows (seg1) with 4.240 ms to first row, 4.995 ms to end."
    "                                                  Executor memory:  7K bytes avg, 7K bytes max (seg0)."
    "                                                  Work_mem used:  7K bytes avg, 7K bytes max (seg0). Workfile: (0 spilling)"
    "                                                  (seg1)   Hash chain length 1.0 avg, 2 max, using 141 of 131072 buckets."
    "                                                  ->  Dynamic Table Scan on info (dynamic scan id: 1)  (cost=0.00..431.94 rows=258 width=10)"
    "                                                        Filter: tenantid = 106996 AND approvalstatus::text = '4'::text"
    "                                                        Rows out:  Avg 772.7 rows x 3 workers.  Max 783 rows (seg2) with 1.668 ms to first row, 1.836 ms to end."
    "                                                        Partitions scanned:  Avg 1.0 (out of 286) x 3 workers.  Max 1 parts (seg0)."
    "                                                  ->  Hash  (cost=100.00..100.00 rows=34 width=4)"
    "                                                        Rows in:  Avg 146.0 rows x 3 workers.  Max 146 rows (seg0) with 5.202 ms to end, start offset by 34 ms."
    "                                                        ->  Partition Selector for info (dynamic scan id: 1)  (cost=10.00..100.00 rows=34 width=4)"
    "                                                              Rows out:  Avg 146.0 rows x 3 workers.  Max 146 rows (seg0) with 4.725 ms to first row, 5.104 ms to end."
    "                                                              ->  Broadcast Motion 3:3  (slice1; segments: 3)  (cost=0.00..433.31 rows=144 width=22)"
    "                                                                    Rows out:  Avg 146.0 rows x 3 workers at destination.  Max 146 rows (seg0) with 3.820 ms to first row, 3.934 ms to end."
    "                                                                    ->  Result  (cost=0.00..433.25 rows=48 width=22)"
    "                                                                          Rows out:  Avg 48.7 rows x 3 workers.  Max 53 rows (seg2) with 4.080 ms to first row, 4.593 ms to end."
    "                                                                          ->  Sequence  (cost=0.00..433.25 rows=48 width=30)"
    "                                                                                Rows out:  Avg 48.7 rows x 3 workers.  Max 53 rows (seg2) with 4.078 ms to first row, 4.582 ms to end."
    "                                                                                ->  Partition Selector for record (dynamic scan id: 2)  (cost=10.00..100.00 rows=34 width=4)"
    "                                                                                      Partitions selected: 1 (out of 128)"
    "                                                                                      Rows out:  0 rows (seg0) with 0.018 ms to end."
    "                                                                                ->  Dynamic Table Scan on record (dynamic scan id: 2)  (cost=0.00..433.25 rows=48 width=30)"
    "                                                                                      Filter: startdate >= '2018-05-24'::date AND startdate <= '2018-06-22'::date AND tenantid = 106996 AND iscurrentrecord = 1"
    "                                                                                      Rows out:  Avg 48.7 rows x 3 workers.  Max 53 rows (seg2) with 4.060 ms to first row, 4.542 ms to end."
    "                                                                                      Partitions scanned:  Avg 1.0 (out of 128) x 3 workers.  Max 1 parts (seg0)."
    "Slice statistics:"
    "  (slice0)    Executor memory: 880K bytes."
    "  (slice1)    Executor memory: 1070K bytes avg x 3 workers, 1070K bytes max (seg0)."
    "  (slice2)    Executor memory: 2844K bytes avg x 3 workers, 2844K bytes max (seg0).  Work_mem: 7K bytes max."
    "  (slice3)    Executor memory: 282K bytes avg x 3 workers, 293K bytes max (seg2).  Work_mem: 58K bytes max."
    "  (slice4)    Executor memory: 219K bytes avg x 3 workers, 219K bytes max (seg0).  Work_mem: 58K bytes max."
    "Statement statistics:"
    "  Memory used: 50176K bytes"
    "Optimizer status: PQO version 2.62.0"
    "Total runtime: 51.404 ms"
    
    1 回复  |  直到 6 年前
        1
  •  1
  •   Brendan Stephens    6 年前

    Greenplum数据库执行非常快速的顺序扫描;索引使用随机搜索模式来定位磁盘上的记录。

    Greenplum数据分布在各个数据段中,因此每个数据段扫描整个数据的较小部分以获得结果。对于表分区,要扫描的总数据可能更小。

    因为商业智能查询工作负载通常返回非常大的数据集,所以使用索引是不高效的。

    首先,在不添加索引的情况下尝试查询工作负载。

    你用的是ORCA还是传统计划?

    关于时间问题--

    您是从哪里生成时间的?远程客户端?还是从PSQL控制台? 解释分析显示查询在飞行中的实际处理时间。

    您需要发布一个解释分析,以便对计划进行任何真正的分析。