代码之家  ›  专栏  ›  技术社区  ›  Westerlund.io

使用子查询提高CTE的性能

  •  1
  • Westerlund.io  · 技术社区  · 6 年前

    我有一张这样结构的桌子:

    WorkerID    Value           GroupID Sequence    Validity
    1           '20%'           1       1           2018-01-01
    1           '10%'           1       1           2017-06-01
    1           'Yes'           1       2           2017-06-01
    1           '2018-01-01'    2       1           2017-06-01
    1           '17.2'          2       2           2017-06-01
    2           '10%'           1       1           2017-06-01
    2           'No'            1       2           2017-06-01
    2           '2016-03-01'    2       1           2017-06-01
    2           '15.9'          2       2           2017-06-01
    

    创建此结构是为了让客户端可以为工作人员创建自定义数据。例如 Group 1可以是“薪水”和 Sequence 是属于它的一个值 比如“加班费”。“列” Value 是一个 VARCHAR(150) 字段,并在应用程序的另一部分中完成正确的验证和对话。

    这个 Validity 专栏的存在主要是出于历史原因。

    现在,我想为不同的worker显示网格中的信息,其中每一行应该是一个worker(显示具有最新有效性的一个):

    Worker  1_1     1_2     2_1         2_2
    1       20%     Yes     2018-01-01  17.2
    2       10%     No      2016-03-01  15.9
    

    为了实现这一点,我创建了一个如下所示的CTE:

    WITH CTE_worker_grid
        AS
        (
        SELECT
            worker,
    
            /* 1 */
            (
                SELECT top 1 w.Value
                    FROM worker_values AS w
                    WHERE w.GroupID = 1
                    AND w.Sequence = 1
                    ORDER BY w.Validity DESC
            ) AS 1_1,
            (
                SELECT top 1 w.Value
                    FROM worker_values AS w
                    WHERE w.GroupID = 1
                    AND w.Sequence = 2
                    ORDER BY w.Validity DESC
            ) AS 1_2,
    
            /* 2 */
            (
                SELECT top 1 w.Value
                    FROM worker_values AS w
                    WHERE w.GroupID = 2
                    AND w.Sequence = 1
                    ORDER BY w.Validity DESC
            ) AS 2_1,
            (
                SELECT top 1 w.Value
                    FROM worker_values AS w
                    WHERE w.GroupID = 2
                    AND w.Sequence = 2
                    ORDER BY w.Validity DESC
            ) AS 2_2
        )
    GO
    

    这会产生正确的结果,但速度非常慢,因为它为18000多名员工创建了这个网格,其中有近30名员工 Groups 最多20个 Sequences 在每个

    如何才能加快如此规模的CTE进程?是否应该使用CTE?子查询是否可以更改或重新分解以加快执行?

    1 回复  |  直到 6 年前
        1
  •  1
  •   Pittsburgh DBA    6 年前

    使用枢轴!

    +----------+---------+---------+------------+---------+
    | WorkerId | 001_001 | 001_002 |  002_001   | 002_002 |
    +----------+---------+---------+------------+---------+
    |        1 | 20%     | Yes     | 2018-01-01 |    17.2 |
    |        2 | 10%     | No      | 2016-03-01 |    15.9 |
    +----------+---------+---------+------------+---------+
    

    SQL FIDLE: http://sqlfiddle.com/#!18/6e768/1

    CREATE TABLE WorkerAttributes
        (
        WorkerID INT NOT NULL
        , [Value] VARCHAR(50) NOT NULL
        , GroupID INT NOT NULL
        , [Sequence] INT NOT NULL
        , Validity DATE NOT NULL
        )
    
    INSERT INTO WorkerAttributes
        (WorkerID, Value, GroupID, Sequence, Validity)
    VALUES
        (1, '20%', 1, 1, '2018-01-01')
        , (1, '10%', 1, 1, '2017-06-01')
        , (1, 'Yes', 1, 2, '2017-06-01')
        , (1, '2018-01-01', 2, 1, '2017-06-01')
        , (1, '17.2', 2, 2, '2017-06-01')
        , (2, '10%', 1, 1, '2017-06-01')
        , (2, 'No', 1, 2, '2017-06-01')
        , (2, '2016-03-01', 2, 1, '2017-06-01')
        , (2, '15.9', 2, 2, '2017-06-01')
    
    
    ;WITH CTE_WA_RANK
    AS
    (
    SELECT
        ROW_NUMBER() OVER (PARTITION BY WorkerID, GroupID, [Sequence] ORDER BY Validity DESC) AS VersionNumber
        , WA.WorkerID
        , WA.GroupID
        , WA.[Sequence]
        , WA.[Value]
    FROM
        WorkerAttributes AS WA
    ),
    CTE_WA
    AS
    (
    SELECT
        WA_RANK.WorkerID
        , RIGHT('000' + CAST(WA_RANK.GroupID AS VARCHAR(3)), 3)
            + '_'
            + RIGHT('000' + CAST(WA_RANK.[Sequence] AS VARCHAR(3)), 3) AS SMART_KEY
        , WA_RANK.[Value]
    FROM
        CTE_WA_RANK AS WA_RANK
    WHERE
        WA_RANK.VersionNumber = 1
    )
    SELECT
        WorkerId
        , [001_001] AS [001_001]
        , [001_002] AS [001_002]
        , [002_001] AS [002_001]
        , [002_002] AS [002_002]
    FROM
    (
    SELECT
        CTE_WA.WorkerId
        , CTE_WA.SMART_KEY
        , CTE_WA.[Value]
    FROM
        CTE_WA
    ) AS WA
    PIVOT
    (
    MAX([Value])
    FOR
        SMART_KEY IN 
            (
            [001_001]
            , [001_002]
            , [002_001]
            , [002_002]
            )
    ) AS PVT