代码之家  ›  专栏  ›  技术社区  ›  Liv

我是否可以在SQL Server 2016中改进此JSON转换的性能?

  •  4
  • Liv  · 技术社区  · 6 年前

    我有一张类似下面的桌子 (以下代码创建了一个名为#Temp的表。该表有160000行,这与我在真实数据集中处理的行数大致相同,但真实数据集中有更多列):

    /* Create dummy employees*/
    
    ;WITH employeeNumbers
    AS ( SELECT 1 AS employeeId
         UNION ALL
         SELECT employeeNumbers.employeeId + 1
         FROM   employeeNumbers
         WHERE  employeeNumbers.employeeId < 16000 )
    SELECT *
    INTO   #employeeId
    FROM   employeeNumbers
    OPTION ( MAXRECURSION 16000 )
    
    
    /*Create saleItems*/
    CREATE TABLE #SalesItems
        (
            category VARCHAR(100)
          , subCategory VARCHAR(100)
          , productName VARCHAR(1000)
        )
    INSERT INTO #SalesItems ( category
                            , subCategory
                            , productName )
    VALUES ( 'Furniture', 'Bookcases', 'Bush Somerset Collection Bookcase' )
         , ( 'Furniture', 'Chairs', 'Hon Deluxe Fabric Upholstered Stacking Chairs, Rounded Back' )
         , ( 'Office Supplies', 'Labels', 'Self-Adhesive Address Labels for Typewriters by Universal' )
         , ( 'Furniture', 'Tables', 'Bretford CR4500 Series Slim Rectangular Table' )
         , ( 'Office Supplies', 'Storage', 'Eldon Fold n Roll Cart System' )
         , ( 'Furniture', 'Furnishings', 'Eldon Expressions Wood and Plastic Desk Accessories, Cherry Wood' )
         , ( 'Office Supplies', 'Art', 'Newell 322' )
         , ( 'Technology', 'Phones', 'Mitel 5320 IP Phone VoIP phone' )
         , ( 'Office Supplies', 'Binders', 'DXL Angle-View Binders with Locking Rings by Samsill' )
         , ( 'Technology', 'Phones', 'Samsung Galaxy S8' )
    
    -- Create some random sales figures between 10 and 100
    SELECT employeeId
         , category
         , subCategory
         , productName
         , CONVERT(DECIMAL(13, 2), 10 + ( 100 - 10 ) * RAND(CHECKSUM(NEWID()))) [Jul 2017]
         , CONVERT(DECIMAL(13, 2), 10 + ( 100 - 10 ) * RAND(CHECKSUM(NEWID()))) [Aug 2017]
         , CONVERT(DECIMAL(13, 2), 10 + ( 100 - 10 ) * RAND(CHECKSUM(NEWID()))) [Sep 2017]
         , CONVERT(DECIMAL(13, 2), 10 + ( 100 - 10 ) * RAND(CHECKSUM(NEWID()))) [Oct 2017]
         , CONVERT(DECIMAL(13, 2), 10 + ( 100 - 10 ) * RAND(CHECKSUM(NEWID()))) [Nov 2017]
         , CONVERT(DECIMAL(13, 2), 10 + ( 100 - 10 ) * RAND(CHECKSUM(NEWID()))) [Dec 2017]
    INTO   #Temp
    FROM   #employeeId
    JOIN   #SalesItems ON 1 = 1
    
    CREATE INDEX empId
        ON #Temp ( employeeId )
    
    SELECT *
    FROM   #Temp
    

    我所做的是将这些结果转换为表中每个员工id的单个json字符串。我的问题如下:

    SELECT DISTINCT x.employeeId
                  , (   SELECT y.category
                             , y.subCategory
                             , y.productName
                             , [Jul 2017] AS 'salesAmounts.Jul 2017'
                             , [Aug 2017] AS 'salesAmounts.Aug 2017'
                             , [Sep 2017] AS 'salesAmounts.Sep 2017'
                             , [Oct 2017] AS 'salesAmounts.Oct 2017'
                             , [Nov 2017] AS 'salesAmounts.Nov 2017'
                             , [Dec 2017] AS 'salesAmounts.Dec 2017'
                        FROM   #Temp y
                        WHERE  y.employeeId = x.employeeId
                        FOR JSON PATH, INCLUDE_NULL_VALUES ) data
    FROM   #Temp x
    

    这是可行的,但它的性能不是很好。在本示例中,执行此操作需要25秒,但在我的真实数据集中,这需要更长的时间。从#Temp表返回所有结果需要1秒。 我是否可以在此处重新设计查询以缩短查询时间?我确实尝试过使用光标遍历每个employeeId并以这种方式生成json字符串,但它仍然很糟糕。

    1 回复  |  直到 6 年前
        1
  •  7
  •   Gottfried Lesigang    6 年前

    Read "Performance Surprises and Assumptions : GROUP BY vs. DISTINCT" by Aaron Bertrand

    尝试使用 GROUP BY 而不是 DISTINCT . 独特的 丢弃重复项 之后 结果集已经创建,因此调用JSON的频率超过了需要。 分组依据 应首先将集合减少为distinct employeeId 值,并对每个值仅执行一次JSON。

    目前无法对其进行测试,但这也应该进行同样的测试,只是速度更快:

    SELECT x.employeeId
                  , (   SELECT y.category
                             , y.subCategory
                             , y.productName
                             , [Jul 2017] AS 'salesAmounts.Jul 2017'
                             , [Aug 2017] AS 'salesAmounts.Aug 2017'
                             , [Sep 2017] AS 'salesAmounts.Sep 2017'
                             , [Oct 2017] AS 'salesAmounts.Oct 2017'
                             , [Nov 2017] AS 'salesAmounts.Nov 2017'
                             , [Dec 2017] AS 'salesAmounts.Dec 2017'
                        FROM   #Temp y
                        WHERE  y.employeeId = x.employeeId
                        FOR JSON PATH, INCLUDE_NULL_VALUES ) data
    FROM   #Temp x
    GROUP BY x.EmployeeId