代码之家  ›  专栏  ›  技术社区  ›  Yossi

填充SQL查询结果集中缺少的(时间段)行

  •  2
  • Yossi  · 技术社区  · 10 年前

    更新:2014年8月15日:根据LMU92的建议/样本,见最后的工作解决方案

    我有一个读取财务交易表的查询。该表是详细事务的汇总,每晚重新生成,仅用于读取/选择。平台为SQL Server 2012。

    此采样是主查询的结果,该查询按时间段、科目和类别返回历史SUM(金额)。报告的时间窗口由参数驱动,对于2014年1月1日至2014年5月31日的样本:

        TimePeriod   Start      End        Category        Account           Amount
        ----------   --------   ---------  ------------    ----------------  ------------
        month        1/1/2014   1/31/2014  CategoryX       AccountA          2421.00
        month        4/1/2014   4/30/2014  CategoryX       AccountA          1421.00 
        month        5/1/2014   5/31/2014  CategoryY       AccountA          9421.00
        month        1/1/2014   1/31/2014  CategoryZ       AccountB          2421.00
        month        3/1/2014   3/31/2014  CategoryZ       AccountB          6421.00
        ...
    

    我所追求的结果是,通过用0.00金额填补任何缺口(无交易),消除缺口,例如,月份/账户A/类别X:

        TimePeriod   Start      End        Category        Account           Amount
        ----------   --------   ---------  ------------    ----------------  ------------
        month        1/1/2014   1/31/2014  CategoryX       AccountA          2421.00
        month        2/1/2014   2/28/2014  CategoryX       AccountA             0.00
        month        3/1/2014   3/31/2014  CategoryX       AccountA             0.00
        month        4/1/2014   4/30/2014  CategoryX       AccountA          1421.00 
        month        5/1/2014   5/31/2014  CategoryX       AccountA             0.00 
    

    挑战在于,汇总是按多个时段类型(日/周/月/季/年)进行的,每个时段类型都可以按科目/类别进行细分。所有时间段的总记录池总计为1000万,随着时间段的分数增加(例如周/天),记录池的数量也会增加。

    我尝试过一个CTE,它表现很差(尽管索引调整,但它似乎处理得很重),还尝试添加0.00条记录,这会使池中的记录数量呈指数级增长,因为每个交易少一个周期(d/w/m/q/y),每个账户(从账户的第一个交易日期开始),每个类别,为了获得如此大的池以提供可接受/接近可接受的性能,需要进行大量的索引调整,并且还增加了执行夜间负载所需的时间。我想做一个立方体,但这对我们正在做的事情来说似乎太过了。

    我正在寻找的解决方案可以在没有日历表的情况下即时完成。我确实有一个日历维度表(如果这是唯一有效的方法)。

    非常感谢您的任何建议。

    DDL/(T-)SQL的简化版本:

    表:

    CREATE TABLE TxnRollups (
        TxnTimePeriod          VARCHAR(10), --Year/Quarter/Month/Week/Day
        TxnPeriodStartDate     DATE,
        TxnPeriodEndDate       DATE,
        TxnAccountID           VARCHAR(10),
        TxnAccountType         VARCHAR(20),
        TxnAccountName         VARCHAR(20),
        TxnAccountHierL1       VARCHAR(20),
        TxnAccountHierL2       VARCHAR(20),       
        TxnAccountHierL3       VARCHAR(20),
        TxnCategory            VARCHAR(20),
        Amount                 DECIMAL(16,3)
    )
    

    查询:

        CREATE PROCEDURE GetTxnByPeriod(@FromDate DATE, @ToDate DATE, @SummaryPeriod VARCHAR(20)) 
          AS
            BEGIN
                    SELECT TxnR.TxnTimePeriod   TimePeriod,
                           TxnR.TxnPeriodStart  Start,
                           TxnR.TxnPeriodEnd    End,
                           TxnR.TxnCategory     Category,
                           TxnR.TxnAccountName  Account,
                           SUM(TxnRAmount)      Amount
                    From TxnRollups TxnR
                    WHERE 
                           TxnR.TxnTImePeriod = @SummaryPeriod AND
                           TxnR.TxnPeriodEnd BETWEEN @FromDate AND @ToDate
                    GROUP BY
                           TxnR.TxnTimePeriod,
                           TxnR.TxnPeriodStart,
                           TxnR.TxnPeriodEnd,
                           TxnR.TxnCategory,
                           TxnR.TxnAccountName
    
        END
    

    更新:2014年8月15日:最终工作解决方案

    概述

    使用以下方法,我能够生成所需的结果集,并且消除了使用汇总的需要,而不是直接查询原始数据集。最重要的是:

    执行时间:1k-1.5k ms 这取决于返回的数据量。我们甚至没有从总结中得到这样的表现。

    结构:

    1. 根据LMU92的最初建议,创建了一个月表,其中包含月的第一天/月的最后一天
    2. 更改了带有“第一/最后一个月”列的交易详细信息原始(交易详细信息)数据集,并更新了它们的内容,使其与每个相应行的日期的月份第一/最后日期相匹配。
    3. 在两个表中都添加了数字(FLOAT)列,并将第一/最后日期表示形式存储为数字。这样做的目的是通过使用浮点数进行数据比较来提高性能。
    4. 已在数字日期列上创建索引。
    5. 在一组唯一的列(帐户ID/类别)和月份表上使用带有交叉联接的CTE。 6 CTE结果至#Temp表。 7在#Temp表上创建了非聚集索引 8从交易明细表中选择一个不同的日期/账户/类别/金额(金额)列表到另一个#Temp表中。 9添加索引 10完整日期/账户表的右外联接(RIGHT OUTER JOIN),以及交易详细信息,结果符合要求(请参见问题)。

    **这是代码的缩写版本(删除了一些调试、非必要的)**

    CREATE PROCEDURE    GetTransactionsByMonth
                        (
                                @FromDate                   DATE                =           NULL        , 
                                @ToDate                     DATE                =           NULL        ,
                                @TxnCategory                VARCHAR(7)          =           'CAT1'      ,
                                @AccountType1               VARCHAR(21)         =           NULL        ,
                                @AccountType2               VARCHAR(21)         =           NULL        ,
                                @AccountType3               VARCHAR(21)         =           NULL        ,
                                @Debug                      BIT                 =           0           
                        ) 
    
    WITH RECOMPILE
    
                AS
                    BEGIN
    
                        DECLARE @True   AS BIT = 1
                        DECLARE @False  AS BIT = 0
    
                        PRINT IIF(@Debug = @True, 'START Procedure - ' + CAST(SYSDATETIMEOFFSET() AS VARCHAR),NULL)
                        IF @Debug = @True
                            BEGIN
                                SET STATISTICS TIME ON
                                SET STATISTICS IO   ON
                                DECLARE @NoCountState int = @@OPTIONS & 512;
                                SET NOCOUNT OFF;
                            END
                        /*================================================================================================================
                            Initialization - Declarations
                          ================================================================================================================*/
                        DECLARE         @MinDateN                       FLOAT
                        DECLARE         @MaxDateN                       FLOAT
                        DECLARE         @MinDateD                       DATE
                        DECLARE         @MaxDateD                       DATE
                        /*================================================================================================================
                            Initialization - Establish Date Ranges
                          ================================================================================================================*/
                        SET @FromDate   =   DATEADD(MM, DATEDIFF(MM, 0, @FromDate), 0)   --Set @FromDate to first of the requested month
                        SET @MinDateN   =   FLOOR(CAST(CAST(@FromDate as DateTime) as float)) 
                        SET @ToDate     =   DATEADD(MM, DATEDIFF(MM, 0, @ToDate), 0)     --Set @ToDate to first of the requested month
                        SET @MaxDateN   =   FLOOR(CAST(CAST(@ToDate as DateTime) as float)) 
                        SET @MinDateD = CAST(FLOOR(CAST(@MinDateN AS FLOAT)) AS DATETIME)  --For output only
                        SET @MaxDateD = CAST(FLOOR(CAST(@MaxDateN AS FLOAT)) AS DATETIME) --For output only
    
                        ; WITH CTE_MonthsRollUp AS (
    
                            SELECT DISTINCT
                                            MonthsTable.NFirstDayOfMonth        AS      PeriodStartN            ,
                                            MonthsTable.NLastDayOfMonth         AS      PeriodEndN              ,
                                            MonthsTable.FirstDayOfMonth         AS      PeriodStartD            ,
                                            MonthsTable.LastDayOfMonth          AS      PeriodEndD              ,
                                            TransactionDetail.AccountId         AS      AccountId           ,
                                            TransactionDetail.AcctCategory      AS      AcctCategory
    
                                      FROM  dbo.Months                          AS      MonthsTable
    
                                CROSS JOIN  dbo.tblTxnDetail                    AS      TransactionDetail
    
                                     WHERE  TransactionDetail.AccountId     IS      NOT NULL
                                     AND    TransactionDetail.AcctCategory       =      @AcctCategory
                                     AND    TransactionDetail.AccountType       IN      (
                                                                                            @AccountType1       ,
                                                                                            @AccountType2       ,
                                                                                            @AccountType3       ,
                                                                                        )
    
                                       AND  MonthsTable.NFirstDayOfMonth        <=      @MaxDateN
                                       AND  MonthsTable.NLastDayOfMonth         >=      @MinDateN
                                       AND  MonthsTable.NLastDayOfMonth         >=      (
                                                                                            SELECT  MIN(NFirstDayOfMonth)           AS  EarliestTxnDateN
                                                                                            FROM    tblTxnDetail        AS  ValidateAccount
                                                                                            WHERE   ValidateAccount.AccountId   =   TransactionDetail.AccountId
                                                                                        )
                            GROUP BY
                                            MonthsTable.NFirstDayOfMonth                                        ,
                                            MonthsTable.NLastDayOfMonth                                         ,
                                            MonthsTable.FirstDayOfMonth                                         ,
                                            MonthsTable.LastDayOfMonth                                          ,
                                            TransactionDetail.AccountId                                     ,
                                            TransactionDetail.AcctCategory
    
                        )
    
                        SELECT      MonthsRollupResults.* 
                          INTO      #TMonthsRollup
                          FROM      CTE_MonthsRollUp MonthsRollupResults
                        OPTION     (RECOMPILE)
    
                        ;           
                        CREATE NONCLUSTERED INDEX   [#idxTMonthsRollup_AccountIDandTxnCategory_Join]
                                               ON   [dbo].[#TMonthsRollup] ([AccountId],[AcctCategory], PeriodStartN)
                        INCLUDE                     (PeriodStartD, PeriodEndD)
                        ;
    
                        ; WITH CTE_AccountList  AS  (
                                                        SELECT DISTINCT     DistinctAccountList.NFirstDayOfMonth        AS      PeriodStartN            ,
                                                                            DistinctAccountList.NLastDayOfMonth         AS      PeriodEndN              ,
                                                                            DistinctAccountList.AccountId               AS      AccountId               , 
                                                                            DistinctAccountList.AcctCategory            AS      AcctCategory            ,
                                                                            DistinctAccountList.AccountType             AS      AccountType             ,
                                                                            DistinctAccountList.Account                 AS      AccountName             ,
                                                                            DistinctAccountList.ACCOUNTL1               AS      AccountHierarchyL1      ,
                                                                            DistinctAccountList.ACCOUNTL2               AS      AccountHierarchyL2      ,
                                                                            DistinctAccountList.ACCOUNTL3               AS      AccountHierarchyL3      ,
                                                                            SUM(DistinctAccountList.Amount)             AS      PeriodAmount        
    
    
                                                                   FROM     tblTxnDetail                                AS      DistinctAccountList
    
                                                                  WHERE     DistinctAccountList.NFirstDayOfMonth        <=      @MaxDateN
                                                                    AND     DistinctAccountList.NLastDayOfMonth         >=      @MinDateN
                                                                    AND     DistinctAccountList.AccountId               IS      NOT NULL
                                                                    AND     DistinctAccountList.AcctCategory            =       @AcctCategory
                                                                    AND     DistinctAccountList.AccountType             IN      (
                                                                                                                                    @AccountType1       ,
                                                                                                                                    @AccountType2       ,
                                                                                                                                    @AccountType3       ,
    
                                                                                                                                )
    
                                                               GROUP BY     DistinctAccountList.NFirstDayOfMonth                                ,
                                                                            DistinctAccountList.NLastDayOfMonth                                 ,
                                                                            DistinctAccountList.AccountId                                       ,
                                                                            DistinctAccountList.AcctCategory                                    ,
                                                                            DistinctAccountList.AccountType                                     ,
                                                                            DistinctAccountList.Account                                         ,
                                                                            DistinctAccountList.ACCOUNTL1                                       ,
                                                                            DistinctAccountList.ACCOUNTL2                                       ,
                                                                            DistinctAccountList.ACCOUNTL3                                       
    
    
                        ) 
    
    
                        SELECT      DistinctAccountList.* 
                          INTO      #TAccountList
                          FROM      CTE_AccountList DistinctAccountList
                        ;
    
                        CREATE NONCLUSTERED INDEX   [#idxTAccountList_DistincAccountDetailsList_For_Join]
                                               ON   [dbo].[#TAccountList] ([AccountId],[AcctCategory], PeriodStartN)
                        INCLUDE                     (AccountName, PeriodCredit, PeriodDebit, PeriodAmount, AccountType, 
                                                     AccountHierarchyL1,AccountHierarchyL2,AccountHierarchyL3,AccountHierarchyL4,
                                                     AccountHierarchyL5, BSReportHierarchyL1, BSReportHierarchyL2, BSReportHierarchyL3,
                                                     BSReportHierarchyL4, PLReportHierarchyL1, PLReportHierarchyL2, PLReportHierarchyL3
                                                    )
                        ;
                        SELECT DISTINCT         'Month'                             AS      Period              ,
                                                @MinDateD                           AS      ReportStart         ,
                                                @MaxDateD                           AS      ReportEnd           ,   
                                                tMonthRollup.AccountId              AS      AccountId           , 
                                                tMonthRollup.AcctCategory           AS      AcctCategory        , 
                                                tMonthRollup.PeriodStartD           AS      PeriodStart         , 
                                                tMonthRollup.PeriodEndD             AS      PeriodEnd           , 
                                                AccountList.AccountName             AS      AccountName         ,
                                                AccountList.PeriodAmount            AS      PeriodAmount        ,
                                                AccountList.AccountType             AS      AccountType         ,
                                                AccountList.AccountHierarchyL1      AS      AccountHierarchyL1  ,
                                                AccountList.AccountHierarchyL2      AS      AccountHierarchyL2  ,
                                                AccountList.AccountHierarchyL3      AS      AccountHierarchyL3  
    
    
                                   FROM         #TAccountList                       AS      AccountList
                       RIGHT OUTER JOIN         #TMonthsRollup                      AS      tMonthRollup
    
                                     ON         AccountList.AccountId           =       tMonthRollup.AccountId
                                    AND         Accountlist.AcctCategory                =       tMonthRollup.AcctCategory
                                    AND         AccountList.PeriodStartN            =       tMonthRollup.PeriodStartN
    
                --------------------------------------------------------
                PRINT IIF(@Debug = @True, 'END PROCEDURE - ' + CAST(SYSDATETIMEOFFSET() AS VARCHAR),NULL)
                --------------------------------------------------------
                IF @Debug = @True
                    BEGIN
                        SET STATISTICS TIME OFF
                        SET STATISTICS IO   OFF
                        IF @NoCountState <> 0 
                            SET NOCOUNT ON
                    END
    END
    
    2 回复  |  直到 10 年前
        1
  •  2
  •   Lmu92    10 年前

    下面是如何执行汇总并将其与日历表合并的编码版本:

                WITH cte_Rollup as
               (SELECT TxnR.TxnTimePeriod   TimePeriod,
                       TxnR.TxnPeriodStart  Start,
                       TxnR.TxnPeriodEnd    End,
                       TxnR.TxnCategory     Category,
                       TxnR.TxnAccountName  Account,
                       SUM(TxnRAmount)      Amount
                From TxnRollups TxnR
                WHERE 
                       TxnR.TxnTImePeriod = @SummaryPeriod AND
                       TxnR.TxnPeriodEnd BETWEEN @FromDate AND @ToDate
                GROUP BY
                       TxnR.TxnTimePeriod,
                       TxnR.TxnPeriodStart,
                       TxnR.TxnPeriodEnd,
                       TxnR.TxnCategory,
                       TxnR.TxnAccountName
                ), cte_Calendar AS
                (
                SELECT cal.Period, Min(cal.PeriodDate) as PeriodStart, Max(cal.PeriodDate) as PeriodEnd
                FROM calendar cal
                WHERE  cal.PeriodDate BETWEEN @FromDate AND @ToDate AND cal.Period = @SummaryPeriod
                GROUP BY cal.TImePeriod,cal.MonthValue
                )
                SELECT *
                FROM cte_Calendar
                LEFT OUTER JOIN cte_Rollup  ON cte_Calendar.PeriodStart = cte_Rollup.TxnPeriodStart
    
        2
  •  0
  •   Lmu92    10 年前

    我建议使用一个索引月份表,其中只有列:开始日期和结束日期。 然后对TxnRollups表执行简单的左外联接。 在月份表中添加聚集索引(StartDate和EndDate)以及TxnPeriodStart和TxnPeriedEnd上的TxnRollups上的非聚集索引将有助于进一步加快速度。