代码之家  ›  专栏  ›  技术社区  ›  PJD

在记录组中循环

  •  0
  • PJD  · 技术社区  · 6 年前

    在SQL Server 2014中,我有一个包含许多行的表,例如15,5有一个groupid列736881,10有一个groupid列3084235。我要做的是依次处理每组记录并将结果加载到表中。

    我已经写了代码来做这件事,但我认为我没有设置的loopcounter错误设置,因为我不断得到记录736881的groupid加载两次。

    SELECT @LoopCounter = min(rowfilter) , @maxrowfilter = max(rowfilter) 
    FROM peops6
    
    WHILE ( @LoopCounter IS NOT NULL
            AND  @LoopCounter <= @maxrowfilter)
    
    begin
    
    declare @customer_dist as Table (
        [id] [int] NOT NULL,
        [First_Name] [varchar](50) NULL,
        [Last_Name] [varchar](50) NULL,
        [DoB] [date] NULL,
        [post_code] [varchar](50) NULL,
        [mobile] [varchar](50) NULL,
        [Email] [varchar](100) NULL );
    
    
    INSERT INTO @customer_dist (id, First_Name, Last_Name, DoB, post_code, mobile, Email)
    select id, first_name, last_name, dob, postcode, mobile_phone, email  from peops6 where rowfilter = @LoopCounter
    
    insert into results
    SELECT result.* ,
           [dbo].GetPercentageOfTwoStringMatching(result.DoB, d.DoB) [DOB%match] ,
           [dbo].GetPercentageOfTwoStringMatching(result.post_code, d.post_code) [post_code%match] ,
           [dbo].GetPercentageOfTwoStringMatching(result.mobile, d.mobile) [mobile%match] ,
           [dbo].GetPercentageOfTwoStringMatching(result.Email, d.Email) [email%match]
     FROM   (   SELECT (   SELECT MIN(id)
                          FROM   @customer_dist AS sq
                          WHERE  sq.First_Name = cd.First_Name
                                 AND sq.Last_Name = cd.Last_Name
                                 AND (   sq.DoB = cd.DoB  
                                         OR sq.mobile = cd.mobile
                                         OR sq.Email = cd.Email
                                         OR sq.post_code = cd.post_code )) nid ,
                      *
               FROM   @customer_dist AS cd ) AS result
           INNER JOIN @customer_dist d ON result.nid = d.id order by 1, 2 asc;
    
    SELECT @LoopCounter  = min(rowfilter) FROM peops6
       WHERE rowfilter > @LoopCounter
    
    end 
    
    2 回复  |  直到 6 年前
        1
  •  0
  •   Alex    6 年前

    您需要在循环结束时截断表变量(@customer\u dist):

    ....
    -- Add this
    TRUNCATE TABLE @customer_dist
    
    SELECT @LoopCounter  = min(rowfilter) FROM peops6
       WHERE rowfilter > @LoopCounter
    
    end
    

    请参见: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/42ef20dc-7ad8-44f7-b676-a4596fc0d593/declaring-a-table-variable-inside-a-loop-does-not-delete-the-previous-data?forum=transactsql

        2
  •  0
  •   Eralper    6 年前

    我不确定你是否需要像使用 SQL Cursor 完成这项任务

    请检查下面的SQL语句,其中我使用了多个CTE表达式

    with customer_dist as (
        select
            rowfilter,
            id, first_name, last_name, dob, postcode, mobile_phone, email
        from peops6 
    ), result as (
        SELECT
            (
            SELECT
                MIN(id)
            FROM customer_dist AS sq
            WHERE 
                sq.rowfilter  = cd.rowfilter
            AND sq.First_Name = cd.First_Name
            AND sq.Last_Name  = cd.Last_Name
            AND (sq.DoB = cd.DoB OR sq.mobile_phone = cd.mobile_phone OR sq.Email = cd.Email OR sq.postcode = cd.postcode )
            ) nid,
            *
        FROM customer_dist AS cd
    )
    SELECT 
        result.* ,
        [dbo].edit_distance(result.DoB, d.DoB) [DOB%match] ,
        [dbo].edit_distance(result.postcode, d.postcode) [post_code%match] ,
        [dbo].edit_distance(result.mobile_phone, d.mobile_phone) [mobile%match] ,
        [dbo].edit_distance(result.Email, d.Email) [email%match]
    FROM result
    INNER JOIN customer_dist d 
        ON result.nid = d.id 
    order by 1, 2 asc;
    

    请注意,我使用了模糊字符串匹配 Levenshtein Distance Algorithm

    enter image description here

    只需要在最后一条SELECT语句之前添加INSERT语句

    希望有用