代码之家  ›  专栏  ›  技术社区  ›  user8834780

在每个ID的每一行中创建一个基于逻辑的失效概念

  •  0
  • user8834780  · 技术社区  · 6 年前

    我想去一个 lapsed_date 这是指给定时间内有12周(即84天)以上 ID 之间:

    1) onboarded_at 和当前日期(如果没有 applied_at 存在)-这意味着如果超过84天,现在就失效了

    2) 船上的 和闽( 应用于 )(如果有)

    3)每个连续 应用于

    4)最大值(max) 应用于 ) current_date -这意味着如果超过84天,现在就失效了。

    如果有多个他失效的实例,那么我们只显示最近的失效日期。

    我的尝试在大多数情况下都有效,但并非所有情况下都有效。你能协助使它普遍工作吗?

    样本集:

    CREATE TABLE #t
    (
      id VARCHAR(10),
      rank INTEGER,
      onboarded_at DATE,
      applied_at DATE
      );
    
    INSERT INTO #t VALUES
    ('A',1,'20180101','20180402'),
    ('A',2,'20180101','20180403'),
    ('A',3,'20180101','20180504'),
    ('B',1,'20180201','20180801'),
    ('C',1,'20180301','20180401'),
    ('C',2,'20180301','20180501'),
    ('C',3,'20180301','20180901'),
    ('D',1,'20180401',null)
    

    最佳尝试:

    SELECT onb.id,
    onb.rank,
    onb.onboarded_at,
    onb.applied_at,
    onb.lapsed_now,
    CASE WHEN lapsed_now = 1 OR lapsed_previous = 1
        THEN 1
        ELSE 0
    END lapsed_ever,
    CASE WHEN lapsed_now = 1
        THEN DATEADD(DAY, 84, lapsed_now_date)
        ELSE min_applied_at_add_84
    END lapsed_date
    FROM
    (SELECT *,
    CASE
        WHEN DATEDIFF(DAY, onboarded_at, MIN(ISNULL(applied_at, onboarded_at)) over (PARTITION BY id)) >= 84
            THEN 1
        WHEN DATEDIFF(DAY, MAX(applied_at) OVER (PARTITION BY id), GETDATE()) >= 84
            THEN 1
        ELSE 0
    END lapsed_now,
    CASE
        WHEN MAX(DATEDIFF(DAY, onboarded_at, ISNULL(applied_at, GETDATE()))) OVER (PARTITION BY id) >= 84
            THEN 1
        ELSE 0
    END lapsed_previous,
    MAX(applied_at) OVER (PARTITION BY id) lapsed_now_date,
    DATEADD(DAY, 84, MIN(CASE WHEN applied_at IS NULL THEN onboarded_at ELSE applied_at END) OVER (PARTITION BY id)) min_applied_at_add_84
    FROM #t
    ) onb
    

    当前解决方案:

    id  rank    onboarded_at    applied_at  lapsed_now  lapsed_ever lapsed_date
    A   1       2018-01-01      2018-04-02  1           1           2018-07-27
    A   2       2018-01-01      2018-04-03  1           1           2018-07-27
    A   3       2018-01-01      2018-05-04  1           1           2018-07-27
    B   2       2018-02-01      2018-08-01  1           1           2018-10-24
    C   1       2018-03-01      2018-04-01  0           1           2018-06-24
    C   2       2018-03-01      2018-05-01  0           1           2018-06-24
    C   3       2018-03-01      2018-09-01  0           1           2018-06-24
    D   1       2018-04-01      null        1           1           2018-06-24
    

    预期解决方案:

    id  rank    onboarded_at    applied_at  lapsed_now  lapsed_ever lapsed_date
    A   1       2018-01-01      2018-04-02   1           1         2018-07-27 (not max lapsed date)
    A   2       2018-01-01      2018-04-03   1           1         2018-07-27
    A   3       2018-01-01      2018-05-04   1           1         2018-07-27 (May 4 + 84)
    B   1       2018-02-01      2018-08-01   0           1         2018-04-26 (Feb 1 + 84)
    C   1       2018-03-01      2018-04-01   0           1         2018-07-24 
    C   2       2018-03-01      2018-05-01   0           1         2018-07-24 (May 1 + 84)
    C   3       2018-03-01      2018-09-01   0           1         2018-07-24 
    D   1       2018-04-01      null         1           1         2018-06-24
    
    2 回复  |  直到 6 年前
        1
  •  2
  •   Jim Jimson    6 年前

    这里有点猜测,但希望这能解决问题:

    SELECT res.id,
    res.rank,
    res.onboarded_at,
    res.applied_at,
    res.lapsed_now,
    CASE WHEN lapsed_now = 1 OR lapsed_previous = 1
        THEN 1
        ELSE 0
    END lapsed_ever,
    CASE
      WHEN lapsed_now = 1
        THEN DATEADD(DAY, 84, lapsed_now_date)
      WHEN applied_difference_gt84 IS NOT NULL
        THEN DATEADD(DAY, 84, applied_difference_gt84)
      WHEN DATEDIFF(DAY, min_applied_at_add_84, GETDATE()) < 84
        THEN DATEADD(DAY, 84, onboarded_at)
        ELSE min_applied_at_add_84
    END lapsed_date
    FROM (
    SELECT *, MAX(applied_difference) OVER (PARTITION BY id ORDER BY rank ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) applied_difference_gt84
    FROM
    (
    SELECT *,
    CASE
        WHEN DATEDIFF(DAY, onboarded_at, MIN(ISNULL(applied_at, onboarded_at)) over (PARTITION BY id)) >= 84
              AND DATEDIFF(DAY, MAX(applied_at) OVER (PARTITION BY id), GETDATE()) >= 84
            THEN 1
        WHEN DATEDIFF(DAY, ISNULL(MAX(applied_at) OVER (PARTITION BY id), onboarded_at), GETDATE()) >= 84
            THEN 1
        ELSE 0
    END lapsed_now,
    CASE
        WHEN MAX(DATEDIFF(DAY, onboarded_at, ISNULL(applied_at, GETDATE()))) OVER (PARTITION BY id) >= 84
            THEN 1
        ELSE 0
    END lapsed_previous,
     CASE
      WHEN DATEDIFF(MONTH, applied_at, LEAD(applied_at, 1) OVER (PARTITION BY id ORDER BY rank)) >= 2
       THEN applied_at
     ELSE NULL
     END applied_difference,
    ISNULL(MAX(applied_at) OVER (PARTITION BY id), onboarded_at) lapsed_now_date,
    DATEADD(DAY, 84, MIN(CASE WHEN applied_at IS NULL THEN onboarded_at ELSE applied_at END) OVER (PARTITION BY id)) min_applied_at_add_84
    FROM #t
    ) onb
      ) res
    

    结果:

    id  rank    onboarded_at    applied_at  lapsed_now  lapsed_ever lapsed_date
    A   1       2018-01-01      2018-04-02  1           1           2018-07-27
    A   2       2018-01-01      2018-04-03  1           1           2018-07-27
    A   3       2018-01-01      2018-05-04  1           1           2018-07-27
    B   1       2018-02-01      2018-08-01  0           1           2018-04-26
    C   1       2018-03-01      2018-04-01  0           1           2018-07-24
    C   2       2018-03-01      2018-05-01  0           1           2018-07-24
    C   3       2018-03-01      2018-09-01  0           1           2018-07-24
    D   1       2018-04-01      (null)      1           1           2018-06-24
    

    这有点混乱,因为需要计算日期应用程序之间的差异。

        2
  •  1
  •   user8834780    6 年前

    @吉姆,在你的回答的启发下,我创建了以下解决方案。 我认为这很容易理解和直观,知道失效的标准:

    SELECT id, onboarded_at, applied_at, 
    max(case when (zero_applicants is not null and current_date - onboarded_at > 84) or (last_applicant is not null and current_date - last_applicant > 84) then 1 else 0 end) over (partition by id) lapsed_now,
    max(case when (zero_applicants is not null and current_date - onboarded_at > 84) or (one_applicant is not null and applied_at - onboarded_at > 84)
         or (one_applicant is not null and current_date - applied_at > 84) or (next_applicant is not null and next_applicant- applied_at > 84)
         or (last_applicant is not null and current_date - last_applicant > 84) then 1 else 0 end) over(partition by id) lapsed_ever,
    max(case when zero_applicants is not null and current_date - onboarded_at > 84 then onboarded_at + 84 
         when one_applicant is not null and applied_at - onboarded_at > 84 then onboarded_at + 84 
         when one_applicant is not null and current_date - applied_at > 84 then applied_at + 84 
         when next_applicant is not null and next_applicant - applied_at > 84 then applied_at + 84 
         when last_applicant is not null and current_date - last_applicant > 84 then last_applicant + 84 
         end) over (partition by id) lapsed_date
    from (
    select *, 
    case when MAX(applied_at) OVER (PARTITION BY id) is null then onboarded_at end as zero_applicants,
    case when count(applied_at) over(partition by id)=1 then onboarded_at end as one_applicant,
    case when count(applied_at) over(partition by id)>1 then LEAD(applied_at, 1) OVER (PARTITION BY id ORDER BY applied_at) end as next_applicant,
    case when LEAD(applied_at, 1) OVER (PARTITION BY id ORDER BY applied_at) is null then MAX(applied_at) over(partition by id) end as last_applicant
    from #t
    ) res
    order by id, applied_at