代码之家  ›  专栏  ›  技术社区  ›  Salman Arshad

确定一个范围是否完全被一组范围覆盖

  •  2
  • Salman Arshad  · 技术社区  · 6 年前

    如何检查范围是否正确 完全覆盖 通过一系列的范围。在以下示例中:

    WITH ranges(id, a, b) AS (
        SELECT 1,  0, 40 UNION
        SELECT 2, 40, 60 UNION
        SELECT 3, 80, 100 UNION
        SELECT 4, 10, 30
    ), tests(id, a, b) AS (
        SELECT 1, 10, 90 UNION
        SELECT 2, 10, 60
    )
    SELECT *
    FROM tests
    WHERE -- ?
    
    • 10, 60 因为所有这些都在 0, 40 40, 60 (和 10, 30
    • 我想排除 10, 90 因为它暴露在 60, 80

    假设 a 包容性强 b 是独占的,即值 40 属于 [40, 60) 而不是 [0, 40) . 范围可以包含间隙和所有类型的重叠。

    实际问题涉及日期+时间数据,但日期只是数字。我使用的是SQL server,但首选通用解决方案。

    4 回复  |  直到 6 年前
        1
  •  1
  •   Thorsten Kettner    6 年前

    您需要一个递归查询来查找实际范围(在您的例子中是0到60和80到100)。我们从给定的范围开始,寻找扩展这些范围的范围。最后,我们坚持使用最宽的范围(例如,范围10到30可以扩展到0到40,然后再扩展到0到60,所以我们保持最宽的范围0到60)。

    with wider_ranges(a, b, grp) as
    (
      select a, b, id from ranges
      union all
      select
        case when r1.a < r2.a then r1.a else r2.a end,
        case when r1.b > r2.b then r1.b else r2.b end,
        r1.grp
      from wider_ranges r1
      join ranges r2 on (r2.a < r1.a and r2.b >= r1.a)
                     or (r2.b > r1.b and r2.a <= r1.b)
    )
    , real_ranges(a, b) as
    (
      select distinct min(a), max(b)
      from wider_ranges
      group by grp
    )
    select * 
    from tests
    where exists
    (
      select *
      from real_ranges
      where tests.a >= real_ranges.a and tests.b <= real_ranges.b
    );
    

    Rextester演示: http://rextester.com/BDJA16583

    根据要求,这可以在SQLServer中工作,但是是标准的SQL,因此它应该可以在几乎所有具有递归查询的DBMS中工作。

        2
  •  2
  •   GreyOrGray    6 年前

    WITH ranges(id, a, b) AS (
        SELECT 1,  0, 40 UNION
        SELECT 2, 40, 60 UNION
        SELECT 3, 80, 100 UNION
        SELECT 4, 10, 30 
    ), tests(id, a, b) AS
    (   
            SELECT 1 as id, 10 as a, 90 as b
            UNION
            SELECT 2, 10, 60
    ), rangeFinder(a, b, ra, rfb) AS
    (
        SELECT a, b, 0 AS ra, 0 AS rfb 
        FROM ranges AS r
        UNION ALL
        SELECT rangeFinder.a, ranges.b, ranges.a, rangeFinder.b 
        FROM ranges 
        JOIN rangeFinder
            ON ranges.b > rangeFinder.b
            AND ranges.a <=rangeFinder.b
    ), islands(a, b) AS
    (
        SELECT a, b 
        FROM rangeFinder
        WHERE a NOT IN (SELECT ra FROM rangeFinder)
            AND b NOT IN (SELECT rfb FROM rangeFinder)
    )
    SELECT t.id, t.a, t.b FROM 
    tests t
    JOIN islands i
    ON t.a >= i.a
    AND t.b <= i.b
    

    此处演示: http://rextester.com/HDQ52126

        3
  •  0
  •   Gordon Linoff    6 年前

    • 获取不在范围内的所有点的列表。这是所有范围的开始和结束。
    • 检查是否有任何不在范围内。

    这是基于不在某个范围内的点将是任一表中包含的数字之一的操作:

    with tc as (
          select t.test, r.candidate
          from tests t join
               (select r.a as candidate from ranges union all
                select r.b from ranges
               ) r
               on r.candiate >= t.a and r.candidate < t.b
          union all
          select t.test, t.a
          from tests t
          union all
          select t.test, t.b
          from tests t
         )
    select distinct tc.test
    from tc
    where not exists (select 1
                      from ranges r
                      where tc.candidate >= r.a and tc.candidate < r.b
                     );
    

    因为范围包括第一项,所以实际上不必检查它。因此候选人名单可以减少:

    with tc as (
          select t.test, r.candidate
          from tests t join
               (select r.b as candidate from ranges
               ) r
               on r.candidate >= t.a and r.r < t.b
          union all
          select t.test, t.a
          from tests t
          union all
          select t.test, t.b
          from tests t
         )
    
        4
  •  0
  •   Salman Arshad    6 年前

    如接受答案中所述,解决方案是将重叠范围合并在一起,然后确定测试范围是否存在于其中一个合并范围内。

    除了连接和/或递归之外,还可以使用 sorting approach 使用窗口函数合并重叠范围:

    WITH ranges(id, a, b) AS (
        SELECT 1,  0, 40 UNION
        SELECT 2, 40, 60 UNION
        SELECT 3, 80, 100 UNION
        SELECT 4, 10, 30
    ), tests(id, a, b) AS (
        SELECT 1, 10, 90 UNION
        SELECT 2, 10, 60
    ), ranges_chg AS (
        SELECT *, CASE WHEN MAX(b) OVER (ORDER BY a ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) >= a THEN 0 ELSE 1 END AS chg
        FROM ranges
    ), ranges_grp AS(
        SELECT *, SUM(chg) OVER (ORDER BY a) AS grp
        FROM ranges_chg
    ), merged_ranges AS (
        SELECT MIN(a) AS a, MAX(b) AS b
        FROM ranges_grp
        GROUP BY grp
    )
    SELECT *
    FROM tests
    WHERE EXISTS (
        SELECT 1
        FROM merged_ranges
        WHERE merged_ranges.a <= tests.a AND tests.b <= merged_ranges.b
    )
    

    结果和 Fiddle

    | id | a  | b  |
    |----|----|----|
    | 2  | 10 | 60 |
    

    里面的数据 range_grp CTE将让您了解其工作原理:

    | id | a  | b   | max b over... | chg | grp |
    |----|----|-----|---------------|-----|-----|
    | 1  | 0  | 40  | NULL          | 1   | 1   |
    | 4  | 10 | 30  | 40            | 0   | 1   |
    | 2  | 40 | 60  | 40            | 0   | 1   |
    | 3  | 80 | 100 | 60            | 1   | 2   |