代码之家  ›  专栏  ›  技术社区  ›  Carl Smotricz

SQL搜索和销毁重复项

  •  0
  • Carl Smotricz  · 技术社区  · 15 年前

    我有一个带有字段的表(简化):

    id, fld1, fld2, fld3.
    

    id 是数字主键字段。

    有重复项: 身份证件 不同但 fld1 , fld2 fld3 两行或多行相同。当然,也有一些条目的值只出现一次,即不重复。

    在每一组重复的条目中,我只想保留ID最高的条目。我计划先列出注定要失败的行,然后删除它们。

    我第一次尝试的是:

    SELECT * FROM tab1 t1 WHERE EXISTS (
      SELECT COUNT(*) FROM tab1 t2
      WHERE t1.fld1 = t2.fld1 AND t1.fld2 = t2.fld2 AND t1.fld3 = t2.fld3
        AND t1.id < MAX(t2.id)
      HAVING COUNT(*) > 1
      GROUP BY t2.fld1, t2.fld2, t2.fld3)
    

    但是(在甲骨文中)我得到了 Missing right parenthesis 错误消息。我认为这完全需要一种新的方法,但我的SQL fu不能胜任这项任务。感谢您的帮助!


    编辑:

    使用“真实”数据字段:

    select x.leg_id, x.airline_des, x.flight_nr, x.suffix, x.flight_id_date, x.lt_flight_id_date
    from fdb_leg x
    join ( select max(t.leg_id) 'max_id', 
    t.airline_des, t.flight_nr, t.suffix, t.flight_id_date, t.lt_flight_id_date
    from fdb_leg t
    group by t.airline_des, t.flight_nr, t.suffix, t.flight_id_date, t.lt_flight_id_date
    having count(*) > 1) y on y.max_id > x.leg_id
    and y.airline_des = x.airline_des and y.flight_nr = x.flight_nr and y.suffix = x.suffix
    and y.flight_id_date = x.flight_id_date and x.lt_flight_id_date = y.lt_flight_id_date
    

    反应是:

    ORA-00923:在预期的位置找不到FROM关键字

    3 回复  |  直到 15 年前
        1
  •  2
  •   OMG Ponies    15 年前

    Oracle 9i+,用于:


    要获取注定失败的条目列表,请使用:

    WITH keepers AS (
       SELECT MAX(t.id) 'max_id',
              t.fld1, t.fld2, t.fld3
         FROM TABLE_1 t
     GROUP BY t.fld1, t.fld2, t.fld3
       HAVING COUNT(*) > 1)
    SELECT x.id,
           x.fld1, x.fld2, x.fld3
      FROM TABLE_1 x
      JOIN keepers y ON y.max_id > x.id
                    AND y.fld1 = x.fld1
                    AND y.fld2 = x.fld2
                    AND y.fld3 = x.fld3
    

    不含等价物:


    要获取注定失败的条目列表,请使用:

    SELECT x.id,
           x.fld1, x.fld2, x.fld3
      FROM TABLE_1 x
      JOIN (SELECT MAX(t.id) 'max_id',
                   t.fld1, t.fld2, t.fld3
              FROM TABLE_1 t
          GROUP BY t.fld1, t.fld2, t.fld3
            HAVING COUNT(*) > 1) y ON y.max_id > x.id
                                  AND y.fld1 = x.fld1
                                  AND y.fld2 = x.fld2
                                  AND y.fld3 = x.fld3
    
        2
  •  2
  •   Rob van Wijk    15 年前

    您可以一次删除它们,如下所示:

    SQL> create table mytable (id, fld1, fld2, fld3)
      2  as
      3  select 1, 1, 1, 1 from dual union all
      4  select 2, 1, 1, 1 from dual union all
      5  select 3, 2, 2, 2 from dual union all
      6  select 4, 2, 3, 2 from dual union all
      7  select 5, 2, 3, 2 from dual union all
      8  select 6, 2, 3, 2 from dual
      9  /
    
    Table created.
    
    SQL> delete mytable
      2   where id not in
      3         ( select max(id)
      4             from mytable
      5            group by fld1
      6                , fld2
      7                , fld3
      8         )
      9  /
    
    3 rows deleted.
    
    SQL> select * from mytable
      2  /
    
            ID       FLD1       FLD2       FLD3
    ---------- ---------- ---------- ----------
             2          1          1          1
             3          2          2          2
             6          2          3          2
    
    3 rows selected.
    

    当做, 罗布。

        3
  •  -1
  •   JeffO    15 年前

    呃,我明白了。搔那个。

    这将标识需要删除的ID。

    Select 
           fld1
         , fld2
         , fld3
         , Max(ID)
    From table_name
    Group By
           fld1
         , fld2
         , fld3