代码之家  ›  专栏  ›  技术社区  ›  beingalex

我正在使用NOT IN,但速度很慢

  •  2
  • beingalex  · 技术社区  · 7 年前

    客户关系管理 表格示例:

    `crm` example:
    +----+--------+---------------------+--------------------+
    | id | name   |         date        |      status        |
    +----+--------+---------------------+--------------------+
    | 1  | john   | 2017-12-27 10:58:10 | A status           |
    | 2  | steve  | 2017-12-27 10:58:08 | A status           |
    | 3  | eric   | 2017-12-27 10:58:04 | Delivery Arranged  |
    | 4  | phil   | 2017-12-27 10:57:55 | A status           |
    | 5  | bob    | 2017-12-27 10:57:52 | A status           |
    | 6  | foo    | 2017-12-27 10:57:50 | A status           |
    | 7  | steven | 2017-12-27 10:57:48 | Delivery Arranged  |
    | 8  | paul   | 2017-12-27 10:57:43 | A status           |
    | 9  | alex   | 2017-12-27 10:57:31 | Delivery Arranged  |
    

    我的查询的目的是返回 crm 行,其中 status 已安排交货 ,以及 date 介于 2017-12-01 2018-01-01 .

    这是我的主要 查询 :

    SET @from='2017-12-01';
    SET @to='2018-01-01';
    
    
    SELECT 
            COUNT(*) AS `delivery_arranged`
        FROM
            `crm` a
        WHERE
            a.`status` = 'Delivery Arranged'
                AND DATE(a.`date`) BETWEEN @from AND @to
    

    结果:

    +---------------------+
    |   delivery_arranged |
    +---------------------+
    | 30                  |
    

    很好。但我想 折扣 以前有一次的行(实际上,除了此日期范围之外)被设置为 已安排交货 . 我有一个 状态日志 我可以用于此的表:

    状态日志 表格示例:

    `statuslog` example:
    +--------+-------+---------------------+-----------+---------------------+
    |   id   | crmid |        date         |   user    |        status       |
    +--------+-------+---------------------+-----------+---------------------+
    | 818572 | 1     | 2017-12-27 10:58:10 | johnsmith  | Some status change |
    | 818571 | 2     | 2017-12-27 10:58:08 | johnsmith  | Some status change |
    | 818570 | 3     | 2017-12-27 10:58:04 | another    | Delivery Arranged  |
    | 818569 | 4     | 2017-12-27 10:57:55 | another    | Delivery Arranged  |
    | 818568 | 5     | 2017-12-27 10:57:52 | johnsmith  | Some status change | 
    | 818567 | 6     | 2017-12-27 10:57:50 | another    | Some status change |
    | 818566 | 7     | 2017-12-27 10:57:48 | johnsmith  | Delivery Arranged  |
    | 818565 | 8     | 2017-12-27 10:57:43 | another    | Some status change |
    | 818564 | 9     | 2017-12-27 10:57:31 | johnsmith  | Some status change |
    

    通过这个表,我可以从 statuslog 不在日期范围内,然后执行 NOT IN :

    SELECT 
            COUNT(*) AS `delivery_arranged`
        FROM
            `crm` a
        WHERE
            a.`status` = 'Delivery Arranged'
                AND DATE(a.`date`) BETWEEN @from AND @to
                AND a.`id` 
                NOT IN (
                SELECT 
                    a.crmid AS `crmid`
                FROM
                    statuslog a
                WHERE
                    a.status = 'Delivery Arranged'
                        AND DATE(a.`date`) NOT BETWEEN @from AND @to
                GROUP BY a.crmid
                ORDER BY a.`date` DESC
                )
    

    这是可行的,但取决于日期范围的大小,它可能需要很长时间! 状态日志 已(>);2000000行。

    如何使此查询更快?

    2 回复  |  直到 7 年前
        1
  •  0
  •   dani herrera    7 年前

    左联接可能比代理子查询更有效:

    SELECT 
        COUNT(*) AS `delivery_arranged`
    FROM
        `crm` a
    LEFT OUTER JOIN
        (
            SELECT 
                a.crmid AS `crmid`
            FROM
                statuslog a
            WHERE
                a.status = 'Delivery Arranged'
                    AND DATE(a.`date`) NOT BETWEEN @from AND @to
            GROUP BY a.crmid
            --ORDER BY a.`date` DESC --<-- this has no sense
        ) b
      on a.`id` = b.crmid
    WHERE
        b.crmid is null and   --<- not int translated to left join
        a.`status` = 'Delivery Arranged'
        AND DATE(a.`date`) BETWEEN @from AND @to
    

    此外,请记住使用正确的索引。

        2
  •  0
  •   Gordon Linoff    7 年前

    如果使用 LEFT JOIN / WHERE :

    SELECT COUNT(*) AS delivery_arranged
    FROM crm c LEFT JOIN
         statuslog sl
         ON sl.crmid = c.id AND
            sl.status = 'Delivery Arranged'
            sl.date >= @from AND
            sl.date < @to + INTERVAL 1 DAY
    WHERE c.status = 'Delivery Arranged' AND
          c.date >= @from AND
          c.date < @to + INTERVAL 1 DAY AND
          sl.crmid IS NULL;
    

    对于此版本,您需要索引 crm(status, date, id) statuslog(crmid, status, date) .

    请注意,这会更改日期比较,以避免对列进行函数调用。这使得使用包含 date