How do I use group_concat over an entire subquery?

mysql python

MatrixManAtYrService · Tech Community · 6 years ago

...without unnecessary comparisons

I want to get the md5 hash of a range of rows. Because of bandwidth constraints, I want the hashing to happen on the server side.

This works:

create table some_table (id int auto_increment,
                         col1 varchar(1),
                         col2 int,
                         primary key (id));

insert into some_table (col1, col2)
                values ('a', 1),
                       ('b', 11),
                       ('c', 12),
                       ('d', 25),
                       ('e', 50);

select group_concat(id,col1,col2) from
    (select * from some_table
     where id >= 2 and id < 5
     order by id desc) as some_table
group by 1 = 1;

Output:

+----------------------------+
| group_concat(id,col1,col2) |
+----------------------------+
| 2b11,3c12,4d25             |
+----------------------------+

Adding the hash:

select md5(group_concat(id,col1,col2)) from
    (select * from some_table
     where id >= 2 and id < 5
     order by id desc) as some_table
group by 1 = 1;

Output:

+----------------------------------+
| md5(group_concat(id,col1,col2))  |
+----------------------------------+
| 32c1f1dd34d3ebd33ca7d95f3411888e |
+----------------------------------+

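But I feel like there should be a better way.

In particular, I'd like to avoid comparing 1 with 1 millions of times over. That comparison is what I found necessary to put the range of rows into a single group, which I need in order to use group_concat, which I need in order to use md5 on multiple rows.

Is there a way to use group_concat (or something similar) on a range of rows, without the unnecessary comparisons?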

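Edit

I want to hash multiple rows so that I can compare the resulting hashes across different servers. If the hashes differ, I can conclude that the rows returned by the subquery differ.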
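
To make the comparison concrete, here is a minimal Python sketch. It assumes each server has already returned its digest; the digests are computed locally here only to stand in for the servers' answers. The function name range_fingerprint and the sample rows for server_b are hypothetical, and the emulation only covers rows without NULLs.

```python
import hashlib

def range_fingerprint(rows):
    """Mimic md5(group_concat(id, col1, col2)) for rows that contain no NULLs."""
    # group_concat joins the per-row concatenations with commas by default.
    joined = ",".join("".join(str(value) for value in row) for row in rows)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

# Stand-ins for what two servers hold for the same id range (hypothetical data).
server_a = [(2, "b", 11), (3, "c", 12), (4, "d", 25)]
server_b = [(2, "b", 11), (3, "c", 13), (4, "d", 25)]  # col2 differs for id 3

# Equal digests suggest identical rows; different digests prove a difference.
print(range_fingerprint(server_a) == range_fingerprint(server_a))  # True
print(range_fingerprint(server_a) == range_fingerprint(server_b))  # False
```

Only the 32-character digest has to cross the network, which is the point of hashing on the server.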

1 answer | as of 6 years ago

MatrixManAtYrService 6 years ago

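The solution was to omit group by 1 = 1 entirely. I had assumed that group_concat required me to provide it with a group, but it can be used directly on the subquery, like so: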

select group_concat(id,col1,col2) from
    (select * from some_table
     where id >= 2 and id < 5
     order by id desc) as some_table;

Note that null values need to be converted to something concat-friendly, like so:

insert into some_table (col1, col2)
                values ('a', 1),
                       ('b', 11),
                       ('c', NULL),
                       ('d', 25),
                       ('e', 50);

select group_concat(id, col1, col2) from
    (select id, col1, ifnull(col2, 'NULL') as col2
     from some_table
     where id >= 2 and id < 5
     order by id desc) as some_table;

Output:

+------------------------------+
| group_concat(id, col1, col2) |
+------------------------------+
| 2b11,3cNULL,4d25             |
+------------------------------+
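
Why the conversion matters: in MySQL, concatenating with a NULL yields NULL, and group_concat skips NULL results, so a row containing a NULL would silently drop out of the fingerprint. The Python sketch below only emulates that behavior for illustration; mysql_concat and mysql_group_concat are hypothetical helper names, not MySQL client APIs.

```python
def mysql_concat(*values):
    """Emulate MySQL CONCAT: the result is NULL (None) if any argument is NULL."""
    if any(value is None for value in values):
        return None
    return "".join(str(value) for value in values)

def mysql_group_concat(rows):
    """Emulate GROUP_CONCAT(a, b, ...): rows whose concatenation is NULL are skipped."""
    kept = [part for part in (mysql_concat(*row) for row in rows) if part is not None]
    # GROUP_CONCAT itself returns NULL when no non-NULL values remain.
    return ",".join(kept) if kept else None

rows = [(2, "b", 11), (3, "c", None), (4, "d", 25)]

# Without ifnull, the row with id 3 disappears from the result.
print(mysql_group_concat(rows))  # 2b11,4d25

# With the NULL replaced by a placeholder string, every row is represented.
coalesced = [tuple("NULL" if v is None else v for v in row) for row in rows]
print(mysql_group_concat(coalesced))  # 2b11,3cNULL,4d25
```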

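Another caveat: MySQL caps the length of a group_concat result at the value of the variable group_concat_max_len. To hash the concatenation of n table rows, I needed to:

1. Hash each row so that it is represented by 32 characters (a hex MD5 digest), regardless of how many columns it has
2. Ensure that group_concat_max_len > (n * 33) (the extra byte accounts for the added commas)
3. Hash the group_concat of the hashed rows.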
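
The arithmetic behind that bound can be checked with a short Python sketch. It emulates the hash-of-hashes locally rather than running it on the server, and the helper names (md5_hex, required_group_concat_len, hash_of_row_hashes) are made up for illustration. For scale, the default group_concat_max_len of 1024 leaves room for only 31 row digests.

```python
import hashlib

def md5_hex(text):
    """Return the 32-character hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def required_group_concat_len(n_rows):
    """Exact length of n 32-character digests joined by n - 1 commas."""
    return 32 * n_rows + max(n_rows - 1, 0)

def hash_of_row_hashes(row_strings):
    row_digests = [md5_hex(s) for s in row_strings]  # hash each row
    joined = ",".join(row_digests)                   # what group_concat would return
    assert len(joined) == required_group_concat_len(len(row_digests))
    return md5_hex(joined)                           # hash the joined digests

rows = ["2b11", "3cNULL", "4d25"]
print(required_group_concat_len(len(rows)))                   # 98
print(required_group_concat_len(len(rows)) < len(rows) * 33)  # True
print(len(hash_of_row_hashes(rows)))                          # 32
```

The exact requirement is 33n - 1 bytes, so requiring more than n * 33 is a slightly generous but safe margin.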

In the end, I used the client language to inspect the name, number, and nullability of each column, and then built queries like this:

select md5(group_concat(row_fingerprint)) from
    (select concat(id, col1, ifnull(col2, 'null')) as row_fingerprint
     from some_table
     where id >= 2 and id < 5
     order by id desc) as foo;

For more details, you can browse my code here (see the function find_diff_interval).
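
As a rough illustration of that query-building step (not the linked code), here is a minimal Python sketch. The function build_fingerprint_query and its parameters are hypothetical, and it assumes the column metadata has already been fetched as (name, nullable) pairs from a trusted source.

```python
def build_fingerprint_query(table, columns, lo, hi, key="id"):
    """Build the fingerprint query for rows with lo <= key < hi.

    columns: (name, nullable) pairs, in the order they should be concatenated.
    Identifiers are interpolated directly, so they must come from trusted
    metadata, never from user input.
    """
    # Wrap only nullable columns in ifnull so a NULL cannot blank out the row.
    parts = [
        f"ifnull({name}, 'null')" if nullable else name
        for name, nullable in columns
    ]
    return (
        "select md5(group_concat(row_fingerprint)) from\n"
        f"    (select concat({', '.join(parts)}) as row_fingerprint\n"
        f"     from {table}\n"
        f"     where {key} >= {int(lo)} and {key} < {int(hi)}\n"
        f"     order by {key} desc) as foo;"
    )

query = build_fingerprint_query(
    "some_table", [("id", False), ("col1", False), ("col2", True)], 2, 5
)
print(query)
```

For the example table, this prints the same query text as the one shown above.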