这种排列有点混乱,每个相关位都有一个明显的移动距离。排列图如下所示(输出顶行):
不过,这确实说明了一些方法。如果我们靠近顶部看,每个“组”都是通过按升序从输入中收集一些位形成的,因此可以用7来完成
compress_right
操作aka
PEXT
所以如果
佩克斯
uint64_t g0 = _pext_u64(in, 0x8080808);
uint64_t g1 = _pext_u64(in, 0x404040404);
uint64_t g2 = _pext_u64(in, 0x20202020202);
uint64_t g3 = _pext_u64(in, 0x1010101010101);
uint64_t g4 = _pext_u64(in, 0x808080808080);
uint64_t g5 = _pext_u64(in, 0x404040404000);
uint64_t g6 = _pext_u64(in, 0x202020200000);
uint64_t out = g0 | (g1 << 7) | (g2 << 14) | (g3 << 21) |
(g4 << 28) | (g5 << 35) | (g6 << 42);
这种排列不能通过蝶形网络进行路由,但是Bene网络是通用的,因此可以工作。
所以可以用11个
these
置换步骤,也称为增量交换:
word bit_permute_step(word source, word mask, int shift) {
word t;
t = ((source >> shift) ^ source) & mask;
return (source ^ t) ^ (t << shift);
}
x = bit_permute_step(x, 0x1001400550054005, 1);
x = bit_permute_step(x, 0x2213223111023221, 2);
x = bit_permute_step(x, 0x01010B020104090E, 4);
x = bit_permute_step(x, 0x002900C400A7007B, 8);
x = bit_permute_step(x, 0x00000A0400002691, 16);
x = bit_permute_step(x, 0x0000000040203CAD, 32);
x = bit_permute_step(x, 0x0000530800001CE0, 16);
x = bit_permute_step(x, 0x000C001400250009, 8);
x = bit_permute_step(x, 0x0C00010403080104, 4);
x = bit_permute_step(x, 0x2012000011100100, 2);
x = bit_permute_step(x, 0x0141040000000010, 1);