
Cartesian product of the rows of two 2D arrays with different shapes, concatenated horizontally row by row into one array

numpy python


Rick SilentGhost · Tech Community · 3 years ago

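I am fairly new to numpy and am struggling to find a good way to efficiently perform what I think should be a simple task. I suspect numpy has a straightforward way to do this, but after a lot of searching I could not find anything that does it directly.

I have two 2D arrays, like this: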

>>> ident2 = np.identity(2)
>>> ident3 = np.identity(3)
>>> ident2
array([[1., 0.],
       [0., 1.]])
>>> ident3
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

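I want to create an array like the one below: the Cartesian product of the rows of the two arrays above, with each pair of rows concatenated horizontally: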

array([[1, 0, 0, 1, 0],
       [1, 0, 0, 0, 1],
       [0, 1, 0, 1, 0],
       [0, 1, 0, 0, 1],
       [0, 0, 1, 1, 0],
       [0, 0, 1, 0, 1]])

So far I have been able to use itertools.product, like this:

>>> import itertools
>>> x=np.array([*itertools.product(ident3, ident2)], dtype=object)
>>> x
array([[array([1., 0., 0.]), array([1., 0.])],
       [array([1., 0., 0.]), array([0., 1.])],
       [array([0., 1., 0.]), array([1., 0.])],
       [array([0., 1., 0.]), array([0., 1.])],
       [array([0., 0., 1.]), array([1., 0.])],
       [array([0., 0., 1.]), array([0., 1.])]], dtype=object)

However, I am struggling to find a readable, efficient way to concatenate the arrays in each row into the final array. This works:

>>> np.stack([np.concatenate(arrays) for arrays in x])
array([[1., 0., 0., 1., 0.],
       [1., 0., 0., 0., 1.],
       [0., 1., 0., 1., 0.],
       [0., 1., 0., 0., 1.],
       [0., 0., 1., 1., 0.],
       [0., 0., 1., 0., 1.]])

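The above is very readable, but because it relies on a list comprehension rather than only native numpy methods, I assume it will be slow.

The following is the only way I have found that avoids a list comprehension: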

>>> np.stack(np.array_split(np.hstack(np.concatenate(x)), 6))
array([[1., 0., 0., 1., 0.],
       [1., 0., 0., 0., 1.],
       [0., 1., 0., 1., 0.],
       [0., 1., 0., 0., 1.],
       [0., 0., 1., 1., 0.],
       [0., 0., 1., 0., 1.]])

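But it is extremely convoluted. How could I come back to this in the future and understand what on earth is going on? It also requires the separate, initial itertools.product step, which I assume a more efficient native numpy approach would not need.

There must be a better way. What is the canonical way to construct the row-wise concatenated Cartesian product of two 2D arrays?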
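To make the one-liner easier to follow, it can be unpacked into named steps. This is only a sketch: the intermediate names (`cells`, `flat`, `pieces`, `result`) are made up for illustration, and it assumes `ident3` is passed to `itertools.product` first so that the columns match the desired output.

```python
import itertools
import numpy as np

ident2 = np.identity(2)
ident3 = np.identity(3)

# Object array of shape (6, 2): each cell holds one row (a 1D array).
# ident3 comes first so its columns end up on the left of the result.
x = np.array([*itertools.product(ident3, ident2)], dtype=object)

# Step 1: flatten the 6x2 grid of cells into 12 cells, pair by pair.
cells = np.concatenate(x)
# Step 2: join the 12 small arrays into one flat array of 30 numbers.
flat = np.hstack(cells)
# Step 3: cut the flat array into 6 equal pieces of 5 numbers each.
pieces = np.array_split(flat, len(x))
# Step 4: stack the pieces as the rows of the final 2D array.
result = np.stack(pieces)
```

Steps 3 and 4 together amount to `flat.reshape(len(x), -1)`, which may be easier to read.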


hpaulj 3 years ago

How about a mix of repeat and tile (which itself uses repeat):

In [75]: ident2 = np.identity(2)
    ...: ident3 = np.identity(3)
In [76]: np.repeat(ident3,repeats=2,axis=0)
Out[76]: 
array([[1., 0., 0.],
       [1., 0., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 1.]])
In [77]: np.tile(ident2,(3,1))
Out[77]: 
array([[1., 0.],
       [0., 1.],
       [1., 0.],
       [0., 1.],
       [1., 0.],
       [0., 1.]])
In [78]: np.hstack((__,_))
Out[78]: 
array([[1., 0., 0., 1., 0.],
       [1., 0., 0., 0., 1.],
       [0., 1., 0., 1., 0.],
       [0., 1., 0., 0., 1.],
       [0., 0., 1., 1., 0.],
       [0., 0., 1., 0., 1.]])
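As a quick sanity check, the repeat/tile result can be compared against a plain itertools.product reference. This is a minimal sketch; the names `vectorized` and `reference` are only illustrative.

```python
import itertools
import numpy as np

ident2 = np.identity(2)
ident3 = np.identity(3)

# Vectorized: repeat each row of ident3 once per row of ident2,
# and tile the whole of ident2 once per row of ident3.
vectorized = np.hstack((
    np.repeat(ident3, repeats=ident2.shape[0], axis=0),
    np.tile(ident2, (ident3.shape[0], 1)),
))

# Reference: concatenate every (row of ident3, row of ident2) pair.
reference = np.array(
    [np.concatenate(pair) for pair in itertools.product(ident3, ident2)]
)

assert np.array_equal(vectorized, reference)
```

The order matters: the array whose rows are repeated plays the role of the outer loop in itertools.product, and the tiled array plays the inner loop.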

Rick SilentGhost 3 years ago

Building on the accepted answer: for completeness, here is a more general solution.

import numpy as np


def row_by_row_concatenation_of_two_arrays(arr0, arr1):
    """Combine two arrays using row by row concatenation. 
    
    Example
    =======

    arr0:           arr1:
    [[1,2,3],       [[1,2],
    [4,5,6],        [3,4]]
    [7,8,9]]

    Into form of:
    
    [[1,2,3,1,2],
    [1,2,3,3,4],
    [4,5,6,1,2],
    [4,5,6,3,4],
    [7,8,9,1,2],
    [7,8,9,3,4]]
    """

    arr0_repeated = np.repeat(arr0, repeats=arr1.shape[0], axis=0)
    arr1_tiled = np.tile(arr1, (arr0.shape[0], 1))
    return np.hstack((arr0_repeated, arr1_tiled))
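The docstring example can be run directly as a check. The sketch below restates the function body under a shorter, hypothetical name (`concat_rows`) only so that the block is self-contained.

```python
import numpy as np


def concat_rows(arr0, arr1):
    # Same logic as the function above, restated for a standalone check.
    arr0_repeated = np.repeat(arr0, repeats=arr1.shape[0], axis=0)
    arr1_tiled = np.tile(arr1, (arr0.shape[0], 1))
    return np.hstack((arr0_repeated, arr1_tiled))


arr0 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr1 = np.array([[1, 2], [3, 4]])

expected = np.array([
    [1, 2, 3, 1, 2],
    [1, 2, 3, 3, 4],
    [4, 5, 6, 1, 2],
    [4, 5, 6, 3, 4],
    [7, 8, 9, 1, 2],
    [7, 8, 9, 3, 4],
])

assert np.array_equal(concat_rows(arr0, arr1), expected)
```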

Using this function with functools.reduce:

In [11]: from module import row_by_row_concatenation_of_two_arrays
In [12]: import functools
In [13]: x=((1.2,1.6),(0.5,0.5,0.5), (1,2))

In [14]: diags=[np.diag(group) for group in x]

In [15]: diags
Out[15]:
[array([[1.2, 0. ],
        [0. , 1.6]]),
 array([[0.5, 0. , 0. ],
        [0. , 0.5, 0. ],
        [0. , 0. , 0.5]]),
 array([[1, 0],
        [0, 2]])]
In [45]: functools.reduce(row_by_row_concatenation_of_two_arrays, diags)
Out[45]:
array([[1.2, 0. , 0.5, 0. , 0. , 1. , 0. ],
       [1.2, 0. , 0.5, 0. , 0. , 0. , 2. ],
       [1.2, 0. , 0. , 0.5, 0. , 1. , 0. ],
       [1.2, 0. , 0. , 0.5, 0. , 0. , 2. ],
       [1.2, 0. , 0. , 0. , 0.5, 1. , 0. ],
       [1.2, 0. , 0. , 0. , 0.5, 0. , 2. ],
       [0. , 1.6, 0.5, 0. , 0. , 1. , 0. ],
       [0. , 1.6, 0.5, 0. , 0. , 0. , 2. ],
       [0. , 1.6, 0. , 0.5, 0. , 1. , 0. ],
       [0. , 1.6, 0. , 0.5, 0. , 0. , 2. ],
       [0. , 1.6, 0. , 0. , 0.5, 1. , 0. ],
       [0. , 1.6, 0. , 0. , 0.5, 0. , 2. ]])

I suspect there is a faster way to do this for the general case, but this suits my purposes.
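One possible alternative for the general case is to build the row indices for all arrays at once with np.indices instead of reducing pairwise. This is only a sketch: `cartesian_rows` is a hypothetical name, not something from the answers above, and whether it is actually faster would need to be measured.

```python
import itertools
import numpy as np


def cartesian_rows(*arrays):
    """Row-wise Cartesian product of 2D arrays, concatenated horizontally."""
    arrays = [np.asarray(a) for a in arrays]
    row_counts = [a.shape[0] for a in arrays]
    # For each input, np.indices gives the row index used at every
    # position of the product grid. Flattening in C order makes the
    # last array vary fastest, the same order as itertools.product.
    grids = np.indices(row_counts).reshape(len(arrays), -1)
    return np.hstack([a[idx] for a, idx in zip(arrays, grids)])


diags = [np.diag(group) for group in ((1.2, 1.6), (0.5, 0.5, 0.5), (1, 2))]
result = cartesian_rows(*diags)

# Reference built the slow way, one row combination at a time.
reference = np.array(
    [np.concatenate(rows) for rows in itertools.product(*diags)]
)
assert np.array_equal(result, reference)
```

Each input is indexed exactly once, so no intermediate product is repeated and tiled again at every reduction step.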