
Reducing an n-dimensional array to a 2-D array with additional columns

  •  1
  • ameet chaubal  · Tech Community  · 6 years ago

    I have an n-dimensional array in numpy, along with n column vectors. I need to convert the n-dimensional array into a 2-D array with

    rows = size of n-dim array

    cols = n + 1

    To keep the example simple:

    a = np.random.randint(50, size=(2,2))
    r = np.array([0.2,1.9])
    c = np.array([4,5])
    a =>
    array([[45, 18], [ 4, 24]])
    c => array([4, 5])
    r => array([ 0.2,  1.9])
    

    I need to convert it into the following:

    array([[ 45. ,   4. ,   0.2],
       [ 18. ,   5. ,   0.2],
       [  4. ,   4. ,   1.9],
       [ 24. ,   5. ,   1.9]])
    

    I wrote the following. I don't think it is the best solution, but it works, and it seems fast enough for relatively large values:

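    def get_2d_array(arr, r, c):
        # arr has shape (len(r), len(c)); each row of arr becomes len(c) output rows.
        w = None
        for i in range(arr.shape[0]):
            # Column holding r[i], repeated once per element of row i.
            rv = np.full((arr[i].shape[0], 1), r[i])
            # Output columns: value, matching entry of c, row label.
            z = np.concatenate((arr[i].reshape(-1, 1), c.reshape(-1, 1), rv), axis=1)
            if w is None:
                w = z
            else:
                w = np.concatenate((w, z))
        return w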
    

    Is there another way to do this in numpy without loops?

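    Also, to generalize this: I actually have a 4-D array that I need to reduce to a 2-D array with a similar structure. I could not get a recursive function to work and ended up reducing the fourth and third dimensions explicitly, as follows: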

    def reduce_3d(arr3, row, col, third_dim_array):
        # Flatten each 2-D slice, then append a column holding that slice's label.
        # The original called self.reduce_2d here; it is assumed to be get_2d_array above.
        x = None
        for i in range(arr3.shape[0]):
            x1 = get_2d_array(arr3[i], row, col)
            third_array = np.full((x1.shape[0], 1), third_dim_array[i])
            x1 = np.concatenate((x1, third_array), axis=1)
            if x is None:
                x = x1
            else:
                x = np.concatenate((x, x1))
        return x

    def reduce_4d(air_temp, row, col, third, second):
        # Same pattern one level up: flatten each 3-D slice and append its label.
        w = None
        for j in range(air_temp.shape[0]):
            w1 = reduce_3d(air_temp[j], row, col, third)
            second_arr = np.full((w1.shape[0], 1), second[j])
            w1 = np.concatenate((w1, second_arr), axis=1)
            if w is None:
                w = w1
            else:
                w = np.concatenate((w, w1))
        return w
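
For reference, the two explicit functions above follow one pattern (flatten each slice along the first axis, then append that slice's label as a new column), so they can be folded into a single recursive function. This is only a sketch: `reduce_nd` is a hypothetical name, and it assumes the label vectors are passed in axis order, so `vecs[0]` labels axis 0 and `vecs[-1]` labels the last axis.

```python
import numpy as np

def reduce_nd(arr, vecs):
    """Recursive sketch; vecs[k] is assumed to label axis k of arr."""
    if arr.ndim == 1:
        # Base case: pair each value with the label of the last axis.
        return np.column_stack((arr, vecs[0]))
    blocks = []
    for i in range(arr.shape[0]):
        # Flatten the i-th slice with the remaining label vectors ...
        sub = reduce_nd(arr[i], vecs[1:])
        # ... then append a column filled with this slice's own label.
        label = np.full((sub.shape[0], 1), vecs[0][i])
        blocks.append(np.concatenate((sub, label), axis=1))
    return np.concatenate(blocks)

a = np.array([[45, 18], [4, 24]])
r = np.array([0.2, 1.9])
c = np.array([4, 5])
out = reduce_nd(a, [r, c])  # columns: value, c, r
```

It still loops in Python, so it addresses the generalization rather than the speed.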
    

    The output for a 4-D example looks like this:

    a = np.random.randint(100, size=(2,3,2,2))
    array([[[[ 8, 38],
         [89, 95]],
        [[63, 82],
         [24, 27]],
        [[22, 18],
         [25, 30]]],
       [[[94, 21],
         [83,  9]],
        [[25, 98],
         [84, 57]],
        [[89, 20],
         [40, 60]]]])
    
    r   Out[371]: array([ 0.2,  1.9])
    c   Out[372]: array([4, 5])
    third array([ 50, 100, 150])
    second array([[datetime.date(2009, 1, 1)],
       [datetime.date(2010, 5, 4)]], dtype=object)
    
    z = reduce_4d(a,r,c,third,second)
    z
    
    array([[8.0, 4.0, 0.2, 50.0, datetime.date(2009, 1, 1)],
       [38.0, 5.0, 0.2, 50.0, datetime.date(2009, 1, 1)],
       [89.0, 4.0, 1.9, 50.0, datetime.date(2009, 1, 1)],
       [95.0, 5.0, 1.9, 50.0, datetime.date(2009, 1, 1)],
       [63.0, 4.0, 0.2, 100.0, datetime.date(2009, 1, 1)],
       [82.0, 5.0, 0.2, 100.0, datetime.date(2009, 1, 1)],
       [24.0, 4.0, 1.9, 100.0, datetime.date(2009, 1, 1)],
       [27.0, 5.0, 1.9, 100.0, datetime.date(2009, 1, 1)],
       [22.0, 4.0, 0.2, 150.0, datetime.date(2009, 1, 1)],
       [18.0, 5.0, 0.2, 150.0, datetime.date(2009, 1, 1)],
       [25.0, 4.0, 1.9, 150.0, datetime.date(2009, 1, 1)],
       [30.0, 5.0, 1.9, 150.0, datetime.date(2009, 1, 1)],
       [94.0, 4.0, 0.2, 50.0, datetime.date(2010, 5, 4)],
       [21.0, 5.0, 0.2, 50.0, datetime.date(2010, 5, 4)],
       [83.0, 4.0, 1.9, 50.0, datetime.date(2010, 5, 4)],
       [9.0, 5.0, 1.9, 50.0, datetime.date(2010, 5, 4)],
       [25.0, 4.0, 0.2, 100.0, datetime.date(2010, 5, 4)],
       [98.0, 5.0, 0.2, 100.0, datetime.date(2010, 5, 4)],
       [84.0, 4.0, 1.9, 100.0, datetime.date(2010, 5, 4)],
       [57.0, 5.0, 1.9, 100.0, datetime.date(2010, 5, 4)],
       [89.0, 4.0, 0.2, 150.0, datetime.date(2010, 5, 4)],
       [20.0, 5.0, 0.2, 150.0, datetime.date(2010, 5, 4)],
       [40.0, 4.0, 1.9, 150.0, datetime.date(2010, 5, 4)],
       [60.0, 5.0, 1.9, 150.0, datetime.date(2010, 5, 4)]], dtype=object)
    
    z.shape ==> (24L, 5L)
    z.size => 120
    a.size ==> 24
    
    z.shape[0] == a.size
    z.shape[1] == a.ndim + 1
    

    Is there a better, more efficient way?

    Many thanks.

    2 replies  |  as of 6 years ago
        1
  •  1
  •   kuppern87    6 years ago

    Here is a solution that uses np.meshgrid to create the column combinations and np.vstack to stack them together:

    In [101]: a = np.array([[45, 18], [ 4, 24]])
    
    In [102]: col_vecs = [np.array([4, 5]), np.array([0.2, 1.9])]
    
    In [103]: np.vstack([np.ravel(a)] + [c.ravel() for c in np.meshgrid(*col_vecs)]).T
    Out[103]: 
    array([[45. ,  4. ,  0.2],
           [18. ,  5. ,  0.2],
           [ 4. ,  4. ,  1.9],
           [24. ,  5. ,  1.9]])
    

    The same approach works for higher dimensions.
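
One detail matters when going beyond two dimensions: np.meshgrid defaults to indexing="xy", which swaps the first two axes of its output. That happens to line up in the 2-D case above, but for 3-D and higher the grids no longer have the same shape as the data. A possible sketch of the N-D version, assuming the label vectors are given in axis order and using indexing="ij" (`flatten_with_labels` is a hypothetical helper name):

```python
import numpy as np

def flatten_with_labels(a, axis_vecs):
    """Sketch: axis_vecs[k] is assumed to hold one label per index along axis k of a."""
    # With indexing="ij" every grid has the same shape as a,
    # and grids[k][i0, i1, ...] == axis_vecs[k][ik].
    grids = np.meshgrid(*axis_vecs, indexing="ij")
    # Emit the last axis first to match the column order used in the question.
    cols = [a.ravel()] + [g.ravel() for g in reversed(grids)]
    return np.vstack(cols).T

a = np.arange(12).reshape(3, 2, 2)
third = np.array([50, 100, 150])
r = np.array([0.2, 1.9])
c = np.array([4, 5])
out = flatten_with_labels(a, [third, r, c])  # columns: value, c, r, third
```

Because every grid is raveled in the same row-major order as `a.ravel()`, row k of the result pairs the k-th element with the labels of its own indices.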

        2
  •  0
  •   ameet chaubal    6 years ago

    I stumbled upon another, less complicated approach and am just mentioning it here. Please correct it or add other options... thanks

    def reduce(a, dims):
        """
        Iterate over the dimensions of the array and progressively
        build the label columns through a combination of `tile` and `repeat`.
        :param a: the multi-dimensional input array
        :param dims: a sequence of feature vectors of shape (n,), ordered last axis first,
         i.e. the first element is the np array that corresponds to the
         last dimension of a
        :return: a 2-D array with a.size rows and a.ndim + 1 columns
        """
        item_count = a.size
        m_all = a.reshape((-1, 1))
        repeat_cnt = 1
        level = 0
        for i in range(a.ndim):
            if level == 0:
                repeat_cnt = 1
                level = -1
            else:
                repeat_cnt = a.shape[level] * repeat_cnt
                level = level - 1
            cur_array = dims[i]
            # Integer division: np.tile needs an integer count, and `/` yields a float in Python 3.
            tile_cnt = item_count // (cur_array.size * repeat_cnt)
            cur_col = np.tile(np.repeat(cur_array, repeat_cnt), tile_cnt).reshape((-1, 1))
            m_all = np.concatenate((m_all, cur_col), axis=1)
        return m_all
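
The repeat and tile counts in the function above can also be read straight off the array shape. The label for a given axis must be repeated once for every element spanned by the axes after it, and that pattern is then tiled once for every combination of the axes before it. A compact sketch of the same idea (`reduce_by_shape` is a hypothetical name; `dims` keeps the same last-axis-first order):

```python
import numpy as np

def reduce_by_shape(a, dims):
    """Sketch: dims[0] labels the last axis of a, dims[-1] labels axis 0."""
    cols = [a.reshape(-1, 1)]
    for i in range(a.ndim):
        axis = a.ndim - 1 - i                     # walk from the last axis to the first
        inner = int(np.prod(a.shape[axis + 1:]))  # elements covered by one step along this axis
        outer = int(np.prod(a.shape[:axis]))      # how often the whole pattern recurs
        col = np.tile(np.repeat(dims[i], inner), outer)
        cols.append(col.reshape(-1, 1))
    return np.concatenate(cols, axis=1)

a = np.array([[45, 18], [4, 24]])
r = np.array([0.2, 1.9])
c = np.array([4, 5])
out = reduce_by_shape(a, [c, r])  # columns: value, c, r
```

For the 2-D example, `c` is repeated once and tiled twice (4, 5, 4, 5), while `r` is repeated twice and tiled once (0.2, 0.2, 1.9, 1.9).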