
The axis argument to unique is not supported for dtype object

  •  0
  • PixelPioneer  · Tech Community  · 6 years ago

    I am trying to get unique counts per column, but my array contains categorical variables (dtype object).

    val, count = np.unique(x, axis=1, return_counts=True)
    

    TypeError: The axis argument to unique is not supported for dtype object
    

    How can I work around this?

    Sample x:

    array([[' Private', ' HS-grad', ' Divorced'],
           [' Private', ' 11th', ' Married-civ-spouse'],
           [' Private', ' Bachelors', ' Married-civ-spouse'],
           [' Private', ' Masters', ' Married-civ-spouse'],
           [' Private', ' 9th', ' Married-spouse-absent'],
           [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
           [' Private', ' Masters', ' Never-married'],
           [' Private', ' Bachelors', ' Married-civ-spouse'],
           [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)
    

    for x_T in x.T:
        val, count = np.unique(x_T, return_counts=True)
        print(val, count)
    
    
    [' Private' ' Self-emp-not-inc'] [8 1]
    [' 11th' ' 9th' ' Bachelors' ' HS-grad' ' Masters' ' Some-college'] [1 1 2 2 2 1]
    [' Divorced' ' Married-civ-spouse' ' Married-spouse-absent'
     ' Never-married'] [1 6 1 1]
    
    1 answer  |  as of 6 years ago
  •  2
  •   Lukas Humpe    6 years ago

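    You can use itemfreq. Even though the output looks different from yours (it counts values across the whole array rather than per column), it provides the desired counts. Note that scipy.stats.itemfreq was deprecated in SciPy 1.0 and is no longer available in recent releases: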

    import numpy as np
    from scipy.stats import itemfreq
    
    x = np.array([[' Private', ' HS-grad', ' Divorced'],
           [' Private', ' 11th', ' Married-civ-spouse'],
           [' Private', ' Bachelors', ' Married-civ-spouse'],
           [' Private', ' Masters', ' Married-civ-spouse'],
           [' Private', ' 9th', ' Married-spouse-absent'],
           [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
           [' Private', ' Masters', ' Never-married'],
           [' Private', ' Bachelors', ' Married-civ-spouse'],
           [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)
    
    itemfreq(x)
    

    Output:

    array([[' 11th', 1],
           [' 9th', 1],
           [' Bachelors', 2],
           [' Divorced', 1],
           [' HS-grad', 2],
           [' Married-civ-spouse', 6],
           [' Married-spouse-absent', 1],
           [' Masters', 2],
           [' Never-married', 1],
           [' Private', 8],
           [' Self-emp-not-inc', 1],
           [' Some-college', 1]], dtype=object)
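
Because itemfreq is not available in recent SciPy releases, the same whole-array counts can be sketched with plain NumPy. This is a minimal sketch, not part of the original answer; the names `values`, `counts`, and `overall` are chosen here for illustration. It relies on np.unique accepting object arrays when no axis is given.

```python
import numpy as np

x = np.array([[' Private', ' HS-grad', ' Divorced'],
              [' Private', ' 11th', ' Married-civ-spouse'],
              [' Private', ' Bachelors', ' Married-civ-spouse'],
              [' Private', ' Masters', ' Married-civ-spouse'],
              [' Private', ' 9th', ' Married-spouse-absent'],
              [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
              [' Private', ' Masters', ' Never-married'],
              [' Private', ' Bachelors', ' Married-civ-spouse'],
              [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)

# Flatten the array and count every distinct value once; without an axis
# argument np.unique only has to sort the values, which works for strings
# stored in an object array.
values, counts = np.unique(x.ravel(), return_counts=True)

# Collect the result in a plain dict (illustrative name) for easy lookup.
overall = dict(zip(values.tolist(), counts.tolist()))

for value, count in overall.items():
    print(repr(value), count)
```

As with itemfreq, the counts here are pooled over all columns, so a value that appeared in two different columns would be counted together.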
    

    Otherwise, you can try casting to another dtype, for example:

    val, count = np.unique(x.astype("<U22"), axis=1, return_counts=True)
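
It may help to see what the cast actually buys you. The sketch below is an illustration added here, not the answerer's code, and the names `x_str`, `unique_cols`, `col_counts`, and `per_column` are hypothetical. It shows that after the cast, axis=1 compares whole columns with each other, so it does not produce per-column value counts. For those, one option is still to call np.unique once per column.

```python
import numpy as np

x = np.array([[' Private', ' HS-grad', ' Divorced'],
              [' Private', ' 11th', ' Married-civ-spouse'],
              [' Private', ' Bachelors', ' Married-civ-spouse'],
              [' Private', ' Masters', ' Married-civ-spouse'],
              [' Private', ' 9th', ' Married-spouse-absent'],
              [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
              [' Private', ' Masters', ' Never-married'],
              [' Private', ' Bachelors', ' Married-civ-spouse'],
              [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)

# "<U22" is a fixed-width unicode dtype; 22 characters is enough for the
# longest string in this sample (' Married-spouse-absent').
x_str = x.astype("<U22")

# With axis=1, each column is treated as a single item: the result holds the
# distinct columns and how many times each whole column occurs.
unique_cols, col_counts = np.unique(x_str, axis=1, return_counts=True)
print(unique_cols.shape, col_counts)

# Per-column value counts: run np.unique on every column separately.
per_column = {}
for j in range(x_str.shape[1]):
    vals, cnts = np.unique(x_str[:, j], return_counts=True)
    per_column[j] = dict(zip(vals.tolist(), cnts.tolist()))

for j, freq in per_column.items():
    print(j, freq)
```

In this sample all three columns differ, so the axis=1 call reports each column once, while the loop reproduces the per-column counts shown in the question.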