
Python/Sklearn - IndexError: index out of bounds

  • 0xgareth  · 7 years ago

    Here is my code:

    filename = 'train4.csv'
    names = ['attribut names are here']
    dataframe = read_csv(filename, names=names)
    array = dataframe.values
    X = array[:,0:47]
    Y = array[:,47]
    num_folds = 10
    kfold = KFold(n_splits=10, random_state=7)
    model = KNeighborsClassifier()
    results = cross_val_score(model, X, Y, cv=kfold)
    print(results.mean())
    

    I get this error:

    IndexError                                Traceback (most recent call last)
    <ipython-input-19-8d9596c3368b> in <module>()
          4 array = dataframe.values
          5 X = array[:,0:47]
    ----> 6 Y = array[:,47]
          7 num_folds = 10
          8 kfold = KFold(n_splits=10, random_state=7)
    
    IndexError: index 47 is out of bounds for axis 1 with size 47
    

    In my CSV, the 47th attribute is the target label, hence 48 columns (or am I wrong?).

    I am running pandas/sklearn in a Jupyter notebook.

    Thanks

    1 Answer
  • MaxU - stand with Ukraine  · 7 years ago

    Try this:

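    import pandas as pd
    from sklearn.model_selection import KFold, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    filename = 'train4.csv'
    names = ['attribute names are here']  # placeholder: one name per CSV column
    target_col_name = 'name_of_your_target_column'  # must be one of `names`

    df = pd.read_csv(filename, names=names)

    num_folds = 10
    # Recent scikit-learn versions require shuffle=True when random_state is set.
    kfold = KFold(n_splits=num_folds, shuffle=True, random_state=7)
    model = KNeighborsClassifier()
    # Select features and target by column name instead of by position.
    results = cross_val_score(model,
                              df.drop(target_col_name, axis=1),
                              df[target_col_name],
                              cv=kfold)
    print(results.mean())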
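
    As for why the original code fails: the error message says axis 1 has size 47, so the loaded array has 47 columns. Column indices start at 0, so the valid indices are 0 to 46 and index 47 does not exist. The sketch below reproduces this on a made-up 5 x 47 frame (a hypothetical stand-in for train4.csv, whose real contents are not shown in the question) and then splits features from the label relative to the last column. It assumes the label is the last column, which you should verify against your own file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train4.csv: 46 feature columns plus 1 label
# column, i.e. 47 columns in total, matching "size 47" in the error message.
n_rows, n_cols = 5, 47
df = pd.DataFrame(np.arange(n_rows * n_cols).reshape(n_rows, n_cols))

array = df.values
print(array.shape)  # (5, 47): valid column indices are 0..46

# Reproduce the error from the question: column index 47 does not exist.
try:
    array[:, 47]
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)

# Split relative to the end of the frame, so the code does not depend on
# knowing the exact column count. Assumes the label is the last column.
X = df.iloc[:, :-1].values  # every column except the last (46 features)
Y = df.iloc[:, -1].values   # the last column (the label)
print(X.shape, Y.shape)  # (5, 46) (5,)
```

    Printing `dataframe.shape` right after `read_csv` is a quick way to check how many columns were actually loaded before slicing by position.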