
How do I show the original feature names in a feature importance plot?

  •  0
  • ScalaBoy  · Tech Community  · 6 years ago

    y = XY.DELAY_MIN
    X = standardized_df
    
    train_X, test_X, train_y, test_y = train_test_split(X.as_matrix(), y.as_matrix(), test_size=0.25)
    
    my_imputer = preprocessing.Imputer()
    train_X = my_imputer.fit_transform(train_X)
    test_X = my_imputer.transform(test_X)
    
    xgb_model = XGBRegressor()
    
    # Add silent=True to avoid printing out updates with each cycle
    xgb_model = XGBRegressor(n_estimators=1000, learning_rate=0.05)
    xgb_model.fit(train_X, train_y, early_stopping_rounds=5, 
                 eval_set=[(test_X, test_y)], verbose=False)
    

    When I create the feature importance plot, the feature names are shown as "f1", "f2", etc. How can I show the original feature names?

    fig, ax = plt.subplots(figsize=(12,18))
    xgb.plot_importance(xgb_model, max_num_features=30, height=0.8, ax=ax)
    plt.show()
    
    1 answer
  •  2
  •   Mischa Lisovyi    6 years ago

    The problem is that Imputer does not return a pd.DataFrame from transform(); it returns a NumPy array, so your column names are lost when you do:

    train_X = my_imputer.fit_transform(train_X)
    test_X = my_imputer.transform(test_X)

    You can fix this by wrapping the result back in a DataFrame with the original columns. This only works if train_X and test_X are still DataFrames at that point, so pass X and y to train_test_split directly rather than X.as_matrix() and y.as_matrix():
    

    train_X = pd.DataFrame(my_imputer.fit_transform(train_X), columns=train_X.columns)
    test_X  = pd.DataFrame(my_imputer.transform(test_X), columns=test_X.columns)
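
    For reference, here is a minimal self-contained sketch of the same idea on current library versions. It assumes sklearn.impute.SimpleImputer as the replacement for preprocessing.Imputer (which was removed from newer scikit-learn releases), and the small DataFrame with its column names "distance" and "dep_hour" is made up purely for illustration; it stands in for standardized_df and XY.DELAY_MIN from the question.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data for standardized_df / XY.DELAY_MIN from the question.
X = pd.DataFrame({
    "distance": [100.0, np.nan, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0],
    "dep_hour": [6.0, 9.0, np.nan, 15.0, 18.0, 21.0, 8.0, 12.0],
})
y = pd.Series([5.0, 0.0, 12.0, 3.0, 20.0, 7.0, 1.0, 9.0], name="DELAY_MIN")

# Split the DataFrame itself (no .as_matrix() / .to_numpy()),
# so the resulting splits still carry their column labels.
train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.25, random_state=0
)

imputer = SimpleImputer()  # fills missing values with the column mean by default

# The imputer's output may be a plain NumPy array, so wrap it back into a
# DataFrame using the labels and index of the split it came from.
train_X = pd.DataFrame(
    imputer.fit_transform(train_X), columns=train_X.columns, index=train_X.index
)
test_X = pd.DataFrame(
    imputer.transform(test_X), columns=test_X.columns, index=test_X.index
)

print(list(train_X.columns))
```

    If the model is then fitted on these DataFrames instead of on arrays, XGBoost should pick up the column names, and xgb.plot_importance should label the bars with them rather than with "f0", "f1", and so on.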