
Converting cross_validation code to model_selection

  •  0
  • De Gninou  · Tech Community  · 6 years ago

    In 2016, I ran a Lasso regression model using the code below:

    #Import required packages 
    import pandas as pd
    import numpy as np
    import matplotlib as mpl
    import matplotlib.pylab as plt
    import matplotlib.pyplot as plp
    import seaborn as sns
    import statsmodels.formula.api as smf
    from scipy import stats
    from sklearn.cross_validation import train_test_split
    from sklearn.linear_model import LassoLarsCV
    
    # split data into train and test sets
    pred_train, pred_test, tar_train, tar_test = train_test_split(predictors, target, test_size=.4, random_state=123)
    #%
    # specify the lasso regression model
    model=LassoLarsCV(cv=10, precompute=False).fit(pred_train,tar_train)
    #%
    # print variable names and regression coefficients
    dict(zip(predictors.columns, model.coef_))
    #regcoef.to_csv('variable+regresscoef.csv')
    #%%
    # plot coefficient progression
    m_log_alphas = -np.log10(model.alphas_)
    ax = plt.gca()
    plt.plot(m_log_alphas, model.coef_path_.T)
    plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k',
                label='alpha CV')
    plt.ylabel('Regression Coefficients')
    plt.xlabel('-log(alpha)')
    plt.title('Regression Coefficients Progression for Lasso Paths')
    #%
    # plot mean square error for each fold
    m_log_alphascv = -np.log10(model.cv_alphas_)
    plt.figure()
    plt.plot(m_log_alphascv, model.cv_mse_path_, ':')
    plt.plot(m_log_alphascv, model.cv_mse_path_.mean(axis=-1), 'k',
             label='Average across the folds', linewidth=2)
    plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k',
                label='alpha CV')
    plt.legend()
    plt.xlabel('-log(alpha)')
    plt.ylabel('Mean squared error')
    plt.title('Mean squared error on each fold')
    #%       
    # MSE from training and test data
    from sklearn.metrics import mean_squared_error
    train_error = mean_squared_error(tar_train, model.predict(pred_train))
    test_error = mean_squared_error(tar_test, model.predict(pred_test))
    print ('training data MSE')
    print(train_error)
    print ('test data MSE')
    print(test_error)
    #%
    # R-square from training and test data
    rsquared_train=model.score(pred_train,tar_train)
    rsquared_test=model.score(pred_test,tar_test)
    print ('training data R-square')
    print(rsquared_train)
    print ('test data R-square')
    print(rsquared_test)
    

    Now I want to run it again, and I get the following warning:

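    DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved.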

    How can I change this code to use model_selection?

    1 answer  |  as of 6 years ago
  •  2
  •   Vivek Kumar    6 years ago

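    The only thing your code imports from cross_validation is train_test_split, and that function now lives in model_selection. Change this import:

    from sklearn.cross_validation import train_test_split

    to:

    from sklearn.model_selection import train_test_split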
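
    For context, here is a minimal sketch of how the core of the question's pipeline might look with the model_selection import. The synthetic `predictors` and `target` arrays are stand-ins I made up so the snippet runs on its own; the question's real data is not shown. One related point that is my own addition, not part of the answer above: as far as I know, the `cv_mse_path_` attribute used in the question's plotting code was renamed to `mse_path_` in the same 0.18 release, so the sketch looks for the new name first and falls back to the old one.

```python
import numpy as np
from sklearn.linear_model import LassoLarsCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split  # was: sklearn.cross_validation

# Synthetic stand-in for the question's `predictors` and `target` (assumption).
rng = np.random.RandomState(123)
predictors = rng.normal(size=(200, 5))
true_coef = np.array([3.0, 0.0, -2.0, 0.0, 1.0])
target = predictors @ true_coef + rng.normal(scale=0.5, size=200)

# Same call as in the question; only the import location changed.
pred_train, pred_test, tar_train, tar_test = train_test_split(
    predictors, target, test_size=0.4, random_state=123
)

# Fit the lasso model with 10-fold cross-validation, as in the question.
model = LassoLarsCV(cv=10, precompute=False).fit(pred_train, tar_train)

# Per-fold CV errors: prefer the newer attribute name, fall back to the old one.
mse_path = getattr(model, "mse_path_", None)
if mse_path is None:
    mse_path = model.cv_mse_path_

test_error = mean_squared_error(tar_test, model.predict(pred_test))
test_r2 = model.score(pred_test, tar_test)
print("test data MSE:", test_error)
print("test data R-square:", test_r2)
```

    If the plots in the question raise an AttributeError on a recent scikit-learn release, replacing `model.cv_mse_path_` with `model.mse_path_` in the plotting section is the first thing I would try.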