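To make your classifier compatible with Google Cloud Machine Learning Engine (CMLE), you need to separate the preprocessor and the LogisticRegression classifier from the pipeline. The preprocessing logic then runs on the client side, and the standalone classifier is hosted on CMLE.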
...
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
# The preprocessor is kept outside the model and runs on the client side
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])
# Standalone classifier: the only object that gets deployed to CMLE
model = LogisticRegression(solver='lbfgs')
# Fit the preprocessor on the training data only, then train on its output
X_train_transformed = preprocessor.fit_transform(X_train)
model.fit(X_train_transformed, y_train)
print("model score: %.3f" % model.score(preprocessor.transform(X_test), y_test))
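You can then export the model (using pickle or joblib) and deploy it on CMLE. When constructing the JSON request to CMLE for prediction, first preprocess the dataframe into a 2D array using:
preprocessor.transform(X_test)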
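As a rough sketch of the export and request steps, the two helpers below use only the standard library. The names `export_model` and `build_request_body` are hypothetical and not part of scikit-learn or CMLE. The sketch assumes that CMLE expects the pickled file to be named `model.pkl` and that the online prediction body has the form `{"instances": [...]}`; check both against the current CMLE documentation before relying on them.

```python
import json
import os
import pickle


def export_model(model, export_dir):
    """Pickle a fitted estimator as model.pkl inside export_dir (hypothetical helper)."""
    os.makedirs(export_dir, exist_ok=True)
    # Assumption: CMLE looks for a file with this exact name.
    path = os.path.join(export_dir, "model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path


def build_request_body(transformed):
    """Turn preprocessor output into a JSON string of the assumed form
    {"instances": [[...], [...]]} (hypothetical helper)."""
    # A ColumnTransformer can return a scipy sparse matrix; densify it first.
    if hasattr(transformed, "toarray"):
        transformed = transformed.toarray()
    # NumPy arrays expose tolist(); plain nested lists pass through unchanged.
    if hasattr(transformed, "tolist"):
        transformed = transformed.tolist()
    # Cast to built-in floats so that json can serialize every value.
    rows = [[float(value) for value in row] for row in transformed]
    return json.dumps({"instances": rows})


# Intended usage with the objects defined in the answer:
#   export_model(model, "export")
#   body = build_request_body(preprocessor.transform(X_test))
```

The densifying step matters when the categorical transformer includes a one-hot encoder, since its output is often sparse and cannot be serialized to JSON directly.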