I am working from this example:
support vector machine train caret error kernlab class probability calculations failed; returning NAs
Sample code:
library(caret)
trainset <- data.frame(
  class = factor(c("Good", "Bad", "Good", "Good", "Bad", "Good",
                   "Good", "Good", "Good", "Bad", "Bad", "Bad")),
  age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))
testset <- data.frame(
  class = factor(c("Good", "Bad", "Good")),
  age = c(64, 23, 50))
library(kernlab)
set.seed(231)
### estimate a reasonable range of values for the sigma tuning parameter
sigDist <- sigest(class ~ ., data = trainset, frac = 1)
### create a grid of the two tuning parameters: .sigma is fixed at the first
### value returned by sigest() above, and we search for the best value of .C
svmTuneGrid <- data.frame(.sigma = sigDist[1], .C = 2^(-2:7))
set.seed(1056)
svmFit <- train(class ~ .,
                data = trainset,
                method = "svmRadial",
                preProc = c("center", "scale"),
                tuneGrid = svmTuneGrid,
                trControl = trainControl(method = "repeatedcv", repeats = 5,
                                         classProbs = TRUE))
predictedClasses <- predict(svmFit, testset)
predictedProbs <- predict(svmFit, newdata = testset, type = "prob")
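As a side note on the tuning grid used above, here is a minimal sketch of what it looks like. The sigma value is a made-up placeholder (`sigma_assumed` and `grid_sketch` are hypothetical names), not the value `sigest()` returns for this data; only base R is needed.

```r
# Hypothetical sigma, standing in for sigDist[1]; not a result computed from trainset.
sigma_assumed <- 0.5

# Same construction as svmTuneGrid: one fixed sigma paired with ten candidate costs.
grid_sketch <- data.frame(.sigma = sigma_assumed, .C = 2^(-2:7))

print(grid_sketch)
nrow(grid_sketch)       # number of candidate (sigma, C) pairs in the grid
range(grid_sketch$.C)   # smallest and largest cost values
```

Since sigma is held constant, only the cost parameter is actually being tuned.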
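With the formula interface, this code runs fine. However, if I switch to the non-formula (x/y) interface, the class probabilities are not computed at prediction time and NAs are returned instead. See below.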
set.seed(1056)
svmFit <- train(x = trainset["age"], y = trainset$class,
                method = "svmRadial",
                preProc = c("center", "scale"),
                tuneGrid = svmTuneGrid,
                trControl = trainControl(method = "repeatedcv", repeats = 5,
                                         classProbs = TRUE))
predictedProbs <- predict(svmFit, newdata = testset, type = "prob")
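I am just trying to understand why it does not compute the probabilities for the prediction data set when the non-formula interface is used. This warning is thrown: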
Warning message:
In method$prob(modelFit = modelFit, newdata = newdata, submodels = param) :
kernlab class probability calculations failed; returning NAs
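One difference between the two calls that may be worth ruling out (a guess on my part, not a confirmed cause): the formula interface picks the predictors out of newdata through the formula, whereas the x/y call is handed testset with the class column still in it. A minimal base R sketch of that check, where `x_cols`, `extra_cols` and `testset_x` are names made up for illustration:

```r
# Same test set as in the question.
testset <- data.frame(
  class = factor(c("Good", "Bad", "Good")),
  age = c(64, 23, 50))

# Predictor columns that were passed to train(x = ...) in the non-formula call.
x_cols <- "age"

# Columns present in testset that were never supplied as predictors.
extra_cols <- setdiff(names(testset), x_cols)
print(extra_cols)

# Predictors-only version of the test set, kept as a data frame.
testset_x <- testset[x_cols]
str(testset_x)

# Assumption still to be verified against the fitted model:
# predict(svmFit, newdata = testset_x, type = "prob")
```

If the warning goes away with the predictors-only data frame, the extra column would be the likely culprit; if not, the cause lies elsewhere.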