
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor X has new levels

  •  6
  • Hadsga  · 7 years ago

     EW <- glm(everwrk~age_p + r_maritl, data = NH11, family = "binomial")
    

    Moreover, I want to predict everwrk for each level of r_maritl.

    r_maritl has the following levels:

    levels(NH11$r_maritl)
     "0 Under 14 years" 
     "1 Married - spouse in household" 
     "2 Married - spouse not in household"
     "3 Married - spouse in household unknown" 
     "4 Widowed"                               
     "5 Divorced"                             
     "6 Separated"                             
     "7 Never married"                        
     "8 Living with partner"  
     "9 Unknown marital status"  
    

    predEW <- with(NH11,
      expand.grid(r_maritl = c("0 Under 14 years", "1 Married - spouse in household",
                               "2 Married - spouse not in household",
                               "3 Married - spouse in household unknown",
                               "4 Widowed", "5 Divorced", "6 Separated", "7 Never married",
                               "8 Living with partner", "9 Unknown marital status"),
                  age_p = mean(age_p, na.rm = TRUE)))
    
    cbind(predEW, predict(EW, type = "response",
                            se.fit = TRUE, interval = "confidence",
                            newdata = predEW))
    

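    The problem is that I get the following response:

    Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) : factor r_maritl has new levels ... - spouse in household unknown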

    str(NH11$age_p)
    num [1:33014] 47 18 79 51 43 41 21 20 33 56 ...
    
    str(NH11$everwrk)
    Factor w/ 2 levels "2 No","1 Yes": NA NA 2 NA NA NA NA NA 2 2 ...
    
    str(NH11$r_maritl)
    Factor w/ 10 levels "0 Under 14 years",..: 6 8 5 7 2 2 8 8 8 2 ...
    
    2 answers
        1
  •  12
  •   Ben Bolker    7 years ago

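    tl;dr: it looks like your factor has some levels that are not represented in the data, and those levels are dropped from the factor used in the model. In hindsight this is not surprising, since you cannot predict responses for levels the model never saw. That said, it is mildly surprising that NA values are not set automatically for them. You can use levels(droplevels(NH11$r_maritl)) when constructing the prediction frame, or equivalently EW$xlevels$r_maritl.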
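    To see which levels are affected, one option is to compare the declared levels of the factor with the levels stored in the fitted model. The following is only a minimal sketch on simulated data; the names `dat`, `fit`, `y`, and `g` are made up for illustration and are not part of the question.

```r
# Simulated data; `dat`, `fit`, `y`, and `g` are illustrative names only.
set.seed(1)
dat <- data.frame(
  y = rbinom(200, size = 1, prob = 0.5),
  g = factor(sample(c("a", "b", "c"), 200, replace = TRUE),
             levels = c("a", "b", "c", "d"))  # "d" is declared but never observed
)
fit <- glm(y ~ g, data = dat, family = binomial)

levels(dat$g)                           # all declared levels, including "d"
fit$xlevels$g                           # levels the model actually kept
unused <- setdiff(levels(dat$g), fit$xlevels$g)
unused                                  # levels that would trigger the error
```

    Any level listed in `unused` has no observations in the fitted data, so asking predict() for it produces the "has new levels" error.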

    A reproducible example:

    maritl_levels <- c( "0 Under 14 years", "1 Married - spouse in household", 
      "2 Married - spouse not in household", "3 Married - spouse in household unknown", 
      "4 Widowed", "5 Divorced", "6 Separated", "7 Never married", "8 Living with partner", 
     "9 Unknown marital status")
    set.seed(101)
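    # factor() is explicit here: since R 4.0, data.frame() no longer converts strings to factors
    NH11 <- data.frame(everwrk=rbinom(1000,size=1,prob=0.5),
                     age_p=runif(1000,20,50),
                     r_maritl = factor(sample(maritl_levels,size=1000,replace=TRUE),
                                       levels=maritl_levels))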
    

    Let's make a missing level:

    NH11 <- subset(NH11,as.numeric(NH11$r_maritl) != 3)
    

    # Fitting works; predicting with the complete set of levels fails
    EW <- glm(everwrk~r_maritl+age_p,data=NH11,family=binomial)
    predEW <- with(NH11,
      expand.grid(r_maritl=levels(r_maritl),age_p=mean(age_p,na.rm=TRUE)))
    predict(EW,newdata=predEW)
    

    Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :

    # Build the grid from the levels the model actually used
    predEW <- with(NH11,
               expand.grid(r_maritl=EW$xlevels$r_maritl,age_p=mean(age_p,na.rm=TRUE)))
    predict(EW,newdata=predEW)
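    The question also asks for confidence intervals through `interval = "confidence"`. As far as I know, predict() for glm objects has no `interval` argument, so that value is passed through `...` and ignored. A common workaround, sketched here on simulated data with made-up names (`dat`, `fit`, `grid`, `y`, `age`, `g`) and an approximate normal-theory 95% interval, is to predict on the link scale and back-transform with plogis():

```r
# Simulated stand-in for the real data; all names below are illustrative.
set.seed(2)
dat <- data.frame(
  y   = rbinom(300, size = 1, prob = 0.6),
  age = runif(300, 20, 50),
  g   = factor(sample(c("a", "b", "c"), 300, replace = TRUE))
)
fit <- glm(y ~ age + g, data = dat, family = binomial)

# Prediction grid built from the levels stored in the fitted model
grid <- expand.grid(g = fit$xlevels$g, age = mean(dat$age, na.rm = TRUE))

# Predict on the link (log-odds) scale, then back-transform the interval
p <- predict(fit, newdata = grid, type = "link", se.fit = TRUE)
grid$fit   <- plogis(p$fit)
grid$lower <- plogis(p$fit - 1.96 * p$se.fit)
grid$upper <- plogis(p$fit + 1.96 * p$se.fit)
grid
```

    Building the interval on the link scale keeps both bounds inside [0, 1], which adding and subtracting standard errors on the response scale does not guarantee.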
    
        2
  •  0
  •   Nad Pat    3 years ago

    Thank you very much for your answer. I was facing the same problem with new levels.

    1. I am using the data.frame() and expand.grid() functions
    2. na.rm=TRUE along with the variable in the mean function
    3. glmoutput$xlevels$variablename

    And the solution works!
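    The three points above can be put together roughly as follows. This is a sketch on simulated data, not the poster's actual code; `dat`, `fit`, `newdat`, `y`, `age`, and `g` are hypothetical names, and some ages are set to NA to show why `na.rm = TRUE` matters.

```r
# Simulated data with an unobserved level "c" and some missing ages.
# All names here are illustrative only.
set.seed(3)
dat <- data.frame(
  y   = rbinom(300, size = 1, prob = 0.5),
  age = runif(300, 20, 50),
  g   = factor(sample(c("a", "b"), 300, replace = TRUE),
               levels = c("a", "b", "c"))
)
dat$age[1:10] <- NA
fit <- glm(y ~ age + g, data = dat, family = binomial)

mean(dat$age)                          # NA, because missing values propagate

newdat <- expand.grid(
  g   = fit$xlevels$g,                 # levels taken from the fitted model
  age = mean(dat$age, na.rm = TRUE)    # mean of the non-missing ages
)
prob <- predict(fit, newdata = newdat, type = "response")
cbind(newdat, prob)
```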