
How to match variable names in one data frame with the columns of another data frame for regressions?

  •  1
  • Laura  ·  6 years ago

    I have two data frames:

    x = data.frame(Var1= c("A", "B", "C", "D","E"),Var2=c("F","G","H","I","J"),
        Value= c(11, 12, 13, 14,18))
    
    y = data.frame(A= c(11, 12, 13, 14,18), B= c(15, 16, 17, 14,18),C= c(17, 22, 23, 24,18), D= c(11, 12, 13, 34,18),E= c(11, 5, 13, 55,18),  F= c(8, 12, 13, 14,18),G= c(7, 5, 13, 14,18),
        H= c(8, 12, 13, 14,18), I= c(9, 5, 13, 14,18), J= c(11, 12, 13, 14,18))
    
    Var3 <- rep("time", each=length(x$Var1))
    
    x=cbind(x,Var3)
    
    time=seq_len(nrow(y))
    y=cbind(y,time)
    

    > x
      Var1 Var2 Value Var3
    1    A    F    11 time
    2    B    G    12 time
    3    C    H    13 time
    4    D    I    14 time
    5    E    J    18 time
    > y
       A  B  C  D  E  F  G  H  I  J time
    1 11 15 17 11 11  8  7  8  9 11    1
    2 12 16 22 12  5 12  5 12  5 12    2
    3 13 17 23 13 13 13 13 13 13 13    3
    4 14 14 24 34 55 14 14 14 14 14    4
    5 18 18 18 18 18 18 18 18 18 18    5
    

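    For the first row of `x`, I have the variables `A` and `F`. I would like to take them from `y` and fit a simple regression, `lm(A ~ F, data = y)`, saving the result in the first position of a list. For the second row of `x`, I would do the same with `lm(B ~ G, data = y)`, and so on.

    How can I match the variable names in `x` with the data in `y` for the regressions?


    What about a more complex regression, such as `Var1 ~ Var2 + Var3`?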

    1 answer
  •  1
  • Zheyuan Li  ·  6 years ago
    x = data.frame(Var1= c("A", "B", "C", "D","E"),
                   Var2=c("F","G","H","I","J"),
                   Value= c(11, 12, 13, 14,18))
    
    y = data.frame(A= c(11, 12, 13, 14,18),
                   B= c(15, 16, 17, 14,18),
                   C= c(17, 22, 23, 24,18),
                   D= c(11, 12, 13, 34,18),
                   E= c(11, 5, 13, 55,18),
                   F= c(8, 12, 13, 14,18),
                   G= c(7, 5, 13, 14,18),
                   H= c(8, 12, 13, 14,18), 
                   I= c(9, 5, 13, 14,18),
                   J= c(11, 12, 13, 14,18))
    

    We can use the following:

    fitmodel <- function (RHS, LHS) do.call("lm", list(formula = reformulate(RHS, LHS),
                                                  data = quote(y)))
    
    modList <- Map(fitmodel, as.character(x$Var2), as.character(x$Var1))
    
    modList[[1]]  ## for example
    #Call:
    #lm(formula = A ~ F, data = y)
    #
    #Coefficients:
    #(Intercept)            F  
    #     4.3500       0.7115  
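
    The helper above relies on `reformulate` to turn character strings into a formula. A minimal sketch of that step on its own, using the variable names from the question (the object names `f1` and `f2` are only illustrative):

```r
# Build formulas from character strings with stats::reformulate().
# First argument: right-hand-side term labels; `response`: left-hand side.
f1 <- reformulate("F", response = "A")
f2 <- reformulate(c("F", "time"), response = "A")

print(f1)              # A ~ F
print(f2)              # A ~ F + time
print(all.vars(f2))    # variables referenced by the formula
```

    Because the result is an ordinary formula object, it can be passed to `lm` like any hand-written formula.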
    

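    Remarks:

    1. `do.call` ensures that `reformulate` is evaluated before its result is passed to `lm`. This is desirable because it lets functions such as `update` work on the fitted model; see "Showing string in formula and not as variable in lm fit". For comparison: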

      oo <- Map(function (RHS, LHS) lm(reformulate(RHS, LHS), data = y),
                as.character(x$Var2), as.character(x$Var1))
      oo[[1]]
      #Call:
      #lm(formula = reformulate(RHS, LHS), data = y)
      #
      #Coefficients:
      #(Intercept)            F  
      #     4.3500       0.7115  
      
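    2. The `as.character` calls on `x$Var1` and `x$Var2` are needed because these columns are factors, which `reformulate` cannot use. If you set `stringsAsFactors = FALSE` in `data.frame` when you build `x`, they are unnecessary.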
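
    To make the first remark concrete, here is a small sketch that inspects the call stored in each fitted model. It uses a cut-down two-column `y`; the names `fit_docall` and `fit_direct` are assumptions made for illustration.

```r
# Reduced version of the question's data (only columns A and F).
y <- data.frame(A = c(11, 12, 13, 14, 18), F = c(8, 12, 13, 14, 18))

# Formula is evaluated first, then handed to lm() through do.call().
fit_docall <- do.call("lm", list(formula = reformulate("F", "A"),
                                 data = quote(y)))

# Formula is built inside the lm() call, so the stored call keeps
# the unevaluated expression reformulate(RHS, LHS).
RHS <- "F"
LHS <- "A"
fit_direct <- lm(reformulate(RHS, LHS), data = y)

print(deparse(fit_docall$call$formula))  # "A ~ F"
print(deparse(fit_direct$call$formula))  # "reformulate(RHS, LHS)"

# The fitted coefficients are the same either way.
print(all.equal(coef(fit_docall), coef(fit_direct)))
```

    Only the recorded call differs, which matters for anything that later re-evaluates that call.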

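    Does this work for you? Shouldn't it have a "for" loop?

    The `Map` function hides the "for" loop. It is a wrapper around `mapply`; the `*apply` family of functions in R is syntactic sugar for such loops.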
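
    As a rough illustration of that equivalence, the sketch below fits the same models once with `Map` and once with an explicit loop. It uses a two-row subset of the question's data, and the names `mod_map` and `mod_loop` are hypothetical.

```r
# Two-row subset of the question's data, enough to compare both styles.
x <- data.frame(Var1 = c("A", "B"), Var2 = c("F", "G"))
y <- data.frame(A = c(11, 12, 13, 14, 18), B = c(15, 16, 17, 14, 18),
                F = c(8, 12, 13, 14, 18),  G = c(7, 5, 13, 14, 18))

fitmodel <- function(RHS, LHS) {
  do.call("lm", list(formula = reformulate(RHS, LHS), data = quote(y)))
}

# Map() version, as in the answer.
mod_map <- Map(fitmodel, as.character(x$Var2), as.character(x$Var1))

# Explicit "for" loop doing the same work, one row of x at a time.
mod_loop <- vector("list", nrow(x))
for (i in seq_len(nrow(x))) {
  mod_loop[[i]] <- fitmodel(as.character(x$Var2[i]), as.character(x$Var1[i]))
}

# Both approaches give the same coefficients.
print(all.equal(unname(lapply(mod_map, coef)), lapply(mod_loop, coef)))
```

    The loop needs the result list to be preallocated and indexed by hand, which is the bookkeeping `Map` takes care of.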


    Your original question constructs a model formula `Var1 ~ Var2`.

    Your new question wants `Var1 ~ Var2 + Var3`.

    x$Var3 <- rep("time", times = nrow(x))
    y$time <- seq_len(nrow(y))
    
    ## collect multiple RHS variables (using concatenation function `c`)
    RHS <- Map(base::c, as.character(x$Var2), as.character(x$Var3))
    #str(RHS)
    #List of 5  ## oh this list has names! annoying!!
    # $ F: chr [1:2] "F" "time"
    # $ G: chr [1:2] "G" "time"
    # $ H: chr [1:2] "H" "time"
    # $ I: chr [1:2] "I" "time"
    # $ J: chr [1:2] "J" "time"
    LHS <- as.character(x$Var1)
    modList <- Map(fitmodel, RHS, LHS)  ## `fitmodel` function unchanged
    modList[[1]]  ## for example
    #Call:
    #lm(formula = A ~ F + time, data = y)
    #
    #Coefficients:
    #(Intercept)            F         time  
    #        5.6          0.5          0.5
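
    If the automatic list names are unwanted, one possible follow-up is to drop them and name each model by its response variable instead. This is only a sketch on a two-row subset of the data; naming by `LHS` is an assumption about what is convenient, not something the question asks for.

```r
# Two-row subset of the question's data, with the extra "time" predictor.
x <- data.frame(Var1 = c("A", "B"), Var2 = c("F", "G"), Var3 = "time")
y <- data.frame(A = c(11, 12, 13, 14, 18), B = c(15, 16, 17, 14, 18),
                F = c(8, 12, 13, 14, 18),  G = c(7, 5, 13, 14, 18),
                time = 1:5)

fitmodel <- function(RHS, LHS) {
  do.call("lm", list(formula = reformulate(RHS, LHS), data = quote(y)))
}

# One character vector of predictors per row of x, without list names.
RHS <- unname(Map(base::c, as.character(x$Var2), as.character(x$Var3)))
LHS <- as.character(x$Var1)

modList <- Map(fitmodel, RHS, LHS)
names(modList) <- LHS          # name each model by its response variable

# Coefficients of every model, accessible by response name.
coefList <- lapply(modList, coef)
print(coefList$A)
```

    Each element of `coefList` is a named numeric vector, so individual estimates can be pulled out with, for example, `coefList$A["time"]`.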