
How to find max values (regression peak) using R in multivariate polynomial regression?

lm r
  • SidC  · 6 years ago

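    My SQL Server 2017 database has a table that contains, in part, the following data (image not reproduced here):

    My goal is to create a multivariate polynomial regression for each of the 19 columns, where LikingOrder is my dependent variable and each of the 19 column values for a given RespID is an independent variable.

    The end result should be the highest regression value for each of columns C1 through C19 for each RespID. The end result should look like this (image not reproduced here):

    I have read about polym and tried to use it in the script below: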


    ALTER PROCEDURE [dbo].[spRegressionPeak]   
    @StudyID int
    AS
    BEGIN
    Declare @sStudyID VARCHAR(50)
    Set @sStudyID = CONVERT(VARCHAR(50),@StudyID)
    
    --We use IsNull to pass zeroes where an average wasn't calculated, so
    --that the polynomial regression can be calculated.
    DECLARE @inquery  AS NVARCHAR(MAX) = '
        Select
        c.StudyID, c.RespID, c.LikingOrder,
        avg(C1) as C1, avg(C2) as C2, avg(C3) as C3, avg(C4) as C4, avg(C5) as C5,
        avg(C6) as C6, avg(C7) as C7, avg(C8) as C8, avg(C9) as C9, avg(C10) as C10,
        avg(C11) as C11, avg(C12) as C12, avg(C13) as C13, avg(C14) as C14, avg(C15) as C15,
        avg(C16) as C16, avg(C17) as C17, avg(isnull(C18,0)) as C18, avg(C19) as C19
        from ClosedStudyResponses c
        where c.StudyID = @StudyID
        group by StudyID, RespID, LikingOrder
        order by RespID'
    
    --We are setting @inquery aka InputDataSet to be our initial dataset.
    --R Services requires that a data.frame be passed to any calculations being
    --generated.  As such, df is simply data framing the @inquery data.
    --The res object holds the polynomial regression results by RespondentID and
    --LikingOrder for each of the averages in the @inquery resultset.
    EXEC sp_execute_external_script @language = N'R'
    , @script = N'
        studymeans <- InputDataSet
    
        df <- data.frame(studymeans) 
    
        res1 <- lm(df$LikingOrder ~ polym(df$c1, df$c2, df$c3, df$c4, df$c5, df$c6, df$c7, df$c8, df$c9, 
        df$c10, df$c11, df$c12, df$c13, df$c14, df$c15, df$c16, df$c17, df$c18, df$c19, degree = 1, raw = TRUE)) 
        res <- data.frame(res1)
    
    '
    , @input_data_1 = @inquery
    , @output_data_1_name = N'res'
    , @params = N'@StudyID int'
    ,@StudyID = @StudyID 
    --- Edit this line to handle the output data frame.
    WITH RESULT SETS ((RespID int, res varchar(max)));
    END;
    

    Running the procedure produces the following error:

    Error in model.frame.default(formula = df$LikingOrder ~ polym(df$c1, df$c2,  
    : 
    variable lengths differ (found for 'polym(df$c1, df$c2, df$c3, df$c4, df$c5, 
    df$c6, df$c7, df$c8, df$c9, df$c10, df$c11, df$c12, df$c13, df$c14, df$c15, 
    df$c16, df$c17, df$c18, df$c19, degree = 1, raw = TRUE)')
    Calls: source ... lm -> eval -> eval -> <Anonymous> -> model.frame.default
    In addition: There were 19 warnings (use warnings() to see them)
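
Two things in the R script are worth checking; this is a hedged reading of the code, not a confirmed diagnosis of the error. First, R column names are case-sensitive, so if InputDataSet carries the query aliases C1..C19, an expression such as df$c1 evaluates to NULL. Second, data.frame() cannot coerce an lm object. The minimal sketch below uses a made-up stand-in for InputDataSet with only three predictors (the values, and the output column names `term` and `estimate`, are assumptions for illustration).

```r
# Hypothetical stand-in for InputDataSet: the column names mirror the
# query aliases (C1..C3 here; the real query returns C1..C19).
set.seed(1)
df <- data.frame(
  RespID      = rep(c(117, 119), each = 50),
  LikingOrder = sample(1:9, 100, replace = TRUE),
  C1 = runif(100, 0, 9),
  C2 = runif(100, 0, 9),
  C3 = runif(100, 0, 9)
)

# Column names are case-sensitive: there is no lowercase "c1" column.
is.null(df$c1)   # TRUE
is.null(df$C1)   # FALSE

# Name the columns as they appear in the data frame and pass data = df,
# so every term in the formula is looked up in the same data frame.
fit <- lm(LikingOrder ~ polym(C1, C2, C3, degree = 2, raw = TRUE), data = df)

# An lm object cannot be coerced by data.frame(); pull the coefficients
# out explicitly instead.
res <- data.frame(term = names(coef(fit)),
                  estimate = unname(coef(fit)),
                  stringsAsFactors = FALSE)
```

With this shape, `res` is a plain two-column data frame, so the WITH RESULT SETS clause of the procedure would need to describe those two columns rather than (RespID, res).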
    

    1 answer
  • Nilesh Ingle    6 years ago
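  • StudyID: the study identifier.
  • LinkingOrder: the response variable.
  • C1 to C19: the independent variables.

  • Objective: a linear fit of LinkingOrder on C1 to C19, done separately for each RespID.

  • Note: the code below is a linear fit, not a polynomial fit.
  • Resource: ISLR (An Introduction to Statistical Learning).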

    StudyID <- rep(10001, 100)
    RespID <- c(rep(117,25), rep(119,25), rep(120,25), rep(121,25))
    LinkingOrder <- floor(runif(100, 1, 9))
    df <- data.frame(StudyID, RespID, LinkingOrder)
    # Create columns C1 to C19
    for (i in c(1:19)){
      vari <- paste("C", i, sep = "")
      df[vari] <-  floor(runif(100, 0, 9))
    }
    
    # Convert RespID to categorical variable
    df$RespID <- as.factor(RespID)
    


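    # Fit lm() separately for each RespID and store the coefficients in a table
    final_table <- data.frame()
    for (respid in unique(df$RespID)){
      # Keep this respondent's rows, then drop the ID columns so that only
      # LinkingOrder and C1..C19 remain
      data <- df[df$RespID == respid, ]
      data <- subset(data, select = -c(StudyID, RespID))
    
      # "." uses every remaining column (C1..C19) as a predictor
      lm.fit <- lm(LinkingOrder ~ ., data=data)
    
      # Append one row of coefficients (intercept plus C1..C19) per RespID
      final_table <- rbind(final_table, data.frame(t(unlist(lm.fit$coefficients))))
    }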
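
The loop above stores linear coefficients, and a straight-line fit has no interior maximum. If "regression peak" is taken to mean the maximum of a fitted curve (an assumption about what the question wants), one possible sketch is to fit a separate quadratic of LinkingOrder on each column for each RespID and take the vertex, x = -b1 / (2 * b2), whenever the curve opens downward. The helper `quad_peak` is hypothetical, and the data are simulated in the same shape as above.

```r
# Hypothetical helper: vertex of a fitted quadratic y = b0 + b1*x + b2*x^2.
# Returns NA when the fit has no interior maximum (b2 is NA or >= 0).
quad_peak <- function(x, y) {
  b <- coef(lm(y ~ x + I(x^2)))
  if (anyNA(b) || b[3] >= 0) return(NA_real_)
  unname(-b[2] / (2 * b[3]))
}

# Simulated data in the same shape as the example above (values are random).
set.seed(42)
df <- data.frame(RespID = rep(c(117, 119, 120, 121), each = 25),
                 LinkingOrder = floor(runif(100, 1, 9)))
for (i in 1:19) df[[paste0("C", i)]] <- floor(runif(100, 0, 9))

# One row per RespID, one column per predictor, holding the peak location.
cols <- paste0("C", 1:19)
peaks <- t(sapply(split(df, df$RespID), function(d) {
  sapply(cols, function(v) quad_peak(d[[v]], d$LinkingOrder))
}))

# Sanity check on data with a known peak at x = 3.
x <- 0:8
quad_peak(x, 10 - (x - 3)^2)   # 3 (up to floating-point error)
```

Note that the vertex can fall outside the observed range of a column, and that `peaks` holds the location of the maximum rather than the fitted value there; the fitted value could be obtained by evaluating the quadratic at that location.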
    
