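A table in my SQL Server 2017 database contains, in part, the following data:
My goal is to create a multivariate polynomial regression over the 19 columns, where LikingOrder is my dependent variable and each of the 19 column values for a given RespID is an independent variable.
The end result should be the highest regression value for each column C1 through C19 for each RespID. The final result should look like this:
I have read about polym and tried to use it in the script below: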

ALTER PROCEDURE [dbo].[spRegressionPeak]
@StudyID int
AS
BEGIN
Declare @sStudyID VARCHAR(50)
Set @sStudyID = CONVERT(VARCHAR(50),@StudyID)
--We use IsNull values to pass zeroes where an average wasn't calculated so that the polynomial regression can be calculated.
DECLARE @inquery AS NVARCHAR(MAX) = '
Select
c.StudyID, c.RespID, c.LikingOrder,
avg(C1) as C1, avg(C2) as C2, avg(C3) as C3, avg(C4) as C4, avg(C5) as C5,
avg(C6) as C6, avg(C7) as C7, avg(C8) as C8, avg(C9) as C9, avg(C10) as C10,
avg(C11) as C11, avg(C12) as C12, avg(C13) as C13, avg(C14) as C14, avg(C15) as C15,
avg(C16) as C16, avg(C17) as C17, avg(isnull(C18,0)) as C18, avg(C19) as C19
from ClosedStudyResponses c
where c.StudyID = @StudyID
group by StudyID, RespID, LikingOrder
order by RespID'
--We are setting @inquery aka InputDataSet to be our initial dataset.
--R Services requires that a data.frame be passed to any calculations being generated. As such, df is simply data framing the @inquery data.
--The res object holds the polynomial regression results by RespondentID and LikingOrder for each of the averages in the @inquery resultset.
EXEC sp_execute_external_script @language = N'R'
, @script = N'
studymeans <- InputDataSet
df <- data.frame(studymeans)
res1 <- lm(df$LikingOrder ~ polym(df$c1, df$c2, df$c3, df$c4, df$c5, df$c6, df$c7, df$c8, df$c9,
df$c10, df$c11, df$c12, df$c13, df$c14, df$c15, df$c16, df$c17, df$c18, df$c19, degree = 1, raw = TRUE))
res <- data.frame(res1)
'
, @input_data_1 = @inquery
, @output_data_1_name = N'res'
, @params = N'@StudyID int'
,@StudyID = @StudyID
--- Edit this line to handle the output data frame.
WITH RESULT SETS ((RespID int, res varchar(max)));
END;
Error in model.frame.default(formula = df$LikingOrder ~ polym(df$c1, df$c2,
:
variable lengths differ (found for 'polym(df$c1, df$c2, df$c3, df$c4, df$c5,
df$c6, df$c7, df$c8, df$c9, df$c10, df$c11, df$c12, df$c13, df$c14, df$c15,
df$c16, df$c17, df$c18, df$c19, degree = 1, raw = TRUE)')
Calls: source ... lm -> eval -> eval -> <Anonymous> -> model.frame.default
In addition: There were 19 warnings (use warnings() to see them)
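
One thing that may be worth checking, sketched here with made-up data rather than the real table: the query aliases the columns as C1 to C19 (upper case), while the R script refers to df$c1 to df$c19 (lower case). R names are case-sensitive, so df$c1 would be NULL, which could explain a "variable lengths differ" error. The toy data frame and the two-column model below are assumptions for illustration only, not the actual study data or a confirmed fix.

```r
# Hypothetical stand-in for InputDataSet; the values are made up.
set.seed(1)
df <- data.frame(
  LikingOrder = 1:10,
  C1 = rnorm(10),
  C2 = rnorm(10)
)

# Column names are case-sensitive: there is no column called "c1".
is.null(df$c1)   # TRUE
length(df$C1)    # 10

# Passing data = df lets the formula use the column names directly,
# so every variable is taken from the same data frame.
fit <- lm(LikingOrder ~ polym(C1, C2, degree = 1, raw = TRUE), data = df)

# An lm object cannot be passed to data.frame() directly, so the
# coefficients are collected into a data frame by hand.
out <- data.frame(term = names(coef(fit)), estimate = unname(coef(fit)))
print(out)
```

If the case mismatch is indeed the cause, the same pattern would extend to all 19 columns by listing C1 through C19 inside polym, and a data frame built from the coefficients would need a matching WITH RESULT SETS definition.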