代码之家  ›  专栏  ›  技术社区  ›  SidC

如何从SQL Server R存储过程返回列名和摘要值?

  •  2
  • SidC  · 技术社区  · 6 年前

    我创建了一个存储过程,它使用 dplyr 用于分组的库 StudyID ProductNumber . 我想返回字段c1-c8中每个值的平均值,以及相同字段的标准差。

    我的存储过程如下:

    ALTER PROCEDURE [dbo].[spCodeMeans]
        @StudyID INT,
        @StudyID_outer INT OUT
    AS
    BEGIN
        SET NOCOUNT ON;
    
        DECLARE @inquery NVARCHAR(MAX) = N'Select
            c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
            c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
            c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
            c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
            c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
            from ClosedStudyResponses c
            --Sensory Value Attributes only for mean and standard deviation analytics.
            where VariableAttributeID = 1
            and c.StudyID = 22'
            ;
    
        BEGIN TRY
            EXEC sp_execute_external_script
                    @language = N'R',
                    @script = N'
            library(dplyr)
                OutputDataSet <- InputDataSet %>%
                    group_by (StudyID, ProductNumber) %>%
                    summarise(c1_mean = mean(c1), c2_mean = mean(c2), c3_mean = mean(c3), c4_mean = mean(c4), c5_mean = mean(c5), c6_mean = mean(c6), 
                    c7_mean = mean(c7), c8_mean = mean(c8), c1_sd = sd(c1), c2_sd = sd(c2), c3_sd = sd(c3), c4_sd = sd(c4), c5_sd = sd(c5), c6_sd = sd(c6), 
                    c7_sd = sd(c7), c8_sd = sd(c8)) %>%
                `colnames<-`(c("StudyID", "ProductNumber","c1_mean","c2_mean","c3_mean","c4_mean","c5_mean","c6_mean","c7_mean",
                "c8_mean","c1_sd","c2_sd","c3_sd","c4_sd","c5_sd","c6_sd","c7_sd","c8_sd"))',
                    @input_data_1 = @inquery
        END TRY
        BEGIN CATCH
            THROW;
        END CATCH
    

    结果不包括列名,也不包括字段c1到c8的平均值和标准偏差。如何调整语法才能完成此任务?

    更新: 根据建议,我修改了存储过程,将结果插入到临时表中。修改后的语法如下:

    ALTER PROCEDURE [dbo].[spCodeMeans]
    -- Add the parameters for the stored procedure here
    @StudyID int,
    @StudyID_outer int OUT
    
    
    AS
    BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    
    --Create temptable to store the outputdataset
    Create table #temp_table (
        StudyID int,
        ProductNumber int,
        c1_mean decimal,
        c2_mean decimal,
        c3_mean decimal,
        c4_mean decimal,
        c5_mean decimal,
        c6_mean decimal,
        c7_mean decimal,
        c8_mean decimal,
    
        c1_sd decimal,
        c2_sd decimal,
        c3_sd decimal,
        c4_sd decimal,
        c5_sd decimal,
        c6_sd decimal,
        c7_sd decimal,
        c8_sd decimal
    );
    
    -- Insert statements for procedure here
    Declare @inquery nvarchar(max) = N'Select
            c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
            c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
            c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
            c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
            c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
            from ClosedStudyResponses c
            --Sensory Value Attributes only for mean and standard deviation analytics.
            where VariableAttributeID = 1
            and c.StudyID = 22'
            ;
     BEGIN TRY
            Insert into #temp_table
            exec sp_execute_external_script
            @language = N'R',
            @script = N'
            library(dplyr)
                OutputDataSet <- InputDataSet %>%
                    group_by (StudyID, ProductNumber) %>%
                    summarise_all(.funs=c(mean, sd)) %>%
                    setNames(c("StudyID","ProductNumber",
                    paste0("c",1:8, "_mean"),
                    paste0("c",1:8, "_sd")))
                ',
    @input_data_1 = @inquery,
    @output_data_1 = N'OutputDataSet';
    
    END TRY
    
    BEGIN CATCH
        THROW;
    END CATCH
    
    Select * from #temp_table;
    END
    

    当我试图运行该过程时,我收到一个错误,它指出 Procedure expects parameter '@params' of type 'ntext/nchar/nvarchar'. 注意,我已经将@inquery声明为nvarchar(max)。我又忽略了一步吗?

    更新2: 我花了一些时间重新处理存储过程,发现输出需要采用data.frame格式。我相应地修改了存储过程的r部分。现在我的列名出现了,但是没有返回数据、平均值或标准差。当前存储过程如下:

    ALTER PROCEDURE [dbo].[spCodeMeans]
    -- Add the parameters for the stored procedure here
    @StudyID int
    
    AS
    BEGIN
    
    
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    
    
    
    -- Insert statements for procedure here
    Declare @sStudy varchar(50)
    Set @sStudy = Convert(Varchar(50),@StudyID)
    Declare @inquery nvarchar(max) = N'Select
            c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
            c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
            c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
            c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
            c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
            from ClosedStudyResponses c
            --Sensory Value Attributes only for mean and standard deviation analytics.
            where VariableAttributeID = 1
            and c.StudyID =' +@sStudy ;
    
    BEGIN TRY
            --Insert into CodeMeans
            exec sp_execute_external_script
            @language = N'R',
            @script = N'
            library(dplyr)
            codemeans <- function(StudyID){
                res <- InputDataSet %>%
                    group_by (StudyID, ProductNumber) %>%
                    summarise_all(.funs=c(mean, sd)) %>%
                    setNames(c("StudyID","ProductNumber",
                    paste0("c",1:8, "_mean"),
                    paste0("c",1:8, "_sd")))
                df <- data.frame(res)
                }
                ',
    @input_data_1 = @inquery,
    @output_data_1_name = N'df',
    @params = N'@StudyID int',
    @StudyID = @StudyID
    
    END TRY
    
    BEGIN CATCH
        THROW;
    END CATCH
    
    Select * from CodeMeans;
    END
    

    因此,在这一点上,我正在寻求如何从基本查询返回输出以及均值和标准差的指导。

    1 回复  |  直到 6 年前
        1
  •  0
  •   Parfait    6 年前

    目前,您没有定义输出集,也没有将返回插入到预定义表。使用临时表考虑以下调整(根据需要调整下面的类型)。也可以考虑使用 dplyr::summarise_all() 对于多个聚合调用, setNames 保持管道的流动,以及 paste 避免名字重复。

    ALTER PROCEDURE [dbo].[spCodeMeans]
    -- Add the parameters for the stored procedure here
    @StudyID int,
    @StudyID_outer int OUT
    
    AS
    BEGIN
       -- SET NOCOUNT ON added to prevent extra result sets from
       -- interfering with SELECT statements.
       SET NOCOUNT ON;
    
    
       CREATE TABLE #tmp_table -- LOCAL TEMP TABLE (EXISTS ONLY IN SESSION)
       (
            StudyID INTEGER,
            ProductNumber VARCHAR(50),
            c1_mean DECIMAL(10,2), c2_mean DECIMAL(10,2), c3_mean DECIMAL(10,2), c4_mean DECIMAL(10,2),
            c5_mean DECIMAL(10,2), c6_mean DECIMAL(10,2), c7_mean DECIMAL(10,2), c8_mean DECIMAL(10,2),
            c1_sd DECIMAL(10,2), c2_sd DECIMAL(10,2), c3_sd DECIMAL(10,2), c4_sd DECIMAL(10,2),
            c5_sd DECIMAL(10,2), c6_sd DECIMAL(10,2), c7_sd DECIMAL(10,2), c8_sd DECIMAL(10,2)
       );
    
       DECLARE @inquery nvarchar(max) = -- ...same as before...
    
       BEGIN TRY
            -- ACTION QUERY
            INSERT INTO @temp_table
            EXEC sp_execute_external_script
            @language = N'R',
            @script = N'
            library(dplyr)
              OutputDataSet <- InputDataSet %>%
                group_by (StudyID, ProductNumber) %>%
                summarise_all(.funs=c(mean, sd)) %>%
                setNames(c("StudyID", "ProductNumber", 
                           paste0("c", 1:8, "_mean"), 
                           paste0("c", 1:8, "_sd")))',
            @input_data_1 = @inquery,
            @output_data_1 = N'OutputDataSet';
    
       END TRY
    
       BEGIN CATCH
           THROW;
       END CATCH
    
       -- RESULTSET QUERY
       SELECT * FROM @temp_table;
    
    END