
Gamma regression with intercept only

  •  3
  • Rafael Díaz  · Tech Community  · 6 years ago

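    I'm new to Python and I'm trying to fit a gamma regression. I expect to get estimates similar to R's, but I don't understand the Python syntax and my code produces an error. Any ideas on how to fix it?

    My R code: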

    set.seed(1)
    y = rgamma(18,10,.1)
    print(y)
    [1]  76.67251 140.40808 138.26660 108.20993  53.46417 110.61754 119.11950 113.57558  85.82045  71.96892
    [11]  76.81693  86.00139  93.62010  69.49795 121.99775 114.18707 125.43608 120.63640
    
    # Option 1
    model = glm(y~1,family=Gamma)
    summary(model)
    
    # Option 2
    # x = rep(1,18)
    # summary(glm(y~x,family=Gamma))
    

    Output:

    summary(model)
    
    Call:
    glm(formula = y ~ 1, family = Gamma)
    
    Deviance Residuals: 
         Min        1Q    Median        3Q       Max  
    -0.57898  -0.24017   0.07637   0.17489   0.34345  
    
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
    (Intercept) 0.009856   0.000581   16.96 4.33e-12 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    (Dispersion parameter for Gamma family taken to be 0.06255708)
    
        Null deviance: 1.1761  on 17  degrees of freedom
    Residual deviance: 1.1761  on 17  degrees of freedom
    AIC: 171.3
    
    Number of Fisher Scoring iterations: 4
    

    Python code:

    y = [76.67251,140.40808,138.26660,108.20993,53.46417,110.61754,
     119.11950,113.57558,85.82045,71.96892,76.81693,86.00139,
     93.62010,69.49795,121.99775,114.18707,125.43608,120.63640]
    
    x = np.repeat(1,18)
    
    import numpy
    import statsmodels.api as sm
    
    model = sm.GLM(x,y, family=sm.families.Gamma()).fit()
    print(model.summary())
    

    I expect output similar to R's.

    2 replies  |  as of 6 years ago
        1
  •  3
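  •   Katia    6 years ago

    You need to swap the order of the x and y variables in your Python code: sm.GLM takes the response (endog) first and the design matrix (exog) second. Then you will see exactly the same results (although the number of significant digits shown differs from R's output):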

     sm.GLM(y,x, family=sm.families.Gamma()).fit().summary()
    
    <class 'statsmodels.iolib.summary.Summary'>
    """
                     Generalized Linear Model Regression Results
    ==============================================================================
    Dep. Variable:                      y   No. Observations:                   18
    Model:                            GLM   Df Residuals:                       17
    Model Family:                   Gamma   Df Model:                            0
    Link Function:          inverse_power   Scale:                 0.0625558699706
    Method:                          IRLS   Log-Likelihood:                -83.656
    Date:                Sun, 20 May 2018   Deviance:                       1.1761
    Time:                        17:59:04   Pearson chi2:                     1.06
    No. Iterations:                     4
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    const          0.0099      0.001     16.963      0.000       0.009       0.011
    ==============================================================================
    """
    

    Each Python package has its own syntax. Here is a good link with examples of how to use formula syntax in Python: http://www.statsmodels.org/dev/example_formulas.html

        2
  •  1
  •   Rafael Díaz    6 years ago

    Here is another approach that uses a formula; for this you need to import statsmodels.formula.api:

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf
    
    y = [76.67251,140.40808,138.26660,108.20993,53.46417,110.61754,
     119.11950,113.57558,85.82045,71.96892,76.81693,86.00139,
     93.62010,69.49795,121.99775,114.18707,125.43608,120.63640]
    
    df = pd.DataFrame({'y':y})
    
    model = smf.glm(formula = 'y ~ 1', data = df, family=sm.families.Gamma()).fit()
    model.summary()
    <class 'statsmodels.iolib.summary.Summary'>
    """
                     Generalized Linear Model Regression Results                  
    ==============================================================================
    Dep. Variable:                      y   No. Observations:                   18
    Model:                            GLM   Df Residuals:                       17
    Model Family:                   Gamma   Df Model:                            0
    Link Function:          inverse_power   Scale:                        0.062556
    Method:                          IRLS   Log-Likelihood:                -83.656
    Date:                Sun, 20 May 2018   Deviance:                       1.1761
    Time:                        22:00:54   Pearson chi2:                     1.06
    No. Iterations:                     6   Covariance Type:             nonrobust
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept      0.0099      0.001     16.963      0.000       0.009       0.011
    ==============================================================================
    """