
Statsmodels ARIMA(0,1,2) results differ from Stata ARIMA(0,1,2)

  • Satish Chaudhary  · Tech Community  · 1 year ago

    When running an ARIMA analysis, the output from Stata 17 differs from the output of statsmodels.

    When I run

        from statsmodels.tsa.arima.model import ARIMA

        re = ARIMA(df_log, order = (0,1,2))
        print(re.fit().summary())
    
    The results were as follows:
    SARIMAX Results                                
    ==============================================================================
    Dep. Variable:                    GDP   No. Observations:                   62
    Model:                 ARIMA(0, 1, 2)   Log Likelihood                  48.459
    Date:                Thu, 11 May 2023   AIC                            -90.918
    Time:                        02:21:08   BIC                            -84.585
    Sample:                    01-01-1960   HQIC                           -88.436
                             - 01-01-2021                                         
    Covariance Type:                  opg                                         
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    ma.L1          0.4751      0.142      3.349      0.001       0.197       0.753
    ma.L2         -0.0500      0.151     -0.332      0.740      -0.345       0.245
    sigma2         0.0119      0.002      6.720      0.000       0.008       0.015
    ===================================================================================
    Ljung-Box (L1) (Q):                   4.11   Jarque-Bera (JB):                 3.62
    Prob(Q):                              0.04   Prob(JB):                         0.16
    Heteroskedasticity (H):               0.60   Skew:                             0.37
    Prob(H) (two-sided):                  0.27   Kurtosis:                         3.94
    ===================================================================================
    

    However, fitting the same model to the same data in Stata 17 gives the following results:

    arima log_gdp, arima(0,1,2)

    (setting optimization to BHHH)
    Iteration 0:   log likelihood =  51.833406  
    Iteration 1:   log likelihood =  58.219464  
    Iteration 2:   log likelihood =  59.750732  
    Iteration 3:   log likelihood =  60.128641  
    Iteration 4:   log likelihood =  60.183567  
    (switching optimization to BFGS)
    Iteration 5:   log likelihood =  60.191613  
    Iteration 6:   log likelihood =  60.192693  
    Iteration 7:   log likelihood =  60.192721  
    Iteration 8:   log likelihood =  60.192721  
    
    ARIMA regression
    
    Sample: 1961 thru 2021                          Number of obs     =         61
                                                    Wald chi2(2)      =      17.21
    Log likelihood = 60.19272                       Prob > chi2       =     0.0002
    
    ------------------------------------------------------------------------------
                 |                 OPG
       D.log_gdp | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    log_gdp      |
           _cons |   .0707899   .0085723     8.26   0.000     .0539885    .0875912
    -------------+----------------------------------------------------------------
    ARMA         |
              ma |
             L1. |   .1135653    .103465     1.10   0.272    -.0892223    .3163529
             L2. |  -.4008123   .1129969    -3.55   0.000    -.6222821   -.1793425
    -------------+----------------------------------------------------------------
          /sigma |   .0899162    .007283    12.35   0.000     .0756417    .1041907
    ------------------------------------------------------------------------------
    Note: The test of the variance against zero is one sided, and the two-sided
          confidence interval is truncated at zero.
    

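    The results differ, so I would appreciate an explanation if I am missing something. However, if I pass the first-differenced data to statsmodels and fit ARIMA(0,0,2) instead, the results match. I am using statsmodels version 0.13.5.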

    re = ARIMA(df_log.diff().dropna(), order = (0,0,2))
    print(re.fit().summary())
    
                      SARIMAX Results                                
    ==============================================================================
    Dep. Variable:                    GDP   No. Observations:                   61
    Model:                 ARIMA(0, 0, 2)   Log Likelihood                  60.193
    Date:                Thu, 11 May 2023   AIC                           -112.386
    Time:                        02:14:30   BIC                           -103.942
    Sample:                    01-01-1961   HQIC                          -109.076
                             - 01-01-2021                                         
    Covariance Type:                  opg                                         
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    const          0.0708      0.009      8.258      0.000       0.054       0.088
    ma.L1          0.1136      0.103      1.098      0.272      -0.089       0.316
    ma.L2         -0.4008      0.113     -3.548      0.000      -0.622      -0.179
    sigma2         0.0081      0.001      6.174      0.000       0.006       0.011
    ===================================================================================
    Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):                 2.57
    Prob(Q):                              0.99   Prob(JB):                         0.28
    Heteroskedasticity (H):               0.60   Skew:                             0.36
    Prob(H) (two-sided):                  0.26   Kurtosis:                         3.71
    ===================================================================================
    
  •   cfulton    1 year ago

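    The results differ because statsmodels does not automatically include a trend term when the model has differencing. You can see that the Stata results have an extra parameter (_cons).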
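
Written out, the two fits appear to correspond to the following models for the differenced series $\Delta y_t = y_t - y_{t-1}$. The symbols $\mu$, $\theta_1$, $\theta_2$ and $\varepsilon_t$ are notation introduced here for illustration and do not come from either package's output:

```latex
\begin{align*}
\text{Stata:} \quad
  & \Delta y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} \\
\text{statsmodels (default):} \quad
  & \Delta y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}
\end{align*}
```

Here $\mu$ plays the role of _cons in the Stata output (about 0.0708). The default statsmodels fit effectively fixes it at zero, so the MA coefficients and the log likelihood are estimated for a different model.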

    If you specify a model with a trend, the results match closely:

    re = ARIMA(df_log, order = (0,1,2), trend='t')
    print(re.fit().summary())
    

    This gives:

                                   SARIMAX Results                                
    ==============================================================================
    Dep. Variable:                  value   No. Observations:                   62
    Model:                 ARIMA(0, 1, 2)   Log Likelihood                  59.730
    Date:                Fri, 12 May 2023   AIC                           -111.461
    Time:                        22:58:31   BIC                           -103.017
    Sample:                    12-31-1960   HQIC                          -108.152
                             - 12-31-2021                                         
    Covariance Type:                  opg                                         
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    x1             0.0707      0.009      8.270      0.000       0.054       0.087
    ma.L1          0.1026      0.106      0.969      0.333      -0.105       0.310
    ma.L2         -0.3959      0.112     -3.525      0.000      -0.616      -0.176
    sigma2         0.0082      0.001      6.255      0.000       0.006       0.011
    ===================================================================================
    Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):                 2.42
    Prob(Q):                              0.98   Prob(JB):                         0.30
    Heteroskedasticity (H):               0.58   Skew:                             0.31
    Prob(H) (two-sided):                  0.22   Kurtosis:                         3.76
    ===================================================================================
    
    Warnings:
    [1] Covariance matrix calculated using the outer product of gradients (complex-step).
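
A rough way to see why a linear trend in the levels lines up with Stata's _cons: with one order of differencing, a linear time trend becomes a constant in the first differences, while a plain constant is differenced away. The sketch below uses only the standard library and a made-up series. The slope 0.07 and intercept 10.0 are illustrative assumptions, not the GDP data. It also checks that Stata's /sigma and the statsmodels sigma2 describe the same quantity on different scales, using the figures reported in the question.

```python
# Illustrative values only: not the GDP data from the question.
slope = 0.07      # hypothetical drift per period
intercept = 10.0  # hypothetical level at t = 0


def diff(series):
    """First difference of a list, similar to pandas' .diff().dropna()."""
    return [b - a for a, b in zip(series, series[1:])]


# A linear trend in levels: y_t = intercept + slope * t
levels = [intercept + slope * t for t in range(10)]
d_levels = diff(levels)

# Every first difference equals the slope (up to floating-point error), so
# the trend coefficient in levels is the constant of the differenced series.
trend_becomes_constant = all(abs(d - slope) < 1e-9 for d in d_levels)

# A pure constant in levels vanishes after differencing, so it carries
# no information once d = 1.
constant_vanishes = all(d == 0.0 for d in diff([intercept] * 10))

# Stata reports a standard deviation (/sigma); statsmodels reports a
# variance (sigma2). Squaring Stata's estimate from the question's output:
stata_sigma = 0.0899162
sigma2_from_stata = stata_sigma ** 2

print(trend_becomes_constant, constant_vanishes, round(sigma2_from_stata, 4))
```

The squared Stata estimate rounds to 0.0081, which agrees with the sigma2 shown in the ARIMA(0,0,2) fit on the differenced data.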