
Statsmodels ARIMA(0,1,2) results differ from Stata ARIMA(0,1,2)

arima statsmodels stata python

Satish Chaudhary · Tech Community · 1 year ago

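When running an ARIMA analysis, the output from Stata 17 differs from the output of statsmodels.

When I run

    from statsmodels.tsa.arima.model import ARIMA
    re = ARIMA(df_log, order=(0, 1, 2))
    print(re.fit().summary())

the results are as follows: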
SARIMAX Results                                
==============================================================================
Dep. Variable:                    GDP   No. Observations:                   62
Model:                 ARIMA(0, 1, 2)   Log Likelihood                  48.459
Date:                Thu, 11 May 2023   AIC                            -90.918
Time:                        02:21:08   BIC                            -84.585
Sample:                    01-01-1960   HQIC                           -88.436
                         - 01-01-2021                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
ma.L1          0.4751      0.142      3.349      0.001       0.197       0.753
ma.L2         -0.0500      0.151     -0.332      0.740      -0.345       0.245
sigma2         0.0119      0.002      6.720      0.000       0.008       0.015
===================================================================================
Ljung-Box (L1) (Q):                   4.11   Jarque-Bera (JB):                 3.62
Prob(Q):                              0.04   Prob(JB):                         0.16
Heteroskedasticity (H):               0.60   Skew:                             0.37
Prob(H) (two-sided):                  0.27   Kurtosis:                         3.94
===================================================================================

However, fitting the same model to the same data in Stata 17 gives the following:

    arima log_gdp, arima(0,1,2)

(setting optimization to BHHH)
Iteration 0:   log likelihood =  51.833406  
Iteration 1:   log likelihood =  58.219464  
Iteration 2:   log likelihood =  59.750732  
Iteration 3:   log likelihood =  60.128641  
Iteration 4:   log likelihood =  60.183567  
(switching optimization to BFGS)
Iteration 5:   log likelihood =  60.191613  
Iteration 6:   log likelihood =  60.192693  
Iteration 7:   log likelihood =  60.192721  
Iteration 8:   log likelihood =  60.192721  

ARIMA regression

Sample: 1961 thru 2021                          Number of obs     =         61
                                                Wald chi2(2)      =      17.21
Log likelihood = 60.19272                       Prob > chi2       =     0.0002

------------------------------------------------------------------------------
             |                 OPG
   D.log_gdp | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
log_gdp      |
       _cons |   .0707899   .0085723     8.26   0.000     .0539885    .0875912
-------------+----------------------------------------------------------------
ARMA         |
          ma |
         L1. |   .1135653    .103465     1.10   0.272    -.0892223    .3163529
         L2. |  -.4008123   .1129969    -3.55   0.000    -.6222821   -.1793425
-------------+----------------------------------------------------------------
      /sigma |   .0899162    .007283    12.35   0.000     .0756417    .1041907
------------------------------------------------------------------------------
Note: The test of the variance against zero is one sided, and the two-sided
      confidence interval is truncated at zero.

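The results are different, so please explain if I am missing something. However, if I pass the first-differenced data to statsmodels and specify the model as ARIMA(0,0,2), the results match. I am using statsmodels version 0.13.5.

    re = ARIMA(df_log.diff().dropna(), order=(0, 0, 2))
    print(re.fit().summary())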

                  SARIMAX Results                                
==============================================================================
Dep. Variable:                    GDP   No. Observations:                   61
Model:                 ARIMA(0, 0, 2)   Log Likelihood                  60.193
Date:                Thu, 11 May 2023   AIC                           -112.386
Time:                        02:14:30   BIC                           -103.942
Sample:                    01-01-1961   HQIC                          -109.076
                         - 01-01-2021                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0708      0.009      8.258      0.000       0.054       0.088
ma.L1          0.1136      0.103      1.098      0.272      -0.089       0.316
ma.L2         -0.4008      0.113     -3.548      0.000      -0.622      -0.179
sigma2         0.0081      0.001      6.174      0.000       0.006       0.011
===================================================================================
Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):                 2.57
Prob(Q):                              0.99   Prob(JB):                         0.28
Heteroskedasticity (H):               0.60   Skew:                             0.36
Prob(H) (two-sided):                  0.26   Kurtosis:                         3.71
===================================================================================
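
One remaining apparent mismatch between these two tables is only a matter of scale: Stata reports /sigma, the standard deviation of the disturbance, while statsmodels reports sigma2, its variance. A small check using only the numbers printed above (the variable names are made up for illustration):

```python
import math

# Values copied from the two outputs above.
stata_sigma = 0.0899162       # Stata "/sigma": a standard deviation
statsmodels_sigma2 = 0.0081   # statsmodels "sigma2": a variance, printed to 4 decimals

# Squaring Stata's standard deviation puts both on the variance scale.
stata_variance = stata_sigma ** 2
print(f"Stata sigma squared: {stata_variance:.4f}")

# Allow for the rounding in the statsmodels summary table.
agrees = math.isclose(stata_variance, statsmodels_sigma2, abs_tol=5e-5)
print("Matches statsmodels sigma2 after rounding:", agrees)
```

So the two programs agree on the innovation variance as well as on the coefficients.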


cfulton · 1 year ago

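The results differ because statsmodels does not automatically include a trend term when the model involves differencing. You can see that the Stata results have one additional parameter, the _cons term.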
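
The extra parameter can also be read off the summaries in the question. A minimal sketch, assuming the usual definition AIC = 2k - 2 logL with k the number of estimated parameters (the helper name is hypothetical, the numbers are copied from the question's statsmodels output):

```python
# (log-likelihood, AIC) pairs copied from the two statsmodels summaries in the question.
reported = {
    "ARIMA(0,1,2) on levels, no trend": (48.459, -90.918),
    "ARIMA(0,0,2) on differenced data": (60.193, -112.386),
}


def implied_param_count(loglike, aic):
    """Solve AIC = 2k - 2*logL for k, the number of estimated parameters."""
    return round((aic + 2 * loglike) / 2)


param_counts = {
    name: implied_param_count(loglike, aic)
    for name, (loglike, aic) in reported.items()
}

for name, k in param_counts.items():
    print(f"{name}: {k} estimated parameters")
```

The first fit has three parameters (ma.L1, ma.L2, sigma2), while the second has four, matching the four rows in the Stata table (_cons, two MA terms, /sigma).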

If you specify a model with a trend, the results match closely:

    re = ARIMA(df_log, order=(0, 1, 2), trend='t')
    print(re.fit().summary())

This gives:

                               SARIMAX Results                                
==============================================================================
Dep. Variable:                  value   No. Observations:                   62
Model:                 ARIMA(0, 1, 2)   Log Likelihood                  59.730
Date:                Fri, 12 May 2023   AIC                           -111.461
Time:                        22:58:31   BIC                           -103.017
Sample:                    12-31-1960   HQIC                          -108.152
                         - 12-31-2021                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.0707      0.009      8.270      0.000       0.054       0.087
ma.L1          0.1026      0.106      0.969      0.333      -0.105       0.310
ma.L2         -0.3959      0.112     -3.525      0.000      -0.616      -0.176
sigma2         0.0082      0.001      6.255      0.000       0.006       0.011
===================================================================================
Ljung-Box (L1) (Q):                   0.00   Jarque-Bera (JB):                 2.42
Prob(Q):                              0.98   Prob(JB):                         0.30
Heteroskedasticity (H):               0.58   Skew:                             0.31
Prob(H) (two-sided):                  0.22   Kurtosis:                         3.76
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
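
Why a linear trend in the levels plays the role of Stata's constant can be sketched in a few lines. This assumes the trend enters as a regression term on the undifferenced series with ARIMA(0,1,2) errors; the symbols below (beta, u_t, epsilon_t, theta_1, theta_2) are notation introduced here, not taken from either program's output.

```latex
\begin{aligned}
y_t &= \beta t + u_t, \qquad
\Delta u_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}, \\
\Delta y_t &= \beta\,[\,t - (t-1)\,] + \Delta u_t
            = \beta + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}.
\end{aligned}
```

Under that assumption the trend coefficient (x1 = 0.0707 above) is the mean of the differenced series, which is what Stata reports as _cons (0.0707899). A plain constant in the levels would cancel out in the first difference, so there is nothing to estimate for it when d = 1.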