
SARIMAX python np.linalg.linalg.LinAlgError: LU decomposition error

  • user3104352  · 6 years ago

    I have a problem with time series analysis. I have a dataset with 5 features. Here is a subset of my input dataset:

    date,price,year,day,totaltx
    1/1/2016 0:00,434.46,2016,1,126762
    1/2/2016 0:00,433.59,2016,2,147449
    1/3/2016 0:00,430.36,2016,3,148661
    1/4/2016 0:00,433.49,2016,4,185279
    1/5/2016 0:00,432.25,2016,5,178723
    1/6/2016 0:00,429.46,2016,6,184207
    

    My endogenous data is the price column and my exogenous data is the totaltx column.

    This is the code I am running, which raises an error:

    import statsmodels.api as sm
    import pandas as pd
    import numpy as np
    from numpy.linalg import LinAlgError
    
    def arima(filteredData, coinOutput, window, horizon, trainLength):
        start_index = 0
        end_index = 0
        inputNumber = filteredData.shape[0]
        predictions = np.array([], dtype=np.float32)
        prices = np.array([], dtype=np.float32)
        # sliding on time series data with 1 day step
        while ((end_index) < inputNumber - 1):
            end_index = start_index + trainLength
            trainFeatures = filteredData[start_index:end_index]["totaltx"]
            trainOutput = coinOutput[start_index:end_index]["price"]
    
            arima = sm.tsa.statespace.SARIMAX(endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0))
            arima_fit = arima.fit(disp=0)
            testdata=filteredData[end_index:end_index+1]["totaltx"]
            total_sample = end_index-start_index
            predicted = arima_fit.predict(start=total_sample, end=total_sample, exog=np.array(testdata.values).reshape(-1,1))
            price = coinOutput[end_index:end_index + 1]["price"].values
    
            predictions = np.append(predictions, predicted)
            prices = np.append(prices, price)
    
            start_index = start_index + 1
        return predictions, prices
    
    def processCoins(bitcoinPrice, window, horizon):
        output = bitcoinPrice[horizon:][["date", "day", "year", "price"]]
        return output
    
    trainLength = 100
    for window in [3,5]:
        for horizon in [1,2,5,7,10]:
            bitcoinPrice = pd.read_csv("..\\prices.csv", sep=",")
            coinOutput = processCoins(bitcoinPrice, window, horizon)
            predictions, prices = arima(bitcoinPrice, coinOutput, window, horizon, trainLength)
    

    In this code I am using a rolling-window regression technique. I train ARIMA on start_index:end_index and predict on the test data at end_index:end_index+1.

    This is the error raised by my code:
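
    The windowing described above can be sketched on its own, independent of the model. The helper name `rolling_windows` and its parameters are hypothetical, introduced only for illustration; it assumes a window that slides by one row and a single test row immediately after each training slice:

```python
def rolling_windows(n_samples, train_length):
    """Yield (train_start, train_end, test_index) for a window sliding by one row.

    Hypothetical helper: training rows are [train_start, train_end),
    and the single test row is train_end.
    """
    start = 0
    # Stop once the one-step-ahead test row would fall outside the data.
    while start + train_length < n_samples:
        end = start + train_length
        yield start, end, end
        start += 1


windows = list(rolling_windows(n_samples=6, train_length=4))
print(windows)  # [(0, 4, 4), (1, 5, 5)]
```

    Separating the index arithmetic from the model fit makes it easier to check that every test row actually exists before a model is trained on the preceding slice.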

    Traceback (most recent call last):
      File "C:/PycharmProjects/coinLogPrediction/src/arima.py", line 115, in <module>
        predictions, prices = arima(filteredBitcoinPrice, coinOutput, window, horizon, trainLength, outputFile)
      File "C:/PycharmProjects/coinLogPrediction/src/arima.py", line 64, in arima
        arima_fit = arima.fit(disp=0)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 469, in fit
        skip_hessian=True, **kwargs)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\model.py", line 466, in fit
        full_output=full_output)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\optimizer.py", line 191, in _fit
        hess=hessian)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\optimizer.py", line 410, in _fit_lbfgs
        **extra_kwargs)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 193, in fmin_l_bfgs_b
        **opts)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 328, in _minimize_lbfgsb
        f, g = func_and_grad(x)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 273, in func_and_grad
        f = fun(x, *args)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\scipy\optimize\optimize.py", line 292, in function_wrapper
        return function(*(wrapper_args + args))
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\base\model.py", line 440, in f
        return -self.loglike(params, *args) / nobs
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\mlemodel.py", line 646, in loglike
        loglike = self.ssm.loglike(complex_step=complex_step, **kwargs)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\kalman_filter.py", line 825, in loglike
        kfilter = self._filter(**kwargs)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\kalman_filter.py", line 747, in _filter
        self._initialize_state(prefix=prefix, complex_step=complex_step)
      File "C:\AppData\Local\Continuum\Anaconda3\lib\site-packages\statsmodels\tsa\statespace\representation.py", line 723, in _initialize_state
        self._statespaces[prefix].initialize_stationary(complex_step)
      File "_representation.pyx", line 1351, in statsmodels.tsa.statespace._representation.dStatespace.initialize_stationary
      File "_tools.pyx", line 1151, in statsmodels.tsa.statespace._tools._dsolve_discrete_lyapunov
    numpy.linalg.linalg.LinAlgError: LU decomposition error.
    
    1 answer
  •   cfulton    6 years ago

    This looks like a bug. In the meantime, you can work around it by using a different initialization, like this:

    arima = sm.tsa.statespace.SARIMAX(
        endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0),
        initialization='approximate_diffuse')
    

    If you get a chance, please file a bug report at https://github.com/statsmodels/statsmodels/issues/new !