代码之家  ›  专栏  ›  技术社区  ›  ap21

用单变量样条拟合数据

  •  2
  • ap21  · 技术社区  · 6 年前

    我有一堆x,y点代表一个sigmoid函数:

    x=[ 1.00094909  1.08787635  1.17481363  1.2617564   1.34867881  1.43562284
      1.52259341  1.609522    1.69631283  1.78276102  1.86426648  1.92896789
      1.9464453   1.94941586  2.00062852  2.073691    2.14982808  2.22808316
      2.30634034  2.38456905  2.46280126  2.54106611  2.6193345   2.69748825]
    y=[-0.10057627 -0.10172142 -0.10320428 -0.10378959 -0.10348456 -0.10312503
     -0.10276956 -0.10170055 -0.09778279 -0.08608644 -0.05797392  0.00063599
      0.08732999  0.16429878  0.2223306   0.25368884  0.26830932  0.27313931
      0.27308756  0.27048902  0.26626313  0.26139534  0.25634544  0.2509893 ]
    

    Data

    scipy.interpolate.UnivariateSpline() 按如下方式拟合某些三次样条曲线:

    from scipy.interpolate import UnivariateSpline
    s = UnivariateSpline(x, y, k=3, s=0)
    
    xfit = np.linspace(x.min(), x.max(), 200)
    plt.scatter(x,y)
    plt.plot(xfit, s(xfit))
    plt.show()
    

    这就是我得到的: Fit

    既然我指定了 s=0 k 价值导致更多的摇摆。

    1. 如何正确使用 符合我的数据?更准确地说,如何使样条曲线最小化其摆动?
    2. 对于这种sigmoidal函数,这是正确的选择吗?我应该用这样的东西吗 scipy.optimize.curve_fit() tanh(x) 而不是功能?
    2 回复  |  直到 6 年前
        1
  •  1
  •   James Phillips    6 年前

    这说明了将两部分数据拟合到不同函数的结果,下半部分用X<2.0和上半部分到所有数据,X>=1.9,以便拟合曲线的数据存在重叠。代码在重叠区域的中心从一个方程切换到另一个方程,X=1.95。

    combined_model.png

    import numpy, matplotlib
    import matplotlib.pyplot as plt
    
    xData=numpy.array([ 1.00094909,  1.08787635,  1.17481363,  1.2617564,   1.34867881,  1.43562284,
      1.52259341,  1.609522,    1.69631283,  1.78276102,  1.86426648,  1.92896789,
      1.9464453,   1.94941586,  2.00062852,  2.073691,    2.14982808,  2.22808316,
      2.30634034,  2.38456905,  2.46280126,  2.54106611,  2.6193345,   2.69748825])
    yData=numpy.array([-0.10057627, -0.10172142, -0.10320428, -0.10378959, -0.10348456, -0.10312503,
     -0.10276956, -0.10170055, -0.09778279, -0.08608644, -0.05797392,  0.00063599,
      0.08732999,  0.16429878,  0.2223306,   0.25368884,  0.26830932,  0.27313931,
      0.27308756,  0.27048902,  0.26626313,  0.26139534,  0.25634544,  0.2509893 ])
    
    
    # function for x < 1.95 (fitted up to 2.0 for overlap)
    def lowerFunc(x_in): # Bleasdale-Nelder Power With Offset
        # coefficients
        a = -1.1431476643503597E+03
        b = 3.3819340844164983E+21
        c = -6.3633178925040745E+01
        d = 3.1481973843740194E+00
        Offset = -1.0300724909782859E-01
    
        temp = numpy.power(a + b * numpy.power(x_in, c), -1.0 / d)
        temp += Offset
        return temp
    
    # function for x >= 1.95 (fitted down to 1.9 for overlap)
    def upperFunc(x_in): # rational equation with Offset
        # coefficients
        a = -2.5294212380048242E-01
        b = 1.4262697377369586E+00
        c = -2.6141935706529118E-01
        d = -8.8730045918252121E-02
        Offset = -4.8283287597672708E-01
    
        temp = (a * numpy.power(x_in, 2) + b * numpy.log(x_in)) # numerator
        temp /= (1.0 + c * numpy.power(numpy.log(x_in), -1) + d * numpy.exp(x_in)) # denominator
        temp += Offset
        return temp
    
    
    def combinedFunc(x_in):
        returnVal = []
        for x in x_in:
            if x < 1.95:
                returnVal.append(lowerFunc(x))
            else:
                returnVal.append(upperFunc(x))
        return returnVal
    
    
    modelPredictions = combinedFunc(xData) 
    
    absError = modelPredictions - yData
    
    SE = numpy.square(absError) # squared errors
    MSE = numpy.mean(SE) # mean squared errors
    RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
    Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
    print('RMSE:', RMSE)
    print('R-squared:', Rsquared)
    
    
    ##########################################################
    # graphics output section
    def ModelAndScatterPlot(graphWidth, graphHeight):
        f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
        axes = f.add_subplot(111)
    
        # first the raw data as a scatter plot
        axes.plot(xData, yData,  'D')
    
        # create data for the fitted equation plot
        xModel = numpy.linspace(min(xData), max(xData))
        yModel = combinedFunc(xModel)
    
        # now the model as a line plot
        axes.plot(xModel, yModel)
    
        axes.set_xlabel('X Data') # X axis data label
        axes.set_ylabel('Y Data') # Y axis data label
    
        plt.show()
        plt.close('all') # clean up after using pyplot
    
    graphWidth = 800
    graphHeight = 600
    ModelAndScatterPlot(graphWidth, graphHeight)
    
        2
  •  2
  •   Cleb    6 年前

    你可以和我一起玩 s ,例如 s=0.005 ,情节看起来是这样的(仍然不是非常漂亮,但你可以进一步调整):

    enter image description here

    但我确实会使用一个“适当”的函数和fit,例如。 curve_fit . 下面的函数仍然不是理想的,因为它是单调递增的,所以我们忽略了末尾的递减;情节如下:

    enter image description here

    这是样条曲线和实际拟合的完整代码:

    from scipy.interpolate import UnivariateSpline
    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.optimize import curve_fit
    
    
    def func(x, ymax, n, k, c):
        return ymax * x ** n / (k ** n + x ** n) + c
    
    x=np.array([ 1.00094909,  1.08787635,  1.17481363,  1.2617564,   1.34867881,  1.43562284,
      1.52259341,  1.609522,    1.69631283,  1.78276102,  1.86426648,  1.92896789,
      1.9464453,   1.94941586,  2.00062852,  2.073691,    2.14982808,  2.22808316,
      2.30634034,  2.38456905,  2.46280126,  2.54106611,  2.6193345,   2.69748825])
    y=np.array([-0.10057627, -0.10172142, -0.10320428, -0.10378959, -0.10348456, -0.10312503,
     -0.10276956, -0.10170055, -0.09778279, -0.08608644, -0.05797392,  0.00063599,
      0.08732999,  0.16429878,  0.2223306,   0.25368884,  0.26830932,  0.27313931,
      0.27308756,  0.27048902,  0.26626313,  0.26139534,  0.25634544,  0.2509893 ])
    
    
    popt, pcov = curve_fit(func, x, y, p0=[y.max(), 2, 2, -0.1], bounds=([0, 0, 0, -0.2], [0.4, 45, 2000, 10]))
    xfit = np.linspace(x.min(), x.max(), 200)
    plt.scatter(x, y)
    plt.plot(xfit, func(xfit, *popt))
    plt.show()
    
    s = UnivariateSpline(x, y, k=3, s=0.005)
    
    xfit = np.linspace(x.min(), x.max(), 200)
    plt.scatter(x, y)
    plt.plot(xfit, s(xfit))
    plt.show()
    

    第三种选择是使用更高级的函数,该函数还可以在结束和结束时重现减少的值 differential_evolution 适合的;这似乎最适合:

    enter image description here

    代码如下(使用与上述相同的数据):

    from scipy.optimize import curve_fit, differential_evolution    
    
    def sigmoid_with_decay(x, a, b, c, d, e, f):
    
        return a * (1. / (1. + np.exp(-b * (x - c)))) * (1. / (1. + np.exp(d * (x - e)))) + f
    
    def error_sigmoid_with_decay(parameters, x_data, y_data):
    
        return np.sum((y_data - sigmoid_with_decay(x_data, *parameters)) ** 2)
    
    res = differential_evolution(error_sigmoid_with_decay,
                                 bounds=[(0, 10), (0, 25), (0, 10), (0, 10), (0, 10), (-1, 0.1)],
                                 args=(x, y),
                                 seed=42)
    
    xfit = np.linspace(x.min(), x.max(), 200)
    plt.scatter(x, y)
    plt.plot(xfit, sigmoid_with_decay(xfit, *res.x))
    plt.show()