
scipy smart optimize

  • Ruggero Turra  ·  15 years ago

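    I need to fit points from several datasets with straight lines, one line per dataset. That gives me parameters ai and bi describing the i-th line: ai + bi*x. The problem is that I want to force every ai to be equal, because I want all lines to share the same intercept. I found a tutorial here: http://www.scipy.org/Cookbook/FittingData#head-a44b49d57cf0165300f765e8f1b011876776502f . The difference is that I don't know a priori how many datasets I have. My code is: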

    from numpy import *
    from scipy import optimize
    
    # here I have 3 datasets, but in general I don't know how many there are
    ypoints = [array([0, 2.1, 2.4]),    # first dataset, 3 points
               array([0.1, 2.1, 2.9]),  # second dataset
               array([-0.1, 1.4])]      # only 2 points
    
    xpoints = [array([0, 2, 2.5]),      # first dataset
               array([0, 2, 3]),        # second, also x coordinates are different
               array([0, 1.5])]         # the first coordinate is always 0
    
    fitfunc = lambda a, b, x: a + b * x
    errfunc = lambda p, xs, ys: array([ yi - fitfunc(p[0], p[i+1], xi) 
                                        for i, (xi,yi) in enumerate(zip(xs, ys)) ])
    
    
    p_arrays = [r_[0.]] * len(xpoints)
    pinit = r_[[ypoints[0][0]] + p_arrays]
    fit_parameters, success = optimize.leastsq(errfunc, pinit, args = (xpoints, ypoints))
    

    I get:

    Traceback (most recent call last):
      File "prova.py", line 19, in <module>
        fit_parameters, success = optimize.leastsq(errfunc, pinit, args = (xpoints,    ypoints))
      File "/usr/lib64/python2.6/site-packages/scipy/optimize/minpack.py", line 266, in  leastsq
        m = check_func(func,x0,args,n)[0]
      File "/usr/lib64/python2.6/site-packages/scipy/optimize/minpack.py", line 12, in  check_func
        res = atleast_1d(thefunc(*((x0[:numinputs],)+args)))
      File "prova.py", line 14, in <lambda>
        for i, (xi,yi) in enumerate(zip(xs, ys)) ])
    ValueError: setting an array element with a sequence.
    
    2 answers
  •   Josef    15 years ago

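    If you just need a linear fit, it is better to estimate it with linear regression than with a nonlinear optimizer. More fit statistics can be obtained by using scikits.statsmodels instead.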

    import numpy as np
    from numpy import array
    
    ypoints = np.r_[array([0, 2.1, 2.4]),    # first dataset, 3 points
               array([0.1, 2.1, 2.9]),  # second dataset
               array([-0.1, 1.4])]      # only 2 points
    
    xpoints = [array([0, 2, 2.5]),      # first dataset
               array([0, 2, 3]),        # second, also x coordinates are different
               array([0, 1.5])]         # the first coordinate is always 0
    
    xp = np.hstack(xpoints)
    indicator = []
    for i,a in enumerate(xpoints):
        indicator.extend([i]*len(a))
    
    indicator = np.array(indicator)
    
    
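    # one slope column per dataset: the x value where the row belongs to that dataset, else 0
    x = xp[:,None]*(indicator[:,None]==np.arange(len(xpoints))).astype(int)
    # prepend a column of ones for the shared intercept
    x = np.hstack((np.ones((xp.shape[0],1)),x))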
    
    print(np.dot(np.linalg.pinv(x), ypoints))
    # [ 0.01947973  0.98656987  0.98481549  0.92034684]
    

    The regressor matrix has one common intercept column, plus a separate slope column for each dataset:

    >>> x
    array([[ 1. ,  0. ,  0. ,  0. ],
           [ 1. ,  2. ,  0. ,  0. ],
           [ 1. ,  2.5,  0. ,  0. ],
           [ 1. ,  0. ,  0. ,  0. ],
           [ 1. ,  0. ,  2. ,  0. ],
           [ 1. ,  0. ,  3. ,  0. ],
           [ 1. ,  0. ,  0. ,  0. ],
           [ 1. ,  0. ,  0. ,  1.5]])
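
    Since the question says the number of datasets is not known in advance, the same construction can be wrapped in a function that sizes the matrix from the input. This is only a sketch: the name fit_common_intercept is hypothetical, and it uses np.linalg.lstsq rather than the pinv call shown above, which should give the same least-squares solution here.

```python
import numpy as np

def fit_common_intercept(xs, ys):
    """Fit y = a + b_i * x to each dataset i with one shared intercept a.

    xs, ys: lists of 1-D arrays, one pair per dataset (hypothetical helper).
    Returns (a, slopes) where slopes[i] is the slope of dataset i.
    """
    n = len(xs)
    x_all = np.concatenate(xs)
    y_all = np.concatenate(ys)
    # indicator[k] is the index of the dataset that point k belongs to
    indicator = np.repeat(np.arange(n), [len(x) for x in xs])
    # one slope column per dataset: x where the row belongs to it, else 0
    slope_cols = x_all[:, None] * (indicator[:, None] == np.arange(n))
    # leading column of ones carries the shared intercept
    design = np.column_stack((np.ones(len(x_all)), slope_cols))
    coef, _, _, _ = np.linalg.lstsq(design, y_all, rcond=None)
    return coef[0], coef[1:]

xpoints = [np.array([0, 2, 2.5]), np.array([0, 2, 3]), np.array([0, 1.5])]
ypoints = [np.array([0, 2.1, 2.4]), np.array([0.1, 2.1, 2.9]), np.array([-0.1, 1.4])]

intercept, slopes = fit_common_intercept(xpoints, ypoints)
print(intercept, slopes)
```

    With the three datasets from the question this should reproduce the coefficients printed above, up to rounding.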
    
  •   Alex Martelli    15 years ago

    (Side note: use def, not lambda assigned to a name. That is silly and has nothing but downsides; lambda's only use is for anonymous functions!)

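    Your errfunc should return a sequence (array or otherwise) of floating point numbers, but it doesn't: each item you put into the array is itself an array, namely the difference between one dataset's y points (remember, ypoints a.k.a. ys is a list of arrays!) and the fit function's results. Because those arrays have different lengths, NumPy cannot build a regular array from them. So you need to "collapse" the expression yi - fitfunc(p[0], p[i+1], xi) to a single floating point number, e.g. norm(yi - fitfunc(p[0], p[i+1], xi)) .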
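
    A minimal sketch of a repaired script, with one deliberate difference from the norm suggestion above: the per-dataset residual arrays are concatenated into one flat array instead of being collapsed to one number each. leastsq needs at least as many residuals as parameters, and one norm per dataset would give only 3 residuals for the 4 parameters in this example. fitfunc and errfunc are written with def, as the side note recommends.

```python
import numpy as np
from scipy import optimize

ypoints = [np.array([0, 2.1, 2.4]),
           np.array([0.1, 2.1, 2.9]),
           np.array([-0.1, 1.4])]
xpoints = [np.array([0, 2, 2.5]),
           np.array([0, 2, 3]),
           np.array([0, 1.5])]

def fitfunc(a, b, x):
    return a + b * x

def errfunc(p, xs, ys):
    # p[0] is the shared intercept, p[i + 1] is the slope of dataset i
    residuals = [yi - fitfunc(p[0], p[i + 1], xi)
                 for i, (xi, yi) in enumerate(zip(xs, ys))]
    # flatten to a single 1-D float array, one entry per data point
    return np.concatenate(residuals)

# initial guess: first y value as intercept, zero slope for every dataset
pinit = np.r_[ypoints[0][0], np.zeros(len(xpoints))]
fit_parameters, ier = optimize.leastsq(errfunc, pinit, args=(xpoints, ypoints))
print(fit_parameters)
```

    Because the model is linear in its parameters, this should converge to the same values as the linear regression in the other answer; the optimizer route mainly pays off if the model later becomes nonlinear.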