
Pandas apply with a custom function results in a segmentation fault

  • Amrith Krishna  · 6 years ago

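    I am getting a segmentation fault when applying a custom function, myTrial, to a pandas DataFrame (named pd in the code below).

    Calling the function on a single row, or applying it to the first two rows, works:

    myTrial(pd.ix[3,:])
    newPD2 = pd.head(2).apply(myTrial, axis=1)

    Applying it to the first three rows kills the kernel:

    newPD2 = pd.head(3).apply(myTrial, axis=1)
    The kernel appears to have died. It will restart automatically.

    myTrial uses pairwise2.align.globalmx from BioPython.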

    I have a DataFrame with 10,000 rows and 8 columns. I am using a server with 256 GB of RAM.

    from Bio import pairwise2
    def myTrial(pdf):
        source = pdf['source']
        targ = pdf['target']
    
        if source == targ:
            pdf['sourceAlign'] = source
            pdf['targetAlign'] = source
            pdf['joint'] = source
    
            return pdf
    
        alignments = pairwise2.align.globalmx(source, targ,1,-0.5)
        summaDict = dict()
        for item in alignments:
            lenList = list()
            i = 0
            while i < len(item[0]):
                con = 0
                # Stop at the end of the alignment to avoid an IndexError
                while i < len(item[0]) and item[0][i] == item[1][i]:
                    con += 1
                    i += 1
    
                if con == 0:
                    i += 1
                else:
                    lenList.append((con,item[0][i-con:i],item))
                    con =0
    
            summa = 0
            for thing in lenList:
                summa += (thing[0]*thing[0])
            summaDict.setdefault(summa, []).append(lenList)
        stuff = sorted(summaDict.keys(),reverse=True)[0]
    
        if len(summaDict[stuff]) > 1:
            print(source,targ,summaDict[stuff])
    
        words = summaDict[stuff][0][0][2]
    
        jointWord = ''
        for inda in range(len(words[0])):
            if words[0][inda] == words[1][inda]:
                jointWord += words[0][inda]
            else:
                if words[0][inda] != '-':
                    jointWord += 'DEL('+words[0][inda]+')'
                if words[1][inda] != '-':
                    jointWord += 'INS('+words[1][inda]+')'
    
        pdf['sourceAlign'] = words[0]
        pdf['targetAlign'] = words[1]
        pdf['joint'] = jointWord
    
        return pdf
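
The final loop of myTrial, which builds the joint string, does not depend on Biopython, so it can be exercised on its own to rule out the pure-Python part as the source of the crash. The following is a minimal sketch under that assumption; the function name build_joint and the hand-written aligned strings are hypothetical and not part of the original code.

```python
def build_joint(aligned_source, aligned_target, gap="-"):
    """Merge two aligned strings into one string with DEL(...) / INS(...) markers.

    Mirrors the last loop of myTrial: matching characters are copied as-is,
    a character present only in the source becomes DEL(c), and a character
    present only in the target becomes INS(c).
    """
    parts = []
    for s, t in zip(aligned_source, aligned_target):
        if s == t:
            parts.append(s)
        else:
            if s != gap:
                parts.append("DEL(" + s + ")")
            if t != gap:
                parts.append("INS(" + t + ")")
    return "".join(parts)


# Hand-written alignment for illustration only (not produced by pairwise2).
print(build_joint("najprzytulniejszy-", "najprzytulniejszym"))
# prints: najprzytulniejszyINS(m)
```

If this runs cleanly on strings containing characters such as ć or ś, the failure is more likely to lie in the alignment call itself.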
    

      | type | source            | props           | target             | subtype | p0   | p1  | p2   | p3   | p4
    0 | ADJ  | najprzytulniejszy | [NEUT, INS, SG] | najprzytulniejszym | NaN     | NEUT | INS | SG   | None | None
    1 | ADJ  | sadystyczny       | [MASC, DAT, SG] | sadystycznemu      | NaN     | MASC | DAT | SG   | None | None
    2 | V    | wyrzucić          | [FUT, 2, SG]    | wyrzucisz          | NaN     | FUT  | 2   | SG   | None | None
    3 | N    | świat             | [ACC, SG]       | świat              | NaN     | ACC  | SG  | None | None | None
    4 | N    | Marsjanin         | [INS, PL]       | Marsjanami         | NaN     | INS  | PL  | None | None | None
    

    [I 19:16:45.709 NotebookApp] Kernel restarted: 604e9df5-6630-4a12-9c13-e9d7a4835da2
    [I 19:17:00.710 NotebookApp] KernelRestarter: restarting kernel (1/5)
    WARNING:root:kernel 604e9df5-6630-4a12-9c13-e9d7a4835da2 restarted
    

    What could be the cause?

    1 answer

  • Han Altae-Tran  · 6 years ago

    The segfault comes from Biopython's pairwise2.align.globalmx function, which is outside pandas' control. See: Biopython pairwise2 for non-ASCII strings
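
If non-ASCII input is indeed the trigger, as the linked question suggests, the sample rows in the question are consistent with it. Rows 0 and 1 are plain ASCII. Row 3 contains ś, but its source and target are equal, so myTrial returns before the aligner is called. Row 2 (wyrzucić) is the first row that would reach the aligner with a non-ASCII character, which matches head(2) working and head(3) failing. The sketch below only flags such rows and does not call Biopython; the function name rows_with_non_ascii is hypothetical, and str.isascii assumes Python 3.7 or later.

```python
def rows_with_non_ascii(pairs):
    """Return indices of (source, target) pairs that differ and contain non-ASCII text."""
    flagged = []
    for index, (source, target) in enumerate(pairs):
        if source == target:
            # myTrial returns early for identical strings, before any alignment.
            continue
        if not (source.isascii() and target.isascii()):
            flagged.append(index)
    return flagged


# (source, target) values copied from the sample table in the question.
sample = [
    ("najprzytulniejszy", "najprzytulniejszym"),
    ("sadystyczny", "sadystycznemu"),
    ("wyrzucić", "wyrzucisz"),
    ("świat", "świat"),
    ("Marsjanin", "Marsjanami"),
]

print(rows_with_non_ascii(sample))
# prints: [2]
```

On the full DataFrame, the same check could be run over the source and target columns to list the rows worth testing individually.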