代码之家  ›  专栏  ›  技术社区  ›  xvlaze

在Python列表中获取公共对象实例的列表[duplicate]

  •  -3
  • xvlaze  · 技术社区  · 6 年前
    a = [1,2,3,4,5]
    b = [1,3,5,6]
    c = a and b
    print c
    

    实际产量: [1,3,5,6] [1,3,5]

    0 回复  |  直到 6 年前
        1
  •  417
  •   Mark Byers    14 年前

    如果顺序不重要,并且不需要担心重复,则可以使用“设置交集”:

    >>> a = [1,2,3,4,5]
    >>> b = [1,3,5,6]
    >>> list(set(a) & set(b))
    [1, 3, 5]
    
        2
  •  99
  •   Lodewijk    9 年前

    [x for x in a if x in b]

    或者“如果x值在B中,则所有x值都在A中”。

        3
  •  72
  •   River Marcus Downing    8 年前

    如果将两个列表中较大的一个转换为一个集合,则可以使用 intersection() :

    a = [1,2,3,4,5]
    b = [1,3,5,6]
    set(a).intersection(b)
    
        4
  •  31
  •   Alex Martelli    14 年前

    _auxset = set(a)
    

    然后,

    c = [x for x in b if x in _auxset]
    

    会做你想做的(保存 b 是点菜,不是 a 不一定能保存 二者都 )去做吧 快速的 if x in a 因为列表中的条件理解也会起作用,并且避免了构建的需要 _auxset 但不幸的是,对于相当长的列表,它将慢得多)。

    c = sorted(set(a).intersection(b))
    
        5
  •  21
  •   PM 2Ring    9 年前

    下面是一些python2/python3代码,它们为查找两个列表的交集的基于列表和基于集合的方法生成计时信息。

    纯列表理解算法是O(n^2),因为 in n 但基于小算法的速度更快 创建集合的开销使它们比纯列表comp算法慢。

    #!/usr/bin/env python
    
    ''' Time list- vs set-based list intersection
        See http://stackoverflow.com/q/3697432/4014959
        Written by PM 2Ring 2015.10.16
    '''
    
    from __future__ import print_function, division
    from timeit import Timer
    
    setup = 'from __main__ import a, b'
    cmd_lista = '[u for u in a if u in b]'
    cmd_listb = '[u for u in b if u in a]'
    
    cmd_lcsa = 'sa=set(a);[u for u in b if u in sa]'
    
    cmd_seta = 'list(set(a).intersection(b))'
    cmd_setb = 'list(set(b).intersection(a))'
    
    reps = 3
    loops = 50000
    
    def do_timing(heading, cmd, setup):
        t = Timer(cmd, setup)
        r = t.repeat(reps, loops)
        r.sort()
        print(heading, r)
        return r[0]
    
    m = 10
    nums = list(range(6 * m))
    
    for n in range(1, m + 1):
        a = nums[:6*n:2]
        b = nums[:6*n:3]
        print('\nn =', n, len(a), len(b))
        #print('\nn = %d\n%s %d\n%s %d' % (n, a, len(a), b, len(b)))
        la = do_timing('lista', cmd_lista, setup) 
        lb = do_timing('listb', cmd_listb, setup) 
        lc = do_timing('lcsa ', cmd_lcsa, setup)
        sa = do_timing('seta ', cmd_seta, setup)
        sb = do_timing('setb ', cmd_setb, setup)
        print(la/sa, lb/sa, lc/sa, la/sb, lb/sb, lc/sb)
    

    输出

    n = 1 3 2
    lista [0.082171916961669922, 0.082588911056518555, 0.0898590087890625]
    listb [0.069530963897705078, 0.070394992828369141, 0.075379848480224609]
    lcsa  [0.11858987808227539, 0.1188349723815918, 0.12825107574462891]
    seta  [0.26900982856750488, 0.26902294158935547, 0.27298116683959961]
    setb  [0.27218389511108398, 0.27459001541137695, 0.34307217597961426]
    0.305460649521 0.258469975867 0.440838458259 0.301898526833 0.255455833892 0.435697630214
    
    n = 2 6 4
    lista [0.15915989875793457, 0.16000485420227051, 0.16551494598388672]
    listb [0.13000702857971191, 0.13060092926025391, 0.13543915748596191]
    lcsa  [0.18650484085083008, 0.18742108345031738, 0.19513416290283203]
    seta  [0.33592700958251953, 0.34001994132995605, 0.34146714210510254]
    setb  [0.29436492919921875, 0.2953648567199707, 0.30039691925048828]
    0.473793098554 0.387009751735 0.555194537893 0.540689066428 0.441652573672 0.633583767462
    
    n = 3 9 6
    lista [0.27657914161682129, 0.28098297119140625, 0.28311991691589355]
    listb [0.21585917472839355, 0.21679902076721191, 0.22272896766662598]
    lcsa  [0.22559309005737305, 0.2271728515625, 0.2323150634765625]
    seta  [0.36382699012756348, 0.36453008651733398, 0.36750602722167969]
    setb  [0.34979605674743652, 0.35533690452575684, 0.36164689064025879]
    0.760194128313 0.59330170819 0.62005595016 0.790686848184 0.61710008036 0.644927481902
    
    n = 4 12 8
    lista [0.39616990089416504, 0.39746403694152832, 0.41129183769226074]
    listb [0.33485794067382812, 0.33914685249328613, 0.37850618362426758]
    lcsa  [0.27405810356140137, 0.2745978832244873, 0.28249192237854004]
    seta  [0.39211201667785645, 0.39234519004821777, 0.39317893981933594]
    setb  [0.36988520622253418, 0.37011313438415527, 0.37571001052856445]
    1.01034878821 0.85398540833 0.698928091731 1.07106176249 0.905302334456 0.740927452493
    
    n = 5 15 10
    lista [0.56792402267456055, 0.57422614097595215, 0.57740211486816406]
    listb [0.47309303283691406, 0.47619009017944336, 0.47628307342529297]
    lcsa  [0.32805585861206055, 0.32813096046447754, 0.3349759578704834]
    seta  [0.40036201477050781, 0.40322518348693848, 0.40548801422119141]
    setb  [0.39103078842163086, 0.39722800254821777, 0.43811702728271484]
    1.41852623806 1.18166313332 0.819398061028 1.45237674242 1.20986133789 0.838951479847
    
    n = 6 18 12
    lista [0.77897095680236816, 0.78187918663024902, 0.78467702865600586]
    listb [0.629547119140625, 0.63210701942443848, 0.63321495056152344]
    lcsa  [0.36563992500305176, 0.36638498306274414, 0.38175487518310547]
    seta  [0.46695613861083984, 0.46992206573486328, 0.47583580017089844]
    setb  [0.47616910934448242, 0.47661614418029785, 0.4850609302520752]
    1.66818870637 1.34819326075 0.783028414812 1.63591241329 1.32210827369 0.767878297495
    
    n = 7 21 14
    lista [0.9703209400177002, 0.9734041690826416, 1.0182771682739258]
    listb [0.82394003868103027, 0.82625699043273926, 0.82796716690063477]
    lcsa  [0.40975093841552734, 0.41210508346557617, 0.42286920547485352]
    seta  [0.5086359977722168, 0.50968098640441895, 0.51014018058776855]
    setb  [0.48688101768493652, 0.4879908561706543, 0.49204087257385254]
    1.90769222837 1.61990115188 0.805587768483 1.99293236904 1.69228211566 0.841583309951
    
    n = 8 24 16
    lista [1.204819917678833, 1.2206029891967773, 1.258256196975708]
    listb [1.014998197555542, 1.0206191539764404, 1.0343101024627686]
    lcsa  [0.50966787338256836, 0.51018595695495605, 0.51319599151611328]
    seta  [0.50310111045837402, 0.50556015968322754, 0.51335406303405762]
    setb  [0.51472997665405273, 0.51948785781860352, 0.52113485336303711]
    2.39478683834 2.01748351664 1.01305257092 2.34068341135 1.97190418975 0.990165516871
    
    n = 9 27 18
    lista [1.511646032333374, 1.5133969783782959, 1.5639569759368896]
    listb [1.2461750507354736, 1.254518985748291, 1.2613379955291748]
    lcsa  [0.5565330982208252, 0.56119203567504883, 0.56451296806335449]
    seta  [0.5966339111328125, 0.60275578498840332, 0.64791703224182129]
    setb  [0.54694414138793945, 0.5508568286895752, 0.55375313758850098]
    2.53362406013 2.08867620074 0.932788243907 2.76380331728 2.27843203069 1.01753187594
    
    n = 10 30 20
    lista [1.7777848243713379, 2.1453688144683838, 2.4085969924926758]
    listb [1.5070111751556396, 1.5202279090881348, 1.5779800415039062]
    lcsa  [0.5954139232635498, 0.59703707695007324, 0.60746097564697266]
    seta  [0.61563014984130859, 0.62125110626220703, 0.62354087829589844]
    setb  [0.56723213195800781, 0.57257509231567383, 0.57460403442382812]
    2.88774814689 2.44791645689 0.967161734066 3.13413984189 2.6567803378 1.04968299523
    

    在Debian风格的Linux上,使用2GHz单核机器和2GB RAM运行Python2.6.6(Firefox在后台运行)生成。

    这些数字只是一个粗略的指导,因为不同算法的实际速度受两个源列表中元素比例的影响不同。

        6
  •  11
  •   Delimitry COLD TOLD    6 年前
    a = [1,2,3,4,5]
    b = [1,3,5,6]
    c = list(set(a).intersection(set(b)))
    

        7
  •  6
  •   Saftography    5 年前

    可以使用 filter lambda 操作员。

    list1 = [1,2,3,4,5,6]
    
    list2 = [2,4,6,9,10]
    
    >>> list(filter(lambda x:x in list1, list2))
    
    [2, 4, 6]
    

    编辑:它过滤掉列表1和列表中都存在的x,也可以使用以下方法实现设置差异:

    >>> list(filter(lambda x:x not in list1, list2))
    [9,10]
    

    编辑2:蟒蛇3 滤波器 返回一个筛选器对象,用 list 返回输出列表。

        8
  •  1
  •   Arturo Morales Rangel    6 年前

    这是一个例子,当您需要结果中的每个元素在两个数组中显示的次数相同时。

    def intersection(nums1, nums2):
        #example:
        #nums1 = [1,2,2,1]
        #nums2 = [2,2]
        #output = [2,2]
        #find first 2 and remove from target, continue iterating
    
        target, iterate = [nums1, nums2] if len(nums2) >= len(nums1) else [nums2, nums1] #iterate will look into target
    
        if len(target) == 0:
                return []
    
        i = 0
        store = []
        while i < len(iterate):
    
             element = iterate[i]
    
             if element in target:
                   store.append(element)
                   target.remove(element)
    
             i += 1
    
    
        return store
    
        9
  •  1
  •   anabeto93    5 年前

    这可能晚了,但我只是想我应该分享的情况下,你需要手动做(显示工作-哈哈)或当你需要所有元素出现尽可能多次,或当你也需要它是唯一的。

    请注意,也为它编写了测试。

    
    
        from nose.tools import assert_equal
    
        '''
        Given two lists, print out the list of overlapping elements
        '''
    
        def overlap(l_a, l_b):
            '''
            compare the two lists l_a and l_b and return the overlapping
            elements (intersecting) between the two
            '''
    
            #edge case is when they are the same lists
            if l_a == l_b:
                return [] #no overlapping elements
    
            output = []
    
            if len(l_a) == len(l_b):
                for i in range(l_a): #same length so either one applies
                    if l_a[i] in l_b:
                        output.append(l_a[i])
    
                #found all by now
                #return output #if repetition does not matter
                return list(set(output))
    
            else:
                #find the smallest and largest lists and go with that
                sm = l_a if len(l_a)  len(l_b) else l_b
    
                for i in range(len(sm)):
                    if sm[i] in lg:
                        output.append(sm[i])
    
                #return output #if repetition does not matter
                return list(set(output))
    
        ## Test the Above Implementation
    
        a = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
        b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
        exp = [1, 2, 3, 5, 8, 13]
    
        c = [4, 4, 5, 6]
        d = [5, 7, 4, 8 ,6 ] #assuming it is not ordered
        exp2 = [4, 5, 6]
    
        class TestOverlap(object):
    
            def test(self, sol):
                t = sol(a, b)
                assert_equal(t, exp)
                print('Comparing the two lists produces')
                print(t)
    
                t = sol(c, d)
                assert_equal(t, exp2)
                print('Comparing the two lists produces')
                print(t)
    
                print('All Tests Passed!!')
    
        t = TestOverlap()
        t.test(overlap)
    
    
        10
  •  0
  •   Tim McNamara    14 年前

    如果布尔AND是指两个列表中都出现的项,例如交集,那么您应该看看Python的 set frozenset 类型。

        11
  •  0
  •   Atonal    4 年前

    你也可以使用计数器!它不保留顺序,但会考虑重复项:

    >>> from collections import Counter
    >>> a = [1,2,3,4,5]
    >>> b = [1,3,5,6]
    >>> d1, d2 = Counter(a), Counter(b)
    >>> c = [n for n in d1.keys() & d2.keys() for _ in range(min(d1[n], d2[n]))]
    >>> print(c)
    [1,3,5]
    
        12
  •  0
  •   Saisankar Gochhayat    4 年前

    这样可以得到两个列表的交集,也可以得到常见的重复项。

    >>> from collections import Counter
    >>> a = Counter([1,2,3,4,5])
    >>> b = Counter([1,3,5,6])
    >>> a &= b
    >>> list(a.elements())
    [1, 3, 5]