代码之家  ›  专栏  ›  技术社区  ›  E235

如何生成5个字符的字符串组合(两个不同的数字、两个相等的字母和一个字母)而不重复

  •  1
  • E235  · 技术社区  · 6 年前

    similar question 不同的 问题。
    我正在尝试生成一个由5个字符组成的字符串的组合 不同的 但当我试着这么做的时候我得到了复制品。

    对的 组合:

    82ccb  
    b8cc7  
    7c6dc  
    

    例如 组合:

    22ddc  -> incorrect because the digits are equal and should be different
    46ddd  -> incorrect because there are more than 2 equal letters
    2t4cd  -> No 2 equal letters + 2 equal different letters  
    

    LETTERS = 'bcdfghjklmnpqrstvwxz'
    DIGITS = '2456789'
    
    def aab12(letters=LETTERS, digits=DIGITS):
        """Generate the distinct 5-character strings consisting of three
        letters (two are equal and a repeated letter) and two digits (each one is different from the other).
    
        """
        letterdxs = set(range(5))
        combs = []
        for (d1, d2), (i, j), (l1, l2) in product(
                permutations(digits, 2),       # 2 digits (a repeated).
                combinations(range(5), 2),     # Positions for the 1st and 2nd digits.
                permutations(letters, 2)):     # 2 letters (a repeated).
    
            x, y, z = letterdxs.difference((i, j))
            s = set((x, y, z))
            # Choosing 2 positions for the repeated letters
            c1 = combinations((x, y, z), 2) 
            for c in c1:
                result = []
                result[i:i] = d1,
                result[j:j] = d2,
                result[c[0]:c[0]] = l1,
                result[c[1]:c[1]] = l1,
                # Choosing position for the last letter. This is position that was left
                letter_indx = (s.difference(c)).pop()
                result[letter_indx:letter_indx] = l2,
                combs.append(''.join(result))
        # Should be 478,800
        print(len(combs))
        return combs
    
    def is_contain_dup(combos):
        s = set(combos)
        if len(s) != len(combos):
           print('found duplicates !')
    
    is_contain_dup(aab12())
    

    我有复制品,虽然长度还可以。
    此函数基于以下数学:
    enter image description here

    • 为不同的数字选择两个位置
    • 为重复的字母选择2个位置
    • 从上一个字母中选择不同的字母

    我不确定是什么导致了重复,但这可能是由于选择了两个相等的字母+不同的字母。

    3 回复  |  直到 6 年前
        1
  •  2
  •   dawg    6 年前

    这是一个纯暴力、天真的方法,有4个嵌套循环:

    LETTERS = 'bcdfghjklmnpqrstvwxz'
    DIGITS = '2456789'
    from itertools import permutations
    
    def aab12_1(letters=LETTERS, digits=DIGITS):
        st=[]
        for fc in letters:
            for sc in letters:
                if sc==fc: continue
                for n1 in digits:
                    for n2 in digits:
                        if n1==n2: continue 
                        st.append(''.join((fc,fc,sc,n1,n2)))
        di={e:[''.join(t) for t in permutations(e)] for e in st}    
        return {s for sl in di.values() for s in sl}    
    
    >>> r=aab12_1()
    >>> len(r)
    478800
    

    这已经 O(n**4) 复杂性;即, 真的很糟糕

    您可以通过对生成的基本字符串进行排序来减少对的重复调用,从而降低复杂性 permutations :

    def aab12_2(letters=LETTERS, digits=DIGITS):
        st=set()
        for fc in letters:
            for sc in letters:
                if sc==fc: continue
                for n1 in digits:
                    for n2 in digits:
                        if n1==n2: continue 
                        st.add(''.join(sorted((fc,fc,sc,n1,n2))))
    
        di={e:[''.join(t) for t in permutations(e)] for e in st}    
        return {s for sl in di.values() for s in sl}    
    

    from itertools import permutations, product, combinations
    
    def aab12_3(letters=LETTERS, digits=DIGITS):
        let_combo=[x+y for x,y in product([e+e for e in letters],letters) if x[0]!=y]   
        n_combos={a+b for a,b in combinations(digits,2)}
        di={e:[''.join(t) for t in permutations(e)] for e in (x+y for x,y in product(let_combo, n_combos))} 
        return {s for sl in di.values() for s in sl} 
    

    这仍然隐含着 O(n**3) 与3 products() 这相当于每个循环的嵌套循环。每个 O

    所以,让我们做个基准测试。下面是上面的3个函数:Ajax1234的递归函数和Rory Daulton的itertools函数:

    from itertools import combinations, permutations, product
    
    def aab12_1(letters=LETTERS, digits=DIGITS):
        st=[]
        for fc in letters:
            for sc in letters:
                if sc==fc: continue
                for n1 in digits:
                    for n2 in digits:
                        if n1==n2: continue 
                        st.append(''.join((fc,fc,sc,n1,n2)))
        di={e:[''.join(t) for t in permutations(e)] for e in st}    
        return {s for sl in di.values() for s in sl}    
    
    def aab12_2(letters=LETTERS, digits=DIGITS):
        st=set()
        for fc in letters:
            for sc in letters:
                if sc==fc: continue
                for n1 in digits:
                    for n2 in digits:
                        if n1==n2: continue 
                        st.add(''.join(sorted((fc,fc,sc,n1,n2))))
        di={e:[''.join(t) for t in permutations(e)] for e in st}    
        return {s for sl in di.values() for s in sl}    
    
    
    def aab12_3(letters=LETTERS, digits=DIGITS):
        let_combo=[x+y for x,y in product([e+e for e in letters],letters) if x[0]!=y]   
        n_combos={a+b for a,b in combinations(digits,2)}
        di={e:[''.join(t) for t in permutations(e)] for e in (x+y for x,y in product(let_combo, n_combos))} 
        return {s for sl in di.values() for s in sl} 
    
    def aab12_4():
    # Ajax1234 recursive approach
        def validate(val, queue, counter):
            if not queue:
                return True
            if val.isdigit():
                return sum(i.isdigit() for i in queue) < 2 and val not in queue
            _sum = sum(i.isalpha() for i in counter)
            return _sum < 3 and counter.get(val, 0) < 2
    
        def is_valid(_input):
            d = Counter(_input)
            return sum(i.isdigit() for i in d) == 2 and sum(i.isalpha() for i in d) == 2
    
        def combinations(d, current = []):
            if len(current) == 5:
                yield ''.join(current)
            else:
                for i in d:
                    if validate(i, current, Counter(current)):
                        yield from combinations(d, current+[i])
    
        return [i for i in combinations(DIGITS+LETTERS) if is_valid(i)]
    
    
    def aab12_5(letters=LETTERS, digits=DIGITS):
        """ Rory Daulton
        Generate the distinct 5-character strings consisting of three
        letters (two are equal and a repeated letter) and two digits (each
        one is different from the other).
        """
        indices = range(5)  # indices for the generated 5-char strings
        combs = []
        for (letterdbl, lettersngl), (digit1, digit2), (indx1, indx2, indx3) in (
                product(permutations(letters, 2),
                combinations(digits, 2),
                permutations(indices, 3))):
            charlist = [letterdbl] * 5
            charlist[indx1] = lettersngl
            charlist[indx2] = digit1
            charlist[indx3] = digit2
            combs.append(''.join(charlist))
        return combs    
    
    if __name__=='__main__':
        import timeit
        funcs=(aab12_1,aab12_2,aab12_3,aab12_4,aab12_5)
        di={f.__name__:len(set(f())) for f in funcs}
        print(di)
        for f in funcs:
            print("   {:^10s}{:.4f} secs".format(f.__name__, timeit.timeit("f()", setup="from __main__ import f", number=1))) 
    

    印刷品:

    {'aab12_1': 478800, 'aab12_2': 478800, 'aab12_3': 478800, 'aab12_4': 478800, 'aab12_5': 478800}
       aab12_1  0.6230 secs
       aab12_2  0.3433 secs
       aab12_3  0.3292 secs
       aab12_4  50.4786 secs
       aab12_5  0.2094 secs
    

        2
  •  2
  •   Ajax1234    6 年前

    可以创建递归函数:

    from collections import Counter
    LETTERS = 'bcdfghjklmnpqrstvwxz'
    DIGITS = '2456789'
    def validate(val, queue, counter):
      if not queue:
        return True
      if val.isdigit():
        return sum(i.isdigit() for i in queue) < 2 and val not in queue
      _sum = sum(i.isalpha() for i in counter)
      return _sum < 3 and counter.get(val, 0) < 2
    
    def is_valid(_input):
      d = Counter(_input)
      return sum(i.isdigit() for i in d) == 2 and sum(i.isalpha() for i in d) == 2
    
    def combinations(d, current = []):
      if len(current) == 5:
        yield ''.join(current)
      else:
        for i in d:
          if validate(i, current, Counter(current)):
            yield from combinations(d, current+[i])
    
    _r = [i for i in combinations(DIGITS+LETTERS) if is_valid(i)]
    print(len(_r))
    

    输出:

    478800
    
        3
  •  2
  •   Rory Daulton    6 年前

    这里有一个答案,它使用了多个函数 itertools 功能,以最大限度地提高速度。这将返回所需的478800个字符串,它们都是不同的。

    跑步 %timeit aab12() 在我的系统上给出时间结果

    391 ms ± 2.34 ms per loop
    

    """
    Strategy: Generate a permutation of 2 distinct characters, a
    combination of 2 distinct digits, and a permutation of 3 distinct
    indices in range(5). Then build a 5-char string of the first character
    (which will be the repeated one), use the first two indices to place
    the digits and the last index to place the non-repeated character.
    This yields a total of (20*19) * (7/1*6/2) * (5*4*3) = 478,800
    items.
    """
    from itertools import combinations, permutations, product
    
    LETTERS = 'bcdfghjklmnpqrstvwxz'
    DIGITS = '2456789'
    
    def aab12(letters=LETTERS, digits=DIGITS):
        """Generate the distinct 5-character strings consisting of three
        letters (two are equal and a repeated letter) and two digits (each
        one is different from the other).
        """
        indices = range(5)  # indices for the generated 5-char strings
        combs = []
        for (letterdbl, lettersngl), (digit1, digit2), (indx1, indx2, indx3) in (
                product(permutations(letters, 2),
                        combinations(digits, 2),
                        permutations(indices, 3))):
            charlist = [letterdbl] * 5
            charlist[indx1] = digit1
            charlist[indx2] = digit2
            charlist[indx3] = lettersngl
            combs.append(''.join(charlist))
        return combs
    

    列表中的前15个字符串是

    ['24cbb',
     '24bcb',
     '24bbc',
     '2c4bb',
     '2b4cb',
     '2b4bc',
     '2cb4b',
     '2bc4b',
     '2bb4c',
     '2cbb4',
     '2bcb4',
     '2bbc4',
     '42cbb',
     '42bcb',
     '42bbc']