
How to split a string into chunks, where each chunk has a different length - Python

  •  1
  • U13-Forward  · Tech Community  · 4 years ago

    Suppose I have this string:

    a = 'abcdefghijklmnopqrstuvwxyz'
    

    I want to split this string into chunks of increasing length, like this:

    ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
    

    So far, I have tried the following code:

    print([a[i: i + i + 1] for i in range(len(a))])
    

    But it outputs:

    ['a', 'bc', 'cde', 'defg', 'efghi', 'fghijk', 'ghijklm', 'hijklmno', 'ijklmnopq', 'jklmnopqrs', 'klmnopqrstu', 'lmnopqrstuvw', 'mnopqrstuvwxy', 'nopqrstuvwxyz', 'opqrstuvwxyz', 'pqrstuvwxyz', 'qrstuvwxyz', 'rstuvwxyz', 'stuvwxyz', 'tuvwxyz', 'uvwxyz', 'vwxyz', 'wxyz', 'xyz', 'yz', 'z']
    

    This is not the result I want.

        1
  •  5
  •   cs95 abhishek58g    4 years ago

    I don't think any one-liner or for-loop will look elegant here, so let's use a generator:

    from itertools import islice, count
    
    def get_increasing_chunks(s):
        it = iter(s)
        c = count(1)
    
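        # next(it, '') gives an empty first chunk for empty input instead of raising StopIteration
        nxt, c_ = next(it, ''), next(c)
        while nxt:
            # Pad to the expected width; only the last chunk can come up short
            yield nxt.ljust(c_)
            # Take c_ + 1 characters for the next chunk, then advance the counter to match
            nxt, c_ = ''.join(islice(it, c_+1)), next(c)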
    
    [*get_increasing_chunks(a)]
    # ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
    
        2
  •  4
  •   U13-Forward    4 years ago

    Thanks to @Prune's comment, I finally came up with a way to solve this:

    a = 'abcdefghijklmnopqrstuvwxyz'
    lst = []
    c = 0
    for i in range(1, len(a) + 1):
        c += i
        lst.append(c)
    print([a[x: y] + ' ' * (i - len(a[x: y])) for i, (x, y) in enumerate(zip([0] + lst, lst), 1) if a[x: y]])    
        
    

    Output:

    ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
    

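    I compute the triangular numbers (1, 3, 6, 10, ...) to use as slice boundaries, then build the chunks with a list comprehension and pad with spaces if a chunk is too short.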
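
The same idea can be sketched with itertools.accumulate producing the triangular numbers lazily. This is only an illustration: the function name triangular_chunks is made up for this example and does not appear in the answer above.

```python
from itertools import accumulate, count

def triangular_chunks(text):
    """Split text into chunks of length 1, 2, 3, ...; pad the last one with spaces."""
    chunks = []
    start = 0
    # accumulate(count(1)) lazily yields the triangular numbers 1, 3, 6, 10, ...
    for size, stop in enumerate(accumulate(count(1)), 1):
        if start >= len(text):
            break
        # Slicing past the end is safe; ljust pads a short final chunk.
        chunks.append(text[start:stop].ljust(size))
        start = stop
    return chunks

print(triangular_chunks('abcdefghijklmnopqrstuvwxyz'))
# ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
```

Because the boundaries are generated on demand, no list of len(a) cumulative sums has to be built first.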

        3
  •  3
  •   Arturo    4 years ago

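    So what you need is one number that controls how many characters to grab (here, the iteration count), a second number that remembers the last index, and a third that tells you where to stop.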

    my_str = "abcdefghijklmnopqrstuvwxyz"
    last_index = 0
    index = 1
    iter_count = 1
    
    while True:
        sub_string = my_str[last_index:index]
        print(sub_string)
        last_index = index
        iter_count += 1
        index = index + iter_count
        if last_index >= len(my_str):
            break
    

    Note that you don't need the while loop. I was just feeling lazy.
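
One possible way to drop the while loop is a bounded for loop over the chunk size. This is a sketch, not the answer's own code: the function name chunks_with_for is hypothetical, and it collects the chunks in a list instead of printing them.

```python
def chunks_with_for(my_str):
    """Collect chunks of length 1, 2, 3, ... using a bounded for loop."""
    chunks = []
    last_index = 0
    # A string of n characters never needs more than n chunks.
    for iter_count in range(1, len(my_str) + 1):
        if last_index >= len(my_str):
            break
        index = last_index + iter_count
        chunks.append(my_str[last_index:index])
        last_index = index
    return chunks

print(chunks_with_for("abcdefghijklmnopqrstuvwxyz"))
# ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz']
```

Note that the last chunk is not padded here, matching what the loop above prints.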

        4
  •  3
  •   jkr    4 years ago

    It looks like the split_into recipe from more_itertools can do this, much like the answer by @cs95, but perhaps this will help others discover the itertools module.

    >>> list(split_into([1,2,3,4,5,6], [1,2,3]))
    [[1], [2, 3], [4, 5, 6]]
    

    To use it, we need to build a list of sizes. Here it should be [1, 2, 3, 4, 5, 6, 7].

    import itertools
    
    def split_into(iterable, sizes):
        it = iter(iterable)
        for size in sizes:
            if size is None:
                yield list(it)
                return
            else:
                yield list(itertools.islice(it, size))
    
    a = 'abcdefghijklmnopqrstuvwxyz'
    
    sizes = [1]
    # Grow the size list until the sizes cover the whole string
    while sum(sizes) < len(a):
        next_value = sizes[-1] + 1
        sizes.append(next_value)
    # sizes = [1, 2, 3, 4, 5, 6, 7]
    
    list(split_into(a, sizes))
    
    # [['a'],
    #  ['b', 'c'],
    #  ['d', 'e', 'f'],
    #  ['g', 'h', 'i', 'j'],
    #  ['k', 'l', 'm', 'n', 'o'],
    #  ['p', 'q', 'r', 's', 't', 'u'],
    #  ['v', 'w', 'x', 'y', 'z']]
    
    chunks = list(map("".join, split_into(a, sizes)))
    # ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz']
    
    # Pad last item with whitespace.
    chunks[-1] = chunks[-1].ljust(sizes[-1], " ")
    # ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
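
As a side note, the sizes list could also be computed in closed form instead of summing in a loop. The sketch below assumes Python 3.8+ for math.isqrt, and chunk_sizes is a hypothetical helper name, not part of more_itertools. It relies on n * (n + 1) / 2 <= L being equivalent to (2n + 1)^2 <= 8L + 1.

```python
import math

def chunk_sizes(length):
    """Return [1, 2, ..., n] for the smallest n with n * (n + 1) // 2 >= length."""
    # Largest n whose triangular number n * (n + 1) // 2 does not exceed length.
    n = (math.isqrt(8 * length + 1) - 1) // 2
    # One more, partially filled, chunk is needed if characters are left over.
    if n * (n + 1) // 2 < length:
        n += 1
    return list(range(1, n + 1))

print(chunk_sizes(26))  # [1, 2, 3, 4, 5, 6, 7]
print(chunk_sizes(21))  # [1, 2, 3, 4, 5, 6]
```

The result can be passed as the sizes argument of split_into in place of the loop-built list.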
    
        5
  •  2
  •   Chris Charley    4 years ago

    A solution using accumulate from itertools:

    >>> from itertools import accumulate
    >>> from string import ascii_lowercase
    
    >>> s = ascii_lowercase
    >>> n = 0
    >>> accum = 0
    >>> while accum < len(s):
    ...     n += 1
    ...     accum += n
    ...
    >>> L = [s[j:i+j] for i, j in enumerate(accumulate(range(n)), 1)]
    
    >>> L[-1] += ' ' * (n-len(L[-1]))
    >>> L
    ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
    

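    The same list, without padding the last chunk, can also be built directly inside the while loop (s is the string from above):

    n = 0
    accum = 0
    L = []
    while accum < len(s):
        n += 1
        L.append(s[accum:accum+n])
        accum += n

    print(L)
    # ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz']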
    
        6
  •  1
  •   ppwater Azbilegt Chuluunbat    4 years ago

    Adding a little to U11 Forward's answer:

    a = 'abcdefghijklmnopqrstuvwxyz'
    l = list(range(len(a)))  # the numbers 0 .. len(a) - 1
    triangular = [sum(l[:i+2]) for i in l]  # running sums 0+1, 0+1+2, 0+1+2+3, ... (the triangular numbers)
    print([a[x: y].ljust(i, ' ') for i, (x, y) in enumerate(zip([0] + triangular, triangular), 1) if a[x: y]])
    

    ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
    

    Find the triangular numbers, then use a list comprehension and pad with spaces if a chunk's length is not right.
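
For comparison, the slice boundaries can also be taken from the closed-form triangular number i * (i + 1) // 2, which avoids re-summing a list slice for every chunk. This is a sketch only; the function name increasing_chunks and the fill parameter are assumptions made for this example.

```python
def increasing_chunks(text, fill=' '):
    """Split text into chunks of length 1, 2, 3, ...; pad the last chunk with fill."""
    chunks = []
    i = 1
    # Chunk i spans the indices from T(i-1) up to T(i), where T(k) = k * (k + 1) // 2.
    while (i - 1) * i // 2 < len(text):
        start = (i - 1) * i // 2
        stop = i * (i + 1) // 2
        chunks.append(text[start:stop].ljust(i, fill))
        i += 1
    return chunks

print(increasing_chunks('abcdefghijklmnopqrstuvwxyz'))
# ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
```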

        7
  •  1
  •   Balaji    4 years ago
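    a = 'abcdefghijklmnopqrstuvwxyz'
    inc = 0  # start index of the current chunk
    output = []
    for i in range(len(a)):
        if inc >= len(a):
            break
        # Chunk number i (0-based) holds i + 1 characters;
        # store it before moving inc on to the next chunk
        output.append(a[inc: inc+i+1])
        inc = inc+i+1

    print(output)


    ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz']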