
How to split a string into chunks where each chunk has a different length (Python)

chunks split list string python

U13-Forward · 4 years ago

Suppose I have this string:

a = 'abcdefghijklmnopqrstuvwxyz'

I want to split this string into chunks of increasing length, like this:

['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']

So far, I have tried the following code:

print([a[i: i + i + 1] for i in range(len(a))])

But it outputs:

['a', 'bc', 'cde', 'defg', 'efghi', 'fghijk', 'ghijklm', 'hijklmno', 'ijklmnopq', 'jklmnopqrs', 'klmnopqrstu', 'lmnopqrstuvw', 'mnopqrstuvwxy', 'nopqrstuvwxyz', 'opqrstuvwxyz', 'pqrstuvwxyz', 'qrstuvwxyz', 'rstuvwxyz', 'stuvwxyz', 'tuvwxyz', 'uvwxyz', 'vwxyz', 'wxyz', 'xyz', 'yz', 'z']

This is not the result I want.
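One way to see why the attempt fails: the slice a[i: i + i + 1] has the right length (i + 1), but it starts at index i, whereas each chunk should start where the previous one ended. A small sketch comparing the two sets of start offsets follows; the names attempted_starts and wanted_starts are made up for illustration.

```python
a = 'abcdefghijklmnopqrstuvwxyz'

# Start offsets used by the attempt: chunk i starts at index i.
attempted_starts = list(range(7))

# Start offsets that are actually needed: each chunk starts where the
# previous one ended, i.e. at 0, 0+1, 0+1+2, ...
wanted_starts = []
start = 0
for length in range(1, 8):
    wanted_starts.append(start)
    start += length

print(attempted_starts)  # [0, 1, 2, 3, 4, 5, 6]
print(wanted_starts)     # [0, 1, 3, 6, 10, 15, 21]
```

Only the first two offsets agree, which is why only 'a' and 'bc' come out right in the output above.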


cs95 abhishek58g 4 years ago

I don't think any one-liner or for-loop would look elegant, so let's use a generator:

from itertools import islice, count

def get_increasing_chunks(s):
    it = iter(s)
    c = count(1)

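    # The first chunk has length 1; the '' default guards against empty input.
    nxt, c_ = next(it, ''), next(c)
    while nxt:
        # Pad to the target length (only the last chunk can be short).
        yield nxt.ljust(c_)
        # c_ is the size just emitted, so the next chunk takes c_ + 1 characters.
        nxt, c_ = ''.join(islice(it, c_ + 1)), next(c)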

[*get_increasing_chunks(a)]
# ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']

U13-Forward 4 years ago

Thanks to @Prune's comment, I finally figured out a way to solve this:

a = 'abcdefghijklmnopqrstuvwxyz'
lst = []
c = 0
for i in range(1, len(a) + 1):
    c += i
    lst.append(c)
print([a[x: y] + ' ' * (i - len(a[x: y])) for i, (x, y) in enumerate(zip([0] + lst, lst), 1) if a[x: y]])

Output:

['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']

I compute the triangular numbers, slice between them in a list comprehension, and pad with spaces if a chunk is too short.
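The boundaries used above are the triangular numbers T(k) = k(k + 1) / 2, so chunk k (counting from 1) covers a[T(k - 1):T(k)]. Below is a minimal sketch of the same idea using the closed form instead of a running sum. The helper names triangular and triangular_chunks are hypothetical, and padding the last chunk with spaces is assumed to be wanted, as in the question.

```python
def triangular(k):
    """Return the k-th triangular number: 0, 1, 3, 6, 10, ..."""
    return k * (k + 1) // 2


def triangular_chunks(s, fill=' '):
    """Split s into chunks of length 1, 2, 3, ...; pad the last chunk."""
    chunks = []
    k = 1
    # Chunk k starts at triangular(k - 1); stop once that is past the end.
    while triangular(k - 1) < len(s):
        chunk = s[triangular(k - 1):triangular(k)]
        chunks.append(chunk.ljust(k, fill))
        k += 1
    return chunks


print(triangular_chunks('abcdefghijklmnopqrstuvwxyz'))
# ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']
```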

Arturo 4 years ago

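So what you need is one number that controls how many characters to grab (here, the iteration count), a second number that remembers the last index, and a final number that tells you where to stop.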

my_str = "abcdefghijklmnopqrstuvwxyz"
last_index = 0
index = 1
iter_count = 1

while True:
    sub_string = my_str[last_index:index]
    print(sub_string)
    last_index = index
    iter_count += 1
    index = index + iter_count
    if last_index >= len(my_str):  # >= avoids an empty chunk when the string ends exactly on a boundary
        break

Note that you don't need a while loop. I was just feeling lazy.
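One possible for-loop version is sketched below. It assumes the chunks should be collected in a list rather than printed; itertools.count supplies the growing chunk length, and the variable names mirror the ones above.

```python
from itertools import count

my_str = "abcdefghijklmnopqrstuvwxyz"
chunks = []
last_index = 0

for size in count(1):              # size = 1, 2, 3, ...
    if last_index >= len(my_str):  # nothing left to slice
        break
    chunks.append(my_str[last_index:last_index + size])
    last_index += size

print(chunks)
# ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz']
```

The last chunk is left unpadded here; chunks[-1].ljust(len(chunks)) would pad it with spaces to its full length.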

jkr 4 years ago

It looks like the split_into recipe in more_itertools can help here. It may not be as elegant as the answer by @cs95, but perhaps this will help others discover the itertools module.

>>> list(split_into([1,2,3,4,5,6], [1,2,3]))
[[1], [2, 3], [4, 5, 6]]

In our case, the sizes need to be [1, 2, 3, 4, 5, 6, 7].

import itertools

def split_into(iterable, sizes):
    it = iter(iterable)
    for size in sizes:
        if size is None:
            yield list(it)
            return
        else:
            yield list(itertools.islice(it, size))

a = 'abcdefghijklmnopqrstuvwxyz'

sizes = [1]
while sum(sizes) < len(a):  # stop as soon as the sizes cover the whole string
    next_value = sizes[-1] + 1
    sizes.append(next_value)
# sizes = [1, 2, 3, 4, 5, 6, 7]

list(split_into(a, sizes))

# [['a'],
#  ['b', 'c'],
#  ['d', 'e', 'f'],
#  ['g', 'h', 'i', 'j'],
#  ['k', 'l', 'm', 'n', 'o'],
#  ['p', 'q', 'r', 's', 't', 'u'],
#  ['v', 'w', 'x', 'y', 'z']]

chunks = list(map("".join, split_into(a, sizes)))
# ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz']

# Pad last item with whitespace.
chunks[-1] = chunks[-1].ljust(sizes[-1], " ")
# ['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']

Chris Charley 4 years ago

Using accumulate from itertools:

>>> from itertools import accumulate
>>> from string import ascii_lowercase

>>> s = ascii_lowercase
>>> n = 0
>>> accum = 0
>>> while accum < len(s):
...     n += 1
...     accum += n
...

>>> L = [s[j:i+j] for i, j in enumerate(accumulate(range(n)), 1)]

>>> L[-1] += ' ' * (n-len(L[-1]))
>>> L
['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']

A simpler version without accumulate, which leaves the last chunk unpadded:

n = 0
accum = 0
L = []
while accum < len(s):
    n += 1
    L.append(s[accum:accum+n])
    accum += n

['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz']
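The while loops above find n, the number of chunks, by adding 1 + 2 + 3 + ... until the sum reaches len(s). As a rough sketch, the same n can be computed directly from the triangular-number formula n(n + 1) / 2. The function name chunk_count is hypothetical, and math.isqrt assumes Python 3.8 or later.

```python
from math import isqrt


def chunk_count(length):
    """Smallest n such that 1 + 2 + ... + n >= length."""
    # Largest n with n * (n + 1) // 2 <= length.
    n = (isqrt(8 * length + 1) - 1) // 2
    # One more (partial) chunk is needed if the full chunks fall short.
    if n * (n + 1) // 2 < length:
        n += 1
    return n


print(chunk_count(26))  # 7
print(chunk_count(21))  # 6
```

For a 26-character string this gives 7, matching the n that the loop computes.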

ppwater Azbilegt Chuluunbat 4 years ago

Adding a little to U11 Forward's answer:

a = 'abcdefghijklmnopqrstuvwxyz'
l = list(range(len(a)))  # the numbers 0 to len(a) - 1
triangular = [sum(l[:i+2]) for i in l]  # running sums 0+1, 0+1+2, 0+1+2+3, ... = 1, 3, 6, ...
print([a[x: y].ljust(i, ' ') for i, (x, y) in enumerate(zip([0] + triangular, triangular), 1) if a[x: y]])

['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz  ']

Find the triangular numbers, slice between them in a list comprehension, and pad with spaces if a chunk is too short.

Balaji 4 years ago

a = 'abcdefghijklmnopqrstuvwxyz'
inc = 0
output = []
for i in range(len(a)):
    if inc >= len(a):  # the whole string has been consumed
        break
    # Chunk i has length i + 1; store it before advancing the offset.
    output.append(a[inc: inc + i + 1])
    inc = inc + i + 1

print(output)

['a', 'bc', 'def', 'ghij', 'klmno', 'pqrstu', 'vwxyz']