
Find the 5 words upstream of a matching word in Python

regex python

user1631306 · 6 years ago

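For example, I have the string " This is the most Absurd rat in the history".

I want to search for "rat" and then get the 4 words upstream of the matched word "rat".

I tried using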

re.search(r'\brat\b', " This is the most Absurd rat in the history")

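But it gives me character positions, such as span=(25, 28). How can I use that to get the words? If I knew the position of the matched word in the list of words, I could simply take the 4 words before it by index.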

3 answers

Ajax1234 · 6 years ago

You can use re.findall:

import re
s = "This is the most Absurd rat ever in the history"
print(re.findall(r'^[\w\W]+(?=\srat\b)', s)[0].split()[-4:])

['is', 'the', 'most', 'Absurd']
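
As a side note, the span that re.search reports in the question can also be used directly: slice the string up to the start of the match and split what is left. The following is only a minimal sketch of that idea; the helper name words_before_first and its parameter n are hypothetical, not part of the original answer.

```python
import re

def words_before_first(text, word, n=4):
    """Return up to n words preceding the first whole-word match of `word`.

    Hypothetical helper for illustration only.
    """
    match = re.search(r'\b' + re.escape(word) + r'\b', text)
    if match is None:
        return []  # the word does not occur as a whole word
    # match.start() is the character offset of the match (25 in the question).
    # Everything before it is split on whitespace and the last n words are kept.
    return text[:match.start()].split()[-n:]

print(words_before_first(" This is the most Absurd rat in the history", "rat"))
# ['is', 'the', 'most', 'Absurd']
```

Unlike indexing the findall result with [0], this returns an empty list when the word is missing instead of raising IndexError.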

Edit 2:

If you are looking for the four words preceding every occurrence of "rat", you can use itertools.groupby:

import itertools
s = "Some words go here rat This is the most Absurd rat final case rat"
new_data = [[a, list(b)] for a, b in itertools.groupby(s.split(), key=lambda x:x.lower() == 'rat')]
if any(a for a, _ in new_data): #to ensure that "rat" does exist in the string
  results = [new_data[i][-1][-4:] for i in range(len(new_data)-1) if new_data[i+1][0]]
  print(results)

[['Some', 'words', 'go', 'here'], ['is', 'the', 'most', 'Absurd'], ['final', 'case']]
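
The question also mentions taking the 4 preceding words by index once the word position is known. A rough sketch of that index-based variant follows, assuming plain whitespace tokenization and a case-insensitive comparison; the function name words_before_each is made up for this example. Note that it behaves differently from the groupby version: the window is allowed to reach back past an earlier "rat", so the last result contains four words rather than two.

```python
def words_before_each(text, word, n=4):
    """For every token equal to `word` (case-insensitive), return the
    up-to-n tokens that precede it. Hypothetical helper for illustration."""
    tokens = text.split()
    target = word.lower()
    return [
        tokens[max(0, i - n):i]  # clamp at 0 so early matches do not wrap around
        for i, tok in enumerate(tokens)
        if tok.lower() == target
    ]

s = "Some words go here rat This is the most Absurd rat final case rat"
print(words_before_each(s, "rat"))
# [['Some', 'words', 'go', 'here'], ['is', 'the', 'most', 'Absurd'], ['Absurd', 'rat', 'final', 'case']]
```

Because tokens are compared as whole strings, a token with attached punctuation such as "rat," would not count as a match in this sketch.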

Eric Duminil · 6 years ago

(?:\S+\s){4}(?=rat\b) might be close to what you want:

>>> sentence = "This is the most Absurd rat in the history"
>>> import re
>>> re.findall(r'(?:\S+\s){4}(?=rat\b)', sentence, re.I)
['is the most Absurd ']
>>> re.findall(r'(?:\S+\s){4}(?=rat\b)', "I like Bratwurst", re.I)
[]
>>> re.findall(r'(?:\S+\s){4}(?=rat\b)', "A B C D rat D E F G H rat", re.I)
['A B C D ', 'E F G H ']
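
If the word and the number of preceding words vary, the same pattern can be built at run time. This is just a sketch under that assumption; the function name preceding_words is hypothetical, and re.escape is used so that a search word containing regex metacharacters is treated literally.

```python
import re

def preceding_words(text, word, n=4):
    """Return, for each match, the n words right before `word`.

    Builds the pattern (?:\\S+\\s){n}(?=word\\b) shown above.
    Hypothetical helper for illustration only.
    """
    pattern = r'(?:\S+\s){%d}(?=%s\b)' % (n, re.escape(word))
    # Each match is a string like 'A B C D '; split() drops the trailing space.
    return [m.split() for m in re.findall(pattern, text, flags=re.I)]

print(preceding_words("A B C D rat D E F G H rat", "rat"))
# [['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']]
```

As with the original pattern, a match needs exactly n words in front of it, so an occurrence preceded by fewer than n words is skipped.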


mVChr · 6 years ago

To handle every occurrence of rat, you can use findall:

import re
s = 'This is the most absurd rat ever in the history of rat kind I tell you this rat is ridiculous.'
# \s+ and the trailing \b keep matches on whole words, so 'rat' inside a longer word is ignored
answer = [sub.split() for sub in re.findall(r'((?:\S+\s+){4})rat\b', s)]
# [['is', 'the', 'most', 'absurd'],
#  ['in', 'the', 'history', 'of'],
#  ['I', 'tell', 'you', 'this']]

Previous answer:

You can use split:

import re
s = 'This is the most Absurd rat ever in the history'
answer = re.split(r'\brat\b', s, maxsplit=1)[0].split()[-4:]
# => ['is', 'the', 'most', 'Absurd']

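If you want the words after rat instead, change [0] to [1] and [-4:] to [:4]. You will also need to add some code to check that rat actually occurs in the string.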
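
A minimal sketch of that downstream variant, including the existence check, might look like the following. The function name words_after_first is hypothetical and the check is one possible way to do it: when the word is absent, re.split returns a single-element list, so the length of the result tells us whether a match was found.

```python
import re

def words_after_first(text, word, n=4):
    """Return up to n words following the first whole-word match of `word`.

    Hypothetical helper for illustration only.
    """
    parts = re.split(r'\b' + re.escape(word) + r'\b', text, maxsplit=1)
    if len(parts) < 2:
        return []  # no split happened, so the word is not in the string
    # parts[1] is everything after the first match; keep its first n words.
    return parts[1].split()[:n]

print(words_after_first('This is the most Absurd rat ever in the history', 'rat'))
# ['ever', 'in', 'the', 'history']
```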
