I would not use regex for this at all: Python's re module does not handle overlapping ranges.
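To make the overlap problem concrete, here is a small sketch. The toy string and the one-word-of-context pattern are my own illustration, not part of the question. Because re.findall returns non-overlapping matches, a word already consumed as trailing context of one match is no longer available as leading context for the next one.

```python
import re

# Toy input (an assumption for illustration): the keyword appears three times,
# with a single word between occurrences.
sample = "outlook a outlook b outlook"

# Try to capture up to one word of context on each side of the keyword.
pattern = r"(?:\w+ )?outlook(?: \w+)?"

matches = re.findall(pattern, sample)
print(matches)
# The second match is missing its leading "a" and the third its leading "b",
# because those words were consumed by the previous match.
```

The windows you would actually want here are "outlook a", "a outlook b" and "b outlook", which is why slicing a list of words is easier.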
text = """2015 Outlook The Company is providing the following outlook for 2015 in lieu of formal financial guidance at this time. This outlook does not include the impact of any future acquisitions and transaction-related costs. Revenues - Based on the revenues from the fourth quarter of 2014, the addition of new items at our some facility and the previously opened acquisition of Important Place, the Company expects utilization of the current 100 items to remain in some average"""
lookfor = "outlook"
# split text at spaces
splitted = text.lower().split()
# get the position in splitted where the words match (remove .,-?! for comparison)
positions = [i for i,w in enumerate(splitted) if lookfor == w.strip(".,-?!")]
# printing here, you can put those slices in a list for later usage
for p in positions:  # positions is: [1, 8, 21]
    print( ' '.join(splitted[max(0,p-26):p+26]) )
    print()
Output:
2015 outlook the company is providing the following outlook for 2015 in lieu of formal financial guidance at this time. this outlook does not include the impact
2015 outlook the company is providing the following outlook for 2015 in lieu of formal financial guidance at this time. this outlook does not include the impact of any future acquisitions and transaction-related costs.
2015 outlook the company is providing the following outlook for 2015 in lieu of formal financial guidance at this time. this outlook does not include the impact of any future acquisitions and transaction-related costs. revenues - based on the revenues from the fourth quarter of 2014, the
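By iterating over the split words you get the positions of the matches and can then slice the split list around each one. Make sure the slice starts at 0 whenever p-26 is lower than zero, otherwise you will not get any output: a negative start index counts from the end of the list (a start of -4 means four items before the end), so the start would land after the stop and the slice would be empty. Note also that the stop index p+26 is exclusive, so each window holds up to 26 words before the match and 25 after it.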
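If you want to reuse this, the same idea can be wrapped in a function that returns the windows instead of printing them. This is only a sketch: the name context_windows and its parameters before, after and strip_chars are hypothetical, and the defaults simply mirror the slice used above.

```python
def context_windows(text, keyword, before=26, after=25, strip_chars=".,-?!"):
    """Return one string of surrounding words per occurrence of keyword.

    Hypothetical helper: the name and parameters are assumptions, not from
    the original answer. Matching is case-insensitive and ignores the
    punctuation in strip_chars at the edges of each word.
    """
    words = text.lower().split()
    keyword = keyword.lower()
    windows = []
    for i, word in enumerate(words):
        if word.strip(strip_chars) == keyword:
            # Clamp the start at 0; a negative start would count from the end.
            start = max(0, i - before)
            windows.append(" ".join(words[start:i + after + 1]))
    return windows


sample = "2015 Outlook The Company is providing the following outlook for 2015."
windows = context_windows(sample, "outlook", before=2, after=2)
for w in windows:
    print(w)

# What goes wrong without the clamp: for the match at index 1 the start
# would be -1, which points at the last word, so the slice comes back empty.
words = sample.lower().split()
unclamped = words[1 - 2:1 + 3]
print(unclamped)
```

Returning a list keeps the overlapping windows independent of each other, so you can filter or deduplicate them later.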