The re module does not support repeated captures, so
'(\t(.*)?\t(.*)?\n)*'
keeps only the last capture of each group.
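A minimal sketch of that behaviour, assuming a made-up three-line sample (the `sample` string and the variable names are illustrative only and do not come from the original data):

```python
import re

# Hypothetical sample: three tab-indented "key<TAB>value" lines.
sample = "\ta\t1\n\tb\t2\n\tc\t3\n"

# The repeated outer group matches all three lines, but each group
# is overwritten on every repetition, so only the last line survives.
repeated = re.match(r'(\t(.*)?\t(.*)?\n)*', sample)
last_only = repeated.groups()

# Matching a single line at a time and letting findall iterate
# returns every key/value pair instead.
all_pairs = re.findall(r'\t(\w+)\t(\w+)\n', sample)

print(last_only)   # ('\tc\t3\n', 'c', '3')
print(all_pairs)   # [('a', '1'), ('b', '2'), ('c', '3')]
```

This is why iterating over matches of a single-line pattern tends to be the more practical route with re.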
'\t(\w+)\s+([^\n]*)\n'
So, given the structure of the data, one possible solution is to build a regular expression that matches any one of the patterns:
import re

# TEST_STR is the configuration text being parsed.
regex = r'define\s+(\w+)\s+\{\n|\t(\w+)\s+([^\n]*)\n|\t\}'
matches = re.finditer(regex, TEST_STR, re.DOTALL)

for match in matches:
    # Only the groups of the alternative that matched are set; skip the rest.
    for group_num, value in enumerate(match.groups(), start=1):
        if value:
            print("Group {}: {}".format(group_num, value))
This prints:
Group 1: host
Group 2: address
Group 3: 123.123.123.123
Group 2: passive_checks_enabled
Group 3: 1
Group 1: service
Group 2: service_description
Group 3: Crondaemon
Group 2: check_command
Group 3: check_nrpe_1arg!check_crondaemon
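
If the goal is to keep each block's attributes together rather than print flat groups, the same regex can drive a small collector. The following is only a sketch: `SAMPLE` is an assumed input reconstructed from the output above (the real TEST_STR may differ), and `parse_blocks` is a hypothetical helper name.

```python
import re

# Assumed input, reconstructed from the printed output; not the original data.
SAMPLE = (
    "define host {\n"
    "\taddress\t\t123.123.123.123\n"
    "\tpassive_checks_enabled\t1\n"
    "\t}\n"
    "\n"
    "define service {\n"
    "\tservice_description\tCrondaemon\n"
    "\tcheck_command\tcheck_nrpe_1arg!check_crondaemon\n"
    "\t}\n"
)

REGEX = r'define\s+(\w+)\s+\{\n|\t(\w+)\s+([^\n]*)\n|\t\}'


def parse_blocks(text):
    """Collect each define block as a (type, {key: value}) pair."""
    blocks = []
    for match in re.finditer(REGEX, text):
        block_type, key, value = match.groups()
        if block_type:
            # "define <type> {" opens a new block.
            blocks.append((block_type, {}))
        elif key and blocks:
            # A "<key> <value>" line belongs to the most recent block.
            blocks[-1][1][key] = value
        # A bare closing "\t}" sets no groups and needs no action.
    return blocks


blocks = parse_blocks(SAMPLE)
for block_type, attrs in blocks:
    print(block_type, attrs)
```

Relying on match order keeps the regex unchanged; the trade-off is that a malformed file (for example an attribute line before any define) is silently skipped rather than reported.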