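The problem is that the line read from the file actually contains 12 characters: \, x, e, d, \, x, b, a, \, x, b and d, and you want to convert them into the 3 characters '\xed', '\xba' and '\xbd'. A regular expression can help identify the escaped characters that start with \x: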
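import re

def unescape(string):
    # Match a literal backslash, 'x', and exactly two hex digits.
    rx = re.compile(r'(\\x((?:[0-9a-fA-F]){2}))')
    while True:
        m = rx.search(string)
        if m is None:
            return string
        # Replace the escape text with the character it denotes.
        string = string.replace(m.group(1), chr(int(m.group(2), 16)))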
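As a variation on the loop above, the same conversion can be sketched as a single pass with re.sub and a replacement function. This is only an illustrative alternative, not part of the original answer; the names unescape_once and _HEX_ESCAPE are hypothetical. One difference worth noting: because the loop rescans the string after every replacement, an input such as \x5cx41 is first turned into \x41 and then into A, whereas a single pass leaves it at \x41.

```python
import re

# Literal backslash, 'x', then exactly two hex digits (captured).
_HEX_ESCAPE = re.compile(r'\\x([0-9a-fA-F]{2})')

def unescape_once(text):
    # Hypothetical helper: replace each escape with the character it denotes,
    # in one left-to-right pass, without rescanning already replaced text.
    return _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

raw = r'\xed\xba\xbd'            # 12 characters, as they appear in the file
converted = unescape_once(raw)   # 3 characters
print(len(raw), len(converted))  # 12 3 under Python 3
```

Whether the rescanning behaviour matters depends on what the file may contain; for lines holding only plain \xNN sequences both versions should give the same result.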
You can use unescape to preprocess the lines read from the file (don't forget to import the re module):
v_conf_file = 'excl_char_seq.lst'
with open(v_conf_file) as fd:
    # Convert the literal \xNN text on each line into the actual characters.
    seqlist = [unescape(line.strip()) for line in fd]
line = 'wer\xed\xba\xbd66'
print([1 for seqs in seqlist if seqs in line])
When I inspect the contents of seqlist, I get the expected result:
>>> print seqlist
['\xed\xba\xbd', '\xed\xa9\x81', '\xed\xba\x91']
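The session above is Python 2, where a str is a byte string. Under Python 3, chr() yields Unicode code points rather than raw bytes, so if the goal is to compare against raw bytes, one possible approach is to do the same substitution on bytes objects. The following is a minimal sketch under that assumption; unescape_bytes and _HEX_ESCAPE_BYTES are hypothetical names, and the sample line is made up for illustration.

```python
import re

# Bytes pattern: literal backslash, 'x', then two hex digits (captured).
_HEX_ESCAPE_BYTES = re.compile(rb'\\x([0-9a-fA-F]{2})')

def unescape_bytes(raw):
    # Hypothetical helper: turn each literal \xNN sequence into the single byte NN.
    return _HEX_ESCAPE_BYTES.sub(lambda m: bytes([int(m.group(1), 16)]), raw)

seq = unescape_bytes(rb'\xed\xba\xbd')  # 12 bytes in, 3 bytes out
line = b'wer\xed\xba\xbd66'             # illustrative sample line
print(seq in line)                      # True
```

With this variant the file would be opened in binary mode (open(v_conf_file, 'rb')) so that each line is already a bytes object.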