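You can try using df.apply with a set intersection to find the words that appear in both the text column and your word list.
You will need to decide what should happen when a row of the text column contains more than one of the words.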
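import pandas as pd

def word_finder(x):
    # Split the text on spaces and keep the unique words
    df_words = set(x.split(' '))
    # Keep only the words that also appear in word_set
    extract_words = word_set.intersection(df_words)
    # Sets are unordered, so the order of the joined words is not guaranteed
    return ', '.join(extract_words)

df = pd.DataFrame(data = {'text' : ['hello you person', 'you have a dog', 'the bird flew', 'the horse is here', 'bird bird bird', 'dog and cat']})
word_set = {'dog', 'cat', 'horse', 'bird'}
df['extract'] = df.text.apply(word_finder)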
Output:
                text   extract
0   hello you person
1     you have a dog       dog
2      the bird flew      bird
3  the horse is here     horse
4     bird bird bird      bird
5        dog and cat  dog, cat
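
On the caveat about rows with several matching words: a set has no defined order, so a row like "dog and cat" may come back as "cat, dog" instead. Splitting on spaces also misses words with attached punctuation or different capitalization. The sketch below is one possible variant, not part of the original answer. It assumes plain English text, and the name word_finder_ordered and its regex tokenizer are hypothetical choices made for illustration.

```python
import re
import pandas as pd

word_set = {'dog', 'cat', 'horse', 'bird'}

def word_finder_ordered(text, words=word_set):
    """Return matching words in order of first appearance, without duplicates."""
    # Assumption: lowercase ASCII letters and apostrophes are enough to tokenize the text
    tokens = re.findall(r"[a-z']+", text.lower())
    found = []
    for token in tokens:
        # Keep a word only the first time it is seen
        if token in words and token not in found:
            found.append(token)
    return ', '.join(found)

df = pd.DataFrame({'text': ['Dog and cat.', 'bird bird bird',
                            'hello you person', 'a cat, then a dog']})
df['extract'] = df['text'].apply(word_finder_ordered)
print(df)
```

If you would rather keep every match as a separate value, returning the list itself instead of a joined string is another option.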