One solution is to take the first paragraph as text, replace ( and ) with <bracket> and </bracket>, and parse it again. For example:
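import requests
from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/Epistemology"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# Serialize the first paragraph that contains a link, turning each
# parenthesis into an opening or closing <bracket> tag.
txt = (
    str(soup.select_one(".mw-parser-output p:has(a)"))
    .replace("(", "<bracket>")
    .replace(")", "</bracket>")
)

# Reparse so that parenthesized text becomes the content of <bracket> elements.
soup = BeautifulSoup(txt, "html.parser")

# First link that has no <bracket> ancestor, i.e. lies outside parentheses.
a = soup.find(lambda tag: tag.name == "a" and not tag.find_parent("bracket"))
print(a)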
Prints:
<a href="/wiki/Outline_of_philosophy" title="Outline of philosophy">branch of philosophy</a>
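To try the same idea without a network request, here is a minimal sketch that wraps the replace-and-reparse step in a function and runs it on a small inline snippet. The function name first_link_outside_parens and the sample HTML are made up for illustration; only the bracket trick itself comes from the answer above.

```python
from bs4 import BeautifulSoup


def first_link_outside_parens(paragraph_html):
    """Return the first <a> tag that is not inside parentheses, or None."""
    # Turn each "(" / ")" into a made-up <bracket> element so that the
    # parenthesized text becomes a nested element after reparsing.
    # Caveat: this is a plain string replace, so parentheses inside
    # attribute values (for example an href) are rewritten as well.
    marked = paragraph_html.replace("(", "<bracket>").replace(")", "</bracket>")
    soup = BeautifulSoup(marked, "html.parser")
    return soup.find(
        lambda tag: tag.name == "a" and not tag.find_parent("bracket")
    )


# Hypothetical sample paragraph: one link inside parentheses, one outside.
sample = (
    '<p>Epistemology (from <a href="/wiki/Greek">Greek</a>) is a '
    '<a href="/wiki/Philosophy">branch of philosophy</a>.</p>'
)
link = first_link_outside_parens(sample)
print(link["href"])
```

Because <bracket> elements nest like any other tag, nested parentheses are handled too. If the links you care about can have parentheses in their href, it is probably safer to read the href from the original soup rather than from the reparsed one.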