How to use find() on list elements in Python/BeautifulSoup: I get a NoneType error

  • Jonathan Scialpi  · 10 years ago

    OK, this code works:

    from bs4 import BeautifulSoup
    import urllib
    import re
    
    htmlfile = urllib.urlopen(MY SITE URL SITS HERE)
    soup = BeautifulSoup(htmlfile.read())
    
    title = soup.find('p', {'class': 'deal-title should-truncate'}).getText()  
    print "Title: " + str(title)
    

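    But the code above only gives me the first result. I want the find to run over every match on the page. To do that, I tried using a list comprehension to collect every occurrence of the figure tag (because this paragraph tag always sits inside a figure tag). That way I can focus only on what is inside each figure. However, when I try the following: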

    from bs4 import BeautifulSoup
    import urllib
    import re
    
    htmlfile = urllib.urlopen(MY WEBSITE URL SITS HERE)
    soup = BeautifulSoup(htmlfile.read())
    
    deals = [figure for figure in soup.findAll('figure')]
    
    for i in deals:
        title = i.find('p', {'class': 'deal-title should-truncate'}).getText()  
        print "Title: " + str(title)
    

    I get this error:

    Traceback (most recent call last):
      File "C:\Python27\blah.py", line 11, in <module>
        title = i.find('p', {'class': 'deal-title should-truncate'}).getText()
    AttributeError: 'NoneType' object has no attribute 'getText'

    Now I am trying:

    from bs4 import BeautifulSoup
    import urllib
    import re
    
    htmlfile = urllib.urlopen(MY SITE SITS HERE)
    soup = BeautifulSoup(htmlfile.read())
    
    deals = soup.findAll('figure')
    
    for i in deals:
        title = i.find('p', {'class': 'deal-title should-truncate'})
        if (title == None):
            title = "NONE"
        else:
            title = title.getText()
        print "Title: " + str(title)
    

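    Now the error is:

    Traceback (most recent call last):
      File "C:\Python27\blah.py", line 16, in <module>
        print "Title: " + str(title)
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 12: ordinal not in range(128)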
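
A note on the UnicodeEncodeError above: in Python 2, calling str() on a unicode string implicitly encodes it as ASCII, which fails as soon as the text contains a character such as an en dash (U+2013). The sketch below reproduces that failure with an explicit ASCII encode so it also runs on Python 3. The title text and the helper name `to_ascii` are hypothetical, chosen only so that the dash lands at position 12 as in the traceback; the real page content is not shown in the post.

```python
# Hypothetical title text; the real page content is not shown in the post.
title = u"Spa Package \u2013 50% off"


def to_ascii(text):
    """Encode text as ASCII, the step Python 2's str() performs implicitly."""
    return text.encode("ascii")


try:
    to_ascii(title)
    failed_at = None
except UnicodeEncodeError as err:
    # err.start is the index of the first character ASCII cannot represent
    failed_at = err.start

# Encoding explicitly with UTF-8 succeeds, because UTF-8 covers U+2013
encoded = title.encode("utf-8")
print(failed_at, encoded)
```

Either skipping the str() call or encoding explicitly (for example with UTF-8) avoids the implicit ASCII step.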

    1 answer
  • Jonathan Scialpi  · 10 years ago

    Final answer, with a special shout-out to Blackjack for the help:

    from bs4 import BeautifulSoup
    import urllib
    import re
    
    htmlfile = urllib.urlopen(MY SITE SITS HERE)
    soup = BeautifulSoup(htmlfile.read())
    
    deals = soup.findAll('figure')
    
    for i in deals:
        title = i.find('p', {'class': 'deal-title should-truncate'})
        if (title == None):
            title = "NONE"
        else:
            title = title.getText()
        print "Title: " + title
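
For readers on Python 3, the same idea can be sketched as follows. This is not the author's code: the inline `SAMPLE_HTML` markup and the function name `extract_titles` are assumptions made so the example is self-contained, since the real page is not shown. It uses `find_all` and `get_text`, the current bs4 spellings of `findAll` and `getText`. To fetch a live page on Python 3, `urllib.urlopen` would become `urllib.request.urlopen`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page, which the post does not show.
SAMPLE_HTML = """
<figure><p class="deal-title should-truncate">Spa Package</p></figure>
<figure><img src="banner.png"></figure>
<figure><p class="deal-title should-truncate">Pizza for Two</p></figure>
"""


def extract_titles(html):
    """Return one title per <figure>, or "NONE" when the figure has no title paragraph."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for figure in soup.find_all("figure"):
        paragraph = figure.find("p", {"class": "deal-title should-truncate"})
        # find() returns None when nothing matches, so check before calling get_text()
        titles.append("NONE" if paragraph is None else paragraph.get_text())
    return titles


for title in extract_titles(SAMPLE_HTML):
    print("Title: " + title)
```

The second figure has no matching paragraph, which is the case that triggered the original AttributeError; here it produces the "NONE" placeholder instead.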