How do I parse an XML feed with feedparser in Python?

feedparser rss xml python

kevinabraham · Tech Community · 7 years ago

My function returns None. I'm not sure what I'm missing. Here is my code:

import feedparser

def rss(self):
    rss = 'https://news.google.com/news?q=fashion&output=rss'
    feed = feedparser.parse(rss)
    for key in feed.entries: 
        return key.title

print(key) shows None, and print(len(feed.entries)) shows no entries either.

print(feed)
{'feed': {}, 'entries': [], 'bozo': 1, 'bozo_exception': URLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)'),)}

print(feedparser)
<module 'feedparser' from '/Users/User_name/python-projects/my_env/lib/python3.6/site-packages/feedparser.py'>
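The printed result already points at the cause: bozo is 1 and bozo_exception holds a URLError wrapping an SSL error, so the download failed before anything was parsed. A minimal sketch of a check that surfaces this is below. The helper name explain_parse_result is hypothetical and not part of feedparser; it relies only on the keys visible in the output above.

```python
from urllib.error import URLError


def explain_parse_result(feed):
    """Return a short status string for a feedparser result.

    `feed` is anything dict-like with the keys shown in the output above
    ('bozo', 'bozo_exception', 'entries'). Hypothetical helper, not a
    feedparser API.
    """
    entries = feed.get("entries", [])
    if not feed.get("bozo"):
        return "ok: {} entries".format(len(entries))
    error = feed.get("bozo_exception")
    if isinstance(error, URLError):
        # The request itself failed (here: SSL verification), so nothing was parsed.
        return "download failed: {}".format(error.reason)
    # Any other exception means the document was fetched but is not well-formed.
    return "feed is malformed: {!r} ({} entries recovered)".format(error, len(entries))
```

Calling explain_parse_result(feedparser.parse(rss)) before looping over feed.entries would make an empty result easier to tell apart from a failed download.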

2 replies | up to 7 years ago

kevinabraham 7 years ago

It turned out the problem was actually the SSL handshake. I fixed it by adding ssl._create_default_https_context = ssl._create_unverified_context.

import feedparser
import ssl
if hasattr(ssl, '_create_unverified_context'):
    ssl._create_default_https_context = ssl._create_unverified_context
rss = 'https://news.google.com/news?q=fashion&output=rss'
feed = feedparser.parse(rss)

print(feed)
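Pointing ssl._create_default_https_context at ssl._create_unverified_context turns off certificate verification for HTTPS requests made through urllib in that process. A sketch of an alternative that keeps verification on is to download the feed yourself with a verified context and hand the bytes to feedparser. The function names and the cafile parameter here are assumptions for illustration, not part of the original answer.

```python
import ssl
import urllib.request


def make_verified_context(cafile=None):
    """Return an SSL context that still verifies certificates.

    cafile is an optional path to a CA bundle (an assumption of this sketch);
    when it is None, the system's default trust store is used.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch_feed(url, cafile=None, timeout=10):
    """Download the feed over verified HTTPS and parse the raw bytes."""
    import feedparser  # third-party: pip install feedparser

    context = make_verified_context(cafile)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        data = resp.read()
    return feedparser.parse(data)
```

If the default trust store is empty, which can happen on some macOS Python installs, passing a CA bundle path may help, for example cafile=certifi.where() assuming the certifi package is installed.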

Ged Flod 3 years ago

Try the basic code below. It works fine for me, and it gave me 10 entries in the feed when I ran it.

Install feedparser from pip:

pip install feedparser

import urllib.request  # Python 3 replacement for urllib2
import feedparser

url = "https://news.google.com/news?q=fashion&output=rss"
# Download the raw XML first, then hand the bytes to feedparser.
response = urllib.request.urlopen(url).read()

print(response)

d = feedparser.parse(response)
print(len(d.entries))
for item in d.entries:
    print("------")
    print(item.title)
    if 'subtitle' in item:
        print(item.subtitle)
    print(item.link)
    print(item.description)
    print(item.published)
    print(item.id)
    print(item.updated)
    if 'content' in item:
        print(item.content)

Alternatively, paste the complete code you are running and I'll take a look.
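Not every feed supplies every field printed above, and attribute access on a missing field such as item.updated raises AttributeError. A small sketch of a more defensive approach, using the dict-style get that feedparser entries support, is below. The helper name entry_fields and its default field list are hypothetical.

```python
def entry_fields(item, names=("title", "link", "published", "updated", "id")):
    """Collect the requested fields from one entry, skipping missing ones.

    `item` is a feedparser entry or any other dict-like object.
    """
    found = {}
    for name in names:
        value = item.get(name)
        if value is not None:
            found[name] = value
    return found
```

Inside the loop, print(entry_fields(item)) would then show only the fields a given entry actually carries.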
