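I'm trying to store HTML fetched with urllib in a Postgres database using SQLAlchemy, but when I insert and then retrieve the HTML, something seems to get garbled along the way.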
Using:
SQLAlchemy 1.2、Python 3.6、postgres 10、urllib
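from urllib import request
from sqlalchemy import Column, Integer, Text, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class ParksTxState(Base):
    __tablename__ = 'parks_tx_state'
    id = Column(Integer, primary_key=True)
    park_name = Column(Text)
    url = Column(Text)
    html = Column(Text)
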
engine = create_engine("postgresql://<user>:<pass>@localhost/<db>", echo=False)
Session = sessionmaker(bind=engine)
session = Session()
url = 'https://tpwd.texas.gov/state-parks/abilene'
html = request.urlopen(url).read()
print(html)
# b'<!DOCTYPE html>\n<html xmlns="http://www.w3.org/1999/xhtml">\n<head>\n...
# so far so good...
newpark = ParksTxState()
newpark.html = html
print(newpark.html)
# b'<!DOCTYPE html>\n<html xmlns="http://www.w3.org/1999/xhtml">\n<head>\n...
# so we're still good here before committing....
session.add(newpark)
session.commit()
print(newpark.html)
# \x3c21444f43545950452068746d6c3e0a3...
# and here is where the garbage comes in.
For some reason, the HTML ends up stored as one long string of hex digits:
\x3c21444f43545950452068746d6c3e0a3c68746d6c20786d6c6e733d22687474703a2f2f7777772e7...
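As a sanity check, the stored value looks like it might simply be the hex encoding of the original bytes. The sketch below is only an illustration under that assumption: `decode_pg_hex` is a hypothetical helper name (not part of SQLAlchemy or urllib), and it only uses the first 16 bytes of the value shown above, since the rest is truncated.

```python
def decode_pg_hex(value: str) -> bytes:
    # Hypothetical helper: strip a leading backslash-x prefix, if present,
    # and interpret the remaining characters as hex digits.
    if value.startswith("\\x"):
        value = value[2:]
    return bytes.fromhex(value)


# First 16 bytes of the stored value shown above (the full value is truncated).
stored_prefix = r"\x3c21444f43545950452068746d6c3e0a"

recovered = decode_pg_hex(stored_prefix)
print(recovered)                  # raw bytes recovered from the hex digits
print(recovered.decode("utf-8"))  # the same bytes decoded as text
```

If that reading is right, the data is not lost, just hex-encoded, which would suggest the bytes object returned by read() is being sent to the Text column as binary rather than as a string. Decoding it first (for example html.decode('utf-8'), assuming the page is UTF-8) might avoid this, though I have not confirmed it.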
With echo=True, I can see that the INSERT statement is correct.
What am I doing wrong?