
Error when decoding a UTF-16 string

utf decode python-3.x python

Cristian · Tech Community · 9 years ago

I am using Python 3.3. I have been trying to decode a byte string that looks like this:

b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\xed:\xf9w\xdaH\xd2?\xcf\xbc....

and so on. But whenever I try to use str.decode('utf-16') I get this error message:

'utf16' codec can't decode bytes in position 54-55: illegal UTF-16 surrogate

I'm not sure how to decode this string.

1 reply | 9 years ago

unutbu · 9 years ago

Gzipped data begins with \x1f\x8b\x08, so I'm guessing your data is gzipped. Try gunzipping the data before decoding it.

import gzip
import io

# `buf` is truncated here, so reading it raises an error (EOFError on current
# Python 3; older versions may raise IOError). It may work with the complete data.
buf = b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\xed:\xf9w\xdaH\xd2?\xcf\xbc'
with gzip.GzipFile(fileobj=io.BytesIO(buf)) as f:
    content = f.read()  # decompressed bytes, still UTF-16 encoded
    print(content.decode('utf-16'))
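
Since the truncated `buf` above cannot be decompressed, here is a minimal self-contained sketch of the same idea that does run. It assumes the real payload is gzipped UTF-16 text, which the question does not confirm. The helper name `decode_maybe_gzipped` and the sample payload are made up for illustration; `gzip.compress` and `gzip.decompress` are standard-library functions available since Python 3.2.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_maybe_gzipped(data, encoding="utf-16"):
    """Gunzip `data` if it starts with the gzip magic number, then decode it.

    Hypothetical helper: the encoding is an assumption about the payload.
    """
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    return data.decode(encoding)


# Build a complete gzipped UTF-16 payload to stand in for the real data.
payload = gzip.compress("hello, world".encode("utf-16"))
print(decode_maybe_gzipped(payload))
```

Checking the magic bytes first means the same helper also accepts data that was never compressed, which is handy when you are not sure what the server sent.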
