我们有一个过程,在这个过程中,XML通过ESMTP在电子邮件正文中传输给我们。电子邮件正文的字符集指定为
ISO-859-1
, and no encoding is specified for the XML. 根据协议,默认值为
UTF-8
.
问题是,我们的XML解析器在遇到_?字符时抛出异常,因为它认为它在解析
UTF-8
以及
UTF-8
是2字节,而不是1
ISO-859-1
.
-
我们应该假设尸体是
ISO-859-1
从而重写XML编码(
UTF-8
)?
-
更主观地说,电子邮件是否发送错误,我们是否更好地解释为
UTF-8
在我们这边,还是问谁在正确和一致地指定编码?
下面是一个包含XML的电子邮件正文示例:
Delivered-To: ...
Received: ...
Received: ...
Return-Path: ...
Received: ...
Received-SPF: ...
Authentication-Results: ...
Received: ...
Thread-Topic: ...
From: ...
To: ...
Subject: ...
Date: ...
Message-ID: ...
MIME-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Mailer: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.4325
<?xml version="1.0"?>
...
<comments>Super Widget®</comments>
...