Re: [xml] Problem with character references in range � through  inclusive



On Thu, Mar 25, 2004 at 05:39:33PM -0600, David C. Hoos wrote:
I am having difficulty with the function xmlStringGetNodeList() (called from
xmlNodeSetContent()).  When I submit a content string like the following:

�0¸gØ+Û /
fiæ‘qF¯Í
ìlwˆ∫ûÉ“5
ˆ—.=Åï’E2Æx
Ó\KTn—u©˜T

What I get in the resulting xml is the following:

0¸gØ+Û /
fiæ‘qF¯Í
ìlwˆ∫ûÉ“5
ˆ—.=Åï’E2Æx
Ó\KTn—u©˜T

This appears to me to be a bug -- or am I missing something?

Thanks for any light you can shed on this.

  Hum ... that character reference is not in the allowed character range of XML
(see the Char production, production 2, of the spec at http://www.w3.org/TR/REC-xml)
and using xmlNodeSetContent() with such content is an error,
but libxml2 doesn't do the checking at that level.
  Make 100% sure that when you manipulate XML document content,
the strings are valid UTF-8 encoded XML content, otherwise you
will get errors either at serialization time or when reloading
the output.
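
Since libxml2 does not check at that level, the caller can check before
handing a string to xmlNodeSetContent(). What follows is only a minimal
sketch of such a pre-check, not libxml2 code: the names is_xml_char(),
decode_utf8(), parse_char_ref() and xml_content_is_valid() are hypothetical
helpers invented for this illustration. It assumes a NUL-terminated string,
tests every decoded character and every numeric character reference against
the Char production of XML 1.0, and treats a malformed numeric reference as
invalid, which is a design choice rather than anything the library mandates.

```c
#include <assert.h>
#include <stddef.h>

/* XML 1.0 Char production:
 * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] */
static int is_xml_char(unsigned long cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

/* Decode one UTF-8 sequence at s into *cp.
 * Returns the number of bytes consumed, or 0 if the sequence is malformed. */
static size_t decode_utf8(const unsigned char *s, unsigned long *cp)
{
    size_t len, i;
    unsigned long min;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; *cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; *cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; *cp = s[0] & 0x07; min = 0x10000; }
    else return 0;

    for (i = 1; i < len; i++) {
        /* A terminating NUL fails this test too, so we never read past it. */
        if ((s[i] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return (*cp >= min) ? len : 0;   /* reject overlong encodings */
}

/* p points just past "&#". On success stores the referenced code point in
 * *cp and returns a pointer just past ';'. Returns NULL if malformed. */
static const unsigned char *parse_char_ref(const unsigned char *p,
                                           unsigned long *cp)
{
    unsigned long v = 0;
    unsigned long base = 10;
    int digits = 0;

    if (*p == 'x') { base = 16; p++; }
    for (; *p != ';'; p++, digits++) {
        unsigned long d;
        if (*p >= '0' && *p <= '9') d = (unsigned long)(*p - '0');
        else if (base == 16 && *p >= 'a' && *p <= 'f') d = (unsigned long)(*p - 'a') + 10;
        else if (base == 16 && *p >= 'A' && *p <= 'F') d = (unsigned long)(*p - 'A') + 10;
        else return NULL;                 /* includes the terminating NUL */
        v = v * base + d;
        if (v > 0x10FFFF) return NULL;    /* out of range, also avoids overflow */
    }
    if (digits == 0) return NULL;
    *cp = v;
    return p + 1;
}

/* Returns 1 if text is well-formed UTF-8 whose characters and numeric
 * character references all match the XML 1.0 Char production, else 0. */
int xml_content_is_valid(const char *text)
{
    const unsigned char *p = (const unsigned char *)text;

    assert(text != NULL);
    while (*p != '\0') {
        unsigned long cp;

        if (p[0] == '&' && p[1] == '#') {
            p = parse_char_ref(p + 2, &cp);
            if (p == NULL || !is_xml_char(cp)) return 0;
            continue;
        }
        size_t n = decode_utf8(p, &cp);
        if (n == 0 || !is_xml_char(cp)) return 0;
        p += n;
    }
    return 1;
}
```

A caller would run xml_content_is_valid() on the string first and only call
xmlNodeSetContent() when it returns 1, rejecting or cleaning the input
otherwise.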

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


