Re: [xml] Problem with character references in range � through  inclusive
- From: "David C. Hoos, Sr." <david c hoos sr ada95 com>
- To: <xml gnome org>
- Subject: Re: [xml] Problem with character references in range � through  inclusive
- Date: Thu, 25 Mar 2004 22:52:33 -0600
Thank you for the quick reply.
I have two questions about your reply, viz.:
1. The URL you supplied (http://www.w3.org/REC-xml)
returns a 404 error.
2. My basic problem is that I need to be able to
encode any arbitrary octet sequence in the
XML element. This works for all of the characters except those
in the range � through  inclusive.
I experimented with writing the character references in hex,
both with and without leading zeroes, and in decimal as
well. In every case, every octet in the range
� through  inclusive is simply dropped.
So, the long and short of it is -- what is the correct
way to encode the octets in that range?
Again, thank you for any light you can shed on this.
David Hoos
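
A possible way around the question above, offered as a sketch rather than as the list's answer: XML 1.0 does not allow most control characters even when written as character references, so arbitrary octets are commonly carried as base64 text instead. The helper name b64_encode below is hypothetical and is not part of libxml2; it uses the standard base64 alphabet with '=' padding.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not a libxml2 function): base64-encode `len` octets
 * from `src`. Returns a NUL-terminated, malloc'd string that the caller
 * must free, or NULL if the allocation fails. */
char *b64_encode(const unsigned char *src, size_t len)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t out_len = 4 * ((len + 2) / 3);   /* 4 output chars per 3 octets */
    char *out = malloc(out_len + 1);
    size_t i = 0, o = 0;

    if (out == NULL)
        return NULL;

    while (i < len) {
        size_t remaining = len - i;

        /* Pack up to three octets into a 24-bit group. */
        unsigned long v = (unsigned long)src[i] << 16;
        if (remaining > 1) v |= (unsigned long)src[i + 1] << 8;
        if (remaining > 2) v |= (unsigned long)src[i + 2];

        /* Emit four 6-bit symbols, padding a short final group with '='. */
        out[o++] = tbl[(v >> 18) & 0x3F];
        out[o++] = tbl[(v >> 12) & 0x3F];
        out[o++] = remaining > 1 ? tbl[(v >> 6) & 0x3F] : '=';
        out[o++] = remaining > 2 ? tbl[v & 0x3F] : '=';

        i += remaining > 2 ? 3 : remaining;
    }
    out[o] = '\0';
    return out;
}
```

With this sketch the three octets 0x00 0x01 0x1F encode to the text AAEf, which contains only characters that are legal in XML content and could then be handed to xmlNodeSetContent(). The reader of the document has to apply the matching base64 decoding.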
----- Original Message -----
From: "Daniel Veillard" <veillard redhat com>
To: "David C. Hoos" <david c hoos sr ada95 com>
Cc: <xml gnome org>
Sent: March 25, 2004 5:51 PM
Subject: Re: [xml] Problem with character references in range � through  inclusive
> On Thu, Mar 25, 2004 at 05:39:33PM -0600, David C. Hoos wrote:
> > I am having difficulty with the function xmlStringGetNodeList() (called from
> > xmlNodeSetContent() ). When I submit a content string like the following:
> >
> > �0¸gØ+Û /
> > fiæ‘qF¯Í
> > ìlwˆ∫ûÉ“5
> > ˆ—.=Åï’E2Æx
> > Ó\KTn—u©˜T
> >
> > What I get in the resulting xml is the following:
> >
> > 0¸gØ+Û /
> > fiæ‘qF¯Í
> > ìlwˆ∫ûÉ“5
> > ˆ—.=Åï’E2Æx
> > Ó\KTn—u©˜T
> >
> > This appears to me to be a bug -- or am I missing something?
> >
> > Thanks for any light you can shed on this.
>
> Hum ...  is not in the allowed character range of XML
> (see production 4 of the spec at http://www.w3.org/REC-xml IIRC)
> and using xmlNodeSetContent() with such a content is an error,
> but libxml2 doesn't do the checking at that level.
> Make 100% sure that when you manipulate XML document content,
> the strings are valid UTF-8 encoded XML content, otherwise you
> will get errors either at serialization time or when reloading
> the output.
>
> Daniel
>
> --
> Daniel Veillard | Red Hat Network https://rhn.redhat.com/
> veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
> http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
>
>
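
Since the reply notes that libxml2 does not check content at the xmlNodeSetContent() level, a caller may want to check it first. The following is a minimal sketch of the allowed character range, assuming the Char production of the XML 1.0 Recommendation (production [2] there). The names is_xml10_char and first_illegal_control are hypothetical and are not libxml2 functions.

```c
#include <stddef.h>

/* Returns 1 if code point `cp` matches the XML 1.0 Char production:
 *   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
 * and 0 otherwise. */
int is_xml10_char(unsigned long cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}

/* Scans a UTF-8 buffer for control octets below 0x20 that XML 1.0 forbids.
 * Octets below 0x80 only ever appear as single-byte characters in UTF-8,
 * so this scan does not need to decode multi-byte sequences.
 * Returns the index of the first offending octet, or `len` if none is found.
 * It does not validate the UTF-8 encoding itself. */
size_t first_illegal_control(const unsigned char *s, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (s[i] < 0x20 && !is_xml10_char(s[i]))
            return i;
    }
    return len;
}
```

A caller could run first_illegal_control() over a string before passing it to xmlNodeSetContent() and reject or re-encode the data if the result is smaller than the string length.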