Re: [xml] Problem with CDATA entities
- From: Daniel Veillard <veillard redhat com>
- To: Nic Ferrier <nferrier tapsellferrier co uk>
- Cc: xml gnome org
- Subject: Re: [xml] Problem with CDATA entities
- Date: Wed, 18 May 2005 18:22:11 -0400
On Wed, May 18, 2005 at 10:16:44PM +0100, Nic Ferrier wrote:
I'm having a problem with CDATA entities. You can see the same problem
by doing this:
xmllint 'http://www.oreillynet.com/meerkat/?_fl=rss10&t=ALL&c=5136'
In other words download the O'Reilly ONJAVA RSS feed. This feed uses
an HTML DTD include like this:
<!DOCTYPE rdf:RDF [
<!ENTITY % HTMLlat1 PUBLIC
"-//W3C//ENTITIES Latin1//EN//HTML"
"http://www.w3.org/TR/PR-html40/HTMLlat1.ent">
%HTMLlat1;
]>
Seems they seriously lack a QA department there.
The w3 dtd has this in it:
<!ENTITY nbsp CDATA " " -- no-break space = non-breaking space,
U+00A0 ISOnum -->
[...]
And the error from xmllint one gets is related directly to this:
http://www.w3.org/TR/html40/HTMLlat1.ent:12: parser error : Entity value required
Your XML file references an SGML DTD fragment, which uses a different syntax.
As a result your document is not an XML file: it is not well-formed, but only
a validating XML parser fetching the external subset can detect it.
As far as I can tell xmllint is right and the error message is quite accurate.
Interestingly, this:
http://www.flightlab.com/~joe/sgml/cdata.html
suggests that there is common confusion about CDATA entities.
In SGML! You are using an XML parser. Show me how you generate
<!ENTITY nbsp CDATA " "
from the production [70] of
http://www.w3.org/TR/REC-xml/#NT-EntityDecl
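
To make the point concrete, here is a minimal sketch using Python's standard
xml.etree.ElementTree module (which wraps expat, not libxml2). The two sample
documents and the helper name is_well_formed are made up for illustration, and
the declarations are placed in the internal subset so that no network fetch is
needed. The same rule applies to a declaration pulled in through an external
parameter entity, once the parser actually reads it.

```python
import xml.etree.ElementTree as ET

# XML 1.0 grammar for a general entity declaration (productions [71] and [73]):
#   GEDecl    ::= '<!ENTITY' S Name S EntityDef S? '>'
#   EntityDef ::= EntityValue | (ExternalID NDataDecl?)
# After the Name only a quoted literal or SYSTEM/PUBLIC may follow, so the
# SGML keyword CDATA cannot be derived from the grammar.

# Hypothetical test document using the SGML-style declaration.
SGML_STYLE = """<!DOCTYPE doc [
<!ENTITY nbsp CDATA "&#160;">
]>
<doc>a&nbsp;b</doc>"""

# The same document with the declaration written in XML syntax.
XML_STYLE = """<!DOCTYPE doc [
<!ENTITY nbsp "&#160;">
]>
<doc>a&nbsp;b</doc>"""


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for label, document in (("SGML-style", SGML_STYLE), ("XML-style", XML_STYLE)):
        verdict = "accepted" if is_well_formed(document) else "rejected"
        print(label, "declaration:", verdict)
```

The SGML-style declaration should be rejected as soon as the parser reads it,
while the XML-style one parses and &nbsp; expands to U+00A0.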
Seems people are so used to digesting any crap in RSS that they didn't even
manage to find this monstrosity, which any validating XML parser should show.
Blame them, not libxml2, thanks.
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/