[xml] Problem with CDATA entities
- From: Nic Ferrier <nferrier tapsellferrier co uk>
- To: xml gnome org
- Subject: [xml] Problem with CDATA entities
- Date: Wed, 18 May 2005 22:16:44 +0100
The problem described here concerns the Debian libxml2 package. Here is
the version report from xmllint:
xmllint: using libxml version 20616
compiled with: DTDValid FTP HTTP HTML C14N Catalog XPath XPointer XInclude Iconv Unicode Regexps Automata Schemas
I'm having a problem with CDATA entities. You can see the same problem
by doing this:
xmllint 'http://www.oreillynet.com/meerkat/?_fl=rss10&t=ALL&c=5136'
In other words, download the O'Reilly ONJAVA RSS feed. This feed
includes the HTML Latin-1 entity set through its DTD like this:
<!DOCTYPE rdf:RDF [
<!ENTITY % HTMLlat1 PUBLIC
"-//W3C//ENTITIES Latin1//EN//HTML"
"http://www.w3.org/TR/PR-html40/HTMLlat1.ent">
%HTMLlat1;
]>
The w3 dtd has this in it:
<!ENTITY nbsp CDATA "&#160;" -- no-break space = non-breaking space,
U+00A0 ISOnum -->
<!ENTITY iexcl CDATA "&#161;" -- inverted exclamation mark, U+00A1 ISOnum -->
<!ENTITY cent CDATA "&#162;" -- cent sign, U+00A2 ISOnum -->
<!ENTITY pound CDATA "&#163;" -- pound sign, U+00A3 ISOnum -->
<!ENTITY curren CDATA "&#164;" -- currency sign, U+00A4 ISOnum -->
<!ENTITY yen CDATA "&#165;" -- yen sign = yuan sign, U+00A5 ISOnum -->
<!ENTITY brvbar CDATA "&#166;" -- broken bar = broken vertical bar,
U+00A6 ISOnum -->
And the errors that xmllint reports relate directly to this:
http://www.w3.org/TR/html40/HTMLlat1.ent:12: parser error : Entity value required
<!ENTITY nbsp CDATA "&#160;" -- no-break space = non-breaking space,
^
http://www.w3.org/TR/html40/HTMLlat1.ent:12: parser error : Space required before 'NDATA'
<!ENTITY nbsp CDATA "&#160;" -- no-break space = non-breaking space,
^
http://www.w3.org/TR/html40/HTMLlat1.ent:12: parser error : xmlParseEntityDecl: entity nbsp not terminated
<!ENTITY nbsp CDATA "&#160;" -- no-break space = non-breaking space,
^
This is clearly wrong: the CDATA keyword declares that the entity's
value is not to be parsed further. Expanding nbsp as declared above,
for example, will result in the literal characters:
=> '&#160;'
whereas:
<!ENTITY nbsp "&#160;">
will expand to the no-break space character itself (U+00A0):
=> ' '
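As a cross-check with a different parser, here is a minimal sketch in
Python. It uses the standard library's xml.etree.ElementTree (which sits
on expat, not libxml2), so it says nothing definitive about libxml2
itself. The two inline documents and the names XML_STYLE, SGML_STYLE and
expand are made up for illustration and are not taken from the feed. It
suggests that the plain XML declaration expands to U+00A0, and that the
CDATA-keyword form is rejected by that parser too:

```python
import xml.etree.ElementTree as ET

# Hypothetical test documents, each with the entity declared in the
# internal DTD subset so that no network access is needed.
XML_STYLE = '<!DOCTYPE doc [<!ENTITY nbsp "&#160;">]><doc>&nbsp;</doc>'
SGML_STYLE = '<!DOCTYPE doc [<!ENTITY nbsp CDATA "&#160;">]><doc>&nbsp;</doc>'


def expand(document):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


if __name__ == "__main__":
    # repr() makes the otherwise invisible no-break space visible.
    print("XML-style declaration: ", repr(expand(XML_STYLE)))
    print("SGML-style declaration:", repr(expand(SGML_STYLE)))
```

If that holds, the failure is not specific to the Debian build: the
CDATA keyword in an entity declaration looks like SGML syntax that an
XML parser does not accept.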
Interestingly, this:
http://www.flightlab.com/~joe/sgml/cdata.html
suggests that there is common confusion about CDATA entities.
Nic Ferrier