Re: [xml] Problem with CDATA entities



On Wed, May 18, 2005 at 11:49:53PM +0100, Nic Ferrier wrote:
I apologize, Daniel. You are, of course, quite right.

  Don't worry, I tend to goof magnificently from time to time too :-)

The clue is even there in the DTD file where it says SGML. It is also
referenced in the HTML 4.01 specification as SGML.

The XHTML specification also mentions it and provides the correct URL
for an XML valid DTD:

   http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent

I've asked O'Reilly to switch their feed to this.

  That should work way better :-)

As you say, it probably shows that people are using cruddy tools to
parse RSS which is a shame.

  Yes, that's the core of the problem. This DTD brokenness is just an example,
but the blame is on the generation side: the RSS generators don't guarantee
well-formedness, and people who want to process the feeds are stuck. The right
thing is to get the generators fixed, as you did.
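
  As a rough illustration of the well-formedness point, here is a minimal
sketch in Python. It uses the standard library's xml.etree.ElementTree
rather than libxml2, and the function name and the two sample feeds are
made up for the example. An HTML entity such as &eacute; is not declared
when no DTD is loaded, so a feed that uses it is rejected by the parser,
while the equivalent numeric character reference is accepted.

```python
import xml.etree.ElementTree as ET


def is_well_formed(feed_text):
    """Return (True, None) if feed_text parses as XML, else (False, message).

    Hypothetical helper for illustration; it checks well-formedness only,
    not validity against any DTD.
    """
    try:
        ET.fromstring(feed_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Numeric character reference: always allowed in XML.
good_feed = "<rss version='2.0'><channel><title>caf&#233;</title></channel></rss>"

# HTML named entity with no declaration in scope: not well-formed XML.
bad_feed = "<rss version='2.0'><channel><title>caf&eacute;</title></channel></rss>"

for name, feed in (("good", good_feed), ("bad", bad_feed)):
    ok, message = is_well_formed(feed)
    print(name, "well-formed" if ok else "rejected: " + message)
```

A generator could run a check like this on its own output before
publishing, so that consumers never see a broken feed.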

I've written an RSS aggregator using libxml2 (and libxslt) from a
mixture of shell and python. It works quite well, save for issues like
this.

I'll release it as free software at some point.

  Okay, I think Mark Pilgrim did one too, with Python and libxml2:
http://diveintomark.org/

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


