Re: [xml] premature exit when parsing

On Mon, Jun 25, 2001 at 05:23:07PM +0100, Prashanth Naidu wrote:

The last email I sent was blocked because I had pasted the HTML into the
email and the mail server didn't like the embedded JavaScript. I've
attached it this time.

If testHTML from v2.3.11 is run on the HTML below, it stops parsing at
about line 223 (well before the end of the document). However, testHTML
from v1.8.7 has no such problem. The HTML below is from the Reuters
business news page; the same problem occurs with their world news page.

  This is broken beyond what I feel should be handled by the libxml HTML
parser. They have script and style elements first, then put the DOCTYPE
there, open the html and head tags, then script again and two more scripts,
close the head, and at that point libxml surrenders.
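
  To make the ordering described above concrete, here is a minimal sketch.
It does not use libxml; it uses Python's standard html.parser only to list
tags in document order. The markup in BROKEN_HTML is a hypothetical
reconstruction of the structure described in this message, not the actual
Reuters page, and the names OrderRecorder, record_order and
doctype_is_first are made up for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical minimal document (an assumption, NOT the real Reuters page):
# script and style first, then the DOCTYPE, then html/head, then three
# more scripts, then the head is closed.
BROKEN_HTML = """<script>var a = 1;</script>
<style>body { margin: 0 }</style>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<script>var b = 2;</script>
<script>var c = 3;</script>
<script>var d = 4;</script>
</head>
<body><p>headline</p></body>
</html>
"""


class OrderRecorder(HTMLParser):
    """Record the DOCTYPE and start/end tags in the order they appear."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_decl(self, decl):
        # Called for <!DOCTYPE ...> declarations.
        self.events.append("DOCTYPE")

    def handle_starttag(self, tag, attrs):
        self.events.append("<" + tag + ">")

    def handle_endtag(self, tag):
        self.events.append("</" + tag + ">")


def record_order(markup):
    """Return the list of markup events in document order."""
    recorder = OrderRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.events


def doctype_is_first(events):
    """True if the DOCTYPE precedes every tag, as a well-formed page would have it."""
    return bool(events) and events[0] == "DOCTYPE"


if __name__ == "__main__":
    events = record_order(BROKEN_HTML)
    print(events)
    print("DOCTYPE first:", doctype_is_first(events))
```

  Running the script prints the event list, which shows the script and
style elements arriving before the DOCTYPE. That says nothing about where
libxml itself stops; it only illustrates the element order being
complained about.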

  Point them to an HTML validator. Sorry, I won't fix this! Maybe some
kind soul can spend time trying to cope with this, and I may even take
the patch, but I won't try to fix it myself; this is ugly.


Daniel Veillard      | Red Hat Network
veillard redhat com  | libxml Gnome XML XSLT toolkit | Rpmfind RPM search engine
Sep 17-18 2001 Brussels Red Hat TechWorld
