RE: [xml] htmlParser:bug -> document.write('</td></tr></table>')



www.nba.com
www.nokia.com

There are some HTML sites (such as the two above) that HTMLparser fails to parse properly because of the occurrence
of "document.write('</td></tr></table>');"
or "document.write('</tr></table>');"
inside script tags.

There is a function htmlParseScript() that is called from htmlParseContent(), but I doubt it
works properly (maybe...).

I would appreciate it if anybody could suggest a solution.
Is it really a bug, or am I making a mistake?


\Manos  


-----Original Message-----
From: xml-bounces gnome org [mailto:xml-bounces gnome org]On Behalf Of
Manos Moschous
Sent: Tuesday, September 28, 2004 2:36 PM
To: Emmanuel Saracco; libXml2
Subject: RE: [xml] htmlParser:bug ->
document.write('</td></tr></table>')




-----Original Message-----
From: xml-bounces gnome org [mailto:xml-bounces gnome org]On Behalf Of
Emmanuel Saracco
Sent: Tuesday, September 28, 2004 12:43 PM
To: libXml2
Subject: Re: [xml] htmlParser:bug ->
document.write('</td></tr></table>')


On Tue 28/09/2004 at 11:35, Manos Moschous wrote:

It gets confused when there is something like the following in the HTML code:


index.html:173: error: Opening and ending tag mismatch: td and div
document.write('</td></tr></table>');

Also, what is happening with the <nobr> tag, which is commonly used?

Did you try with HTML_PARSE_NOWARNING and HTML_PARSE_NOERROR?


I don't care about the errors; I care about correct parsing (and correct construction of the tree).
It doesn't build the right tree and it gets confused (it closes the td, tr and table tags).


I think that the parser must ignore the content of scripts.
Is there any solution to this problem?
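
To make "ignore the content of scripts" concrete, here is a minimal sketch in
plain C. It is not libxml2 code and not how htmlParseScript() is implemented;
script_content_length() is a hypothetical helper written only for this
illustration. It assumes the parser is positioned just after the opening
<script> tag and treats everything up to the first "</script" (compared
case-insensitively) as opaque text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of libxml2): given a pointer to the text
 * right after an opening <script> tag, return the number of characters
 * that belong to the script body. The body ends at the first "</script"
 * (any letter case); if there is none, the whole string is the body. */
size_t script_content_length(const char *s)
{
    static const char end_tag[] = "</script";
    const size_t tag_len = sizeof(end_tag) - 1;
    size_t i, j;

    assert(s != NULL);

    for (i = 0; s[i] != '\0'; i++) {
        if (s[i] != '<')
            continue;
        /* Compare against "</script"; a '\0' in s never matches, so the
         * loop stops before reading past the end of the string. */
        for (j = 0; j < tag_len; j++) {
            if (tolower((unsigned char)s[i + j]) != end_tag[j])
                break;
        }
        if (j == tag_len)
            return i;   /* offset of the closing tag */
    }
    return i;           /* no closing tag: everything is script text */
}
```

With the input "document.write('</td></tr></table>');</script>" the helper
skips over the </td>, </tr> and </table> inside the string literal, because
only "</script" ends the body. Where the parser cannot be changed, a common
workaround on the page side is to write the strings as '<\/td><\/tr><\/table>'
so that no "</" sequence appears inside the script.
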

bye


\Manos
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml




