Re: [xml] Bug in HTML parser output?
- From: Daniel Veillard <veillard redhat com>
- To: Matt Sergeant <matt sergeant org>
- Cc: xml gnome org
- Subject: Re: [xml] Bug in HTML parser output?
- Date: Fri, 26 Apr 2002 06:27:55 -0400
On Fri, Apr 26, 2002 at 11:16:51AM +0100, Matt Sergeant wrote:
> In using libxml2's HTML parser to create valid XML, I noticed a "bug"...
>
>   xmllint --html --format http://www.messagelabs.com/VirusEye/ | xmllint -
>
> Croaks on the bad ---> comment in the HTML.
> Is there any way to make this just "work"?
Hmm, right, this looks like a loophole. The HTML parser is deliberately
lenient so that it can parse what's found on the net, but it doesn't take
corrective measures to clean up things like HTML comments.
> (yeah I know I should get them to fix their nasty HTML too)
I wonder what's the best approach:
- fix the HTML importer
- fix the XML serializer
The second option sounds more generic, so I would be tempted to go that
way. How urgent is this?
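
To make the serializer-side option concrete, here is a rough sketch in
Python of the kind of cleanup it implies. This is not libxml2 code and not
a committed design; the function names sanitize_comment_text and
serialize_comment are made up for illustration. It relies only on the XML
rule that comment content must not contain "--" and must not end in "-"
(which is what produces the "--->" seen above).

```python
def sanitize_comment_text(text: str) -> str:
    """Return comment content that is legal inside an XML comment.

    Hypothetical helper for illustration: XML forbids "--" inside a
    comment and forbids content ending in "-".
    """
    # Split every run of hyphens by inserting a space between neighbours.
    # Looping handles runs of three or more hyphens.
    while "--" in text:
        text = text.replace("--", "- -")
    # A trailing "-" would fuse with the closing "-->" and form "--->".
    if text.endswith("-"):
        text += " "
    return text


def serialize_comment(text: str) -> str:
    """Serialize comment content as a well-formed XML comment."""
    return "<!--" + sanitize_comment_text(text) + "-->"
```

With this, a comment whose content ends in a hyphen is written with a
space before the closing delimiter, so a strict XML parser accepts it. The
cost is that the comment text is altered slightly, which seems acceptable
for comments.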
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/