Re: [xml] parsing html with libxml2

So, is there a way with the libxml2 html parser to
ignore these errors?

 --- Daniel Veillard <veillard redhat com> wrote: 
On Fri, Nov 12, 2004 at 01:52:00PM -0500, JASON JESSO wrote:
I have saved an html page to a file from my
When I parse the html with libxml2 (python) I get a
lot of html parser errors.

I looked at the html document and there are a lot
of mismatched tags and the like.

I tried several web browsers and they all have no
problem with it.

Why is that?

  Because "real html", i.e. what you find on the web, is usually full
of errors; browsers just tend to handle those errors in the same way,
which makes the thing kind of work...  This real mess is the reason
why, when designing XML, they mandated that parsers must fail as soon
as they hit an error and stop returning data from that point.


Daniel Veillard      | Red Hat Desktop team
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
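
The contrast described above (XML parsers must stop at the first
well-formedness error, while HTML consumers carry on) can be sketched
roughly as follows. This is only an illustration using Python's standard
library modules xml.etree.ElementTree and html.parser, not libxml2
itself; the sample markup and the names BROKEN, TagCollector,
parse_as_xml and parse_as_html are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample with mismatched tags, similar in spirit to "real html".
BROKEN = "<html><body><p>Hello <b>world</p></b></body></html>"


class TagCollector(HTMLParser):
    """Records start tags; html.parser does not raise on mismatched tags."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_as_xml(text):
    """Return True if text is well-formed XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # A strict XML parser gives up at the first well-formedness error.
        return False


def parse_as_html(text):
    """Feed text to the lenient HTML parser and return the start tags seen."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("accepted as XML:", parse_as_xml(BROKEN))
    print("start tags seen as HTML:", parse_as_html(BROKEN))
```

With the sample above, the XML parser should reject the input while the
HTML parser still reports every start tag it encountered. How a given
HTML parser repairs the tree after such errors is parser-specific and is
not shown here.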
