Re: [xml] HTML parsing with libxml2
- From: Paweł Pałucha <pawel praterm com pl>
- To: Macy Gasp <macygasp gmail com>
- Cc: xml gnome org
- Subject: Re: [xml] HTML parsing with libxml2
- Date: Fri, 05 Aug 2005 15:01:24 +0200
> So, basically, how can I make libxml2 parse the document and ignore the
> character encoding (or fall back to a default encoding and continue on
> error)? Or how can I make it simply ignore any unknown characters?
> I really need to use libxml, and "out-of-range" characters are messing
> up the parsing :(
libxml is an XML parser, so do not expect it to parse IE-ready HTML code ;-)
You can always clean the document on your own before passing it to
libxml2. Or you can use libtidy or a similar tool to clean up your code.
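As a rough illustration of the "clean it on your own" route, here is a minimal sketch that assumes the input is supposed to be UTF-8 and overwrites every byte that is not part of a valid UTF-8 sequence with '?'. The names sanitize_utf8 and utf8_seq_len are made up for this example and are not part of libxml2; if your pages are really in some other encoding, converting them (for example with iconv) is probably a better fit than replacing bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a libxml2 function): return the length (1-4)
 * of the valid UTF-8 sequence starting at p, or 0 if the bytes at p do
 * not form one. `left` is the number of bytes available at p. */
static size_t utf8_seq_len(const unsigned char *p, size_t left)
{
    unsigned char c = p[0];
    size_t need, i;

    if (c < 0x80)
        return 1;                      /* plain ASCII */
    if (c >= 0xC2 && c <= 0xDF)
        need = 2;
    else if (c >= 0xE0 && c <= 0xEF)
        need = 3;
    else if (c >= 0xF0 && c <= 0xF4)
        need = 4;
    else
        return 0;                      /* stray continuation, C0/C1, or > F4 */

    if (left < need)
        return 0;                      /* sequence truncated at end of buffer */
    for (i = 1; i < need; i++)
        if ((p[i] & 0xC0) != 0x80)
            return 0;                  /* missing continuation byte */

    /* Reject overlong forms, UTF-16 surrogates and values above U+10FFFF. */
    if (c == 0xE0 && p[1] < 0xA0) return 0;
    if (c == 0xED && p[1] > 0x9F) return 0;
    if (c == 0xF0 && p[1] < 0x90) return 0;
    if (c == 0xF4 && p[1] > 0x8F) return 0;
    return need;
}

/* Hypothetical helper: replace every byte that is not part of a valid
 * UTF-8 sequence with '?', in place. Returns the number of bytes replaced.
 * The buffer length never changes. */
size_t sanitize_utf8(char *buf, size_t len)
{
    unsigned char *p = (unsigned char *)buf;
    size_t i = 0, replaced = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        size_t n = utf8_seq_len(p + i, len - i);
        if (n == 0) {
            p[i] = '?';                /* drop the offending byte */
            replaced++;
            i++;
        } else {
            i += n;                    /* skip over the valid sequence */
        }
    }
    return replaced;
}
```

After a pass like this, the buffer could be handed to libxml2's HTML parser, for example htmlReadMemory() from <libxml/HTMLparser.h> with "UTF-8" given explicitly as the encoding, so the parser does not have to guess.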
P.P.