Re: [xml] Is it possible to skip illegal UTF-8 characters when parsing?
- From: Daniel Veillard <veillard redhat com>
- To: Steinar Bang <sb dod no>
- Cc: xml gnome org
- Date: Mon, 12 Aug 2002 05:25:42 -0400
On Fri, Aug 09, 2002 at 09:51:31AM +0200, Steinar Bang wrote:
> Platform: Intel PIII, RedHat 7.2, gcc 2.96 (RPM version number 2.96-98),
> libxml2 2.4.2
> Is it possible to make libxml2 skip an illegal UTF-8 character, and
> continue parsing, instead of stopping the parsing at this point?
> Just getting a "." instead of the actual character is OK.
Well, no. The specification is very clear about it: it's a fatal error,
and from that point on the parser should not provide any more data to
the application.
Your data is not XML :-\
> The workaround was to replace every byte in the incoming data that is
> below 0x20 and is not one of 0x9, 0xA, or 0xD with a space before
> passing it on to the libxml2 parser, but the preferred solution would
> be to have libxml2 handle it.
It's probably the best way to do it.
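For reference, the pre-filtering workaround described in the quoted text could look roughly like the sketch below. The function name scrub_control_bytes is made up for illustration and is not part of libxml2. It only handles the control bytes mentioned above; it does not repair malformed multi-byte UTF-8 sequences, which would need a separate validation pass.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a libxml2 function.
 * Replaces every byte below 0x20, other than tab (0x9), line feed (0xA)
 * and carriage return (0xD), with a space. Works in place and returns
 * the number of bytes changed. Bytes >= 0x80 are left alone, so valid
 * multi-byte UTF-8 sequences pass through untouched. */
size_t scrub_control_bytes(unsigned char *buf, size_t len)
{
    size_t changed = 0;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c < 0x20 && c != 0x9 && c != 0xA && c != 0xD) {
            buf[i] = ' ';
            changed++;
        }
    }
    return changed;
}
```

The scrubbed buffer can then be handed to the parser as usual, for example with xmlParseMemory((const char *)buf, (int)len). Replacing rather than deleting the offending bytes keeps the length and byte offsets of the input unchanged.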
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/