Re: [xml] Is it possible to skip illegal UTF-8 characters when parsing?
- From: Daniel Veillard <veillard redhat com>
- To: Steinar Bang <sb dod no>
- Cc: xml gnome org
- Subject: Re: [xml] Is it possible to skip illegal UTF-8 characters when parsing?
- Date: Mon, 12 Aug 2002 05:25:42 -0400
On Fri, Aug 09, 2002 at 09:51:31AM +0200, Steinar Bang wrote:
> Platform: Intel PIII, RedHat 7.2, gcc 2.96 (RPM version number 2.96-98),
> libxml2 2.4.2
>
> Is it possible to make libxml2 skip an illegal UTF-8 character, and
> continue parsing, instead of stopping the parsing at this point?
>
> Just getting a "." instead of the actual character is OK.
Well, no. The specification is very clear about it: it's a fatal error,
and from that point on the parser should not provide any more data to the
application.
Your data is not XML :-\
> The workaround was to replace every byte in the incoming data that is
> below 0x20, other than 0x9, 0xA, or 0xD, with a space before passing it
> on to the libxml2 parser, but the preferred solution would be to have
> libxml2 handle it.
It's probably the best way to do it.
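A minimal sketch of that pre-filter, for illustration only. The function
name scrub_xml_control_bytes is hypothetical and not part of libxml2, and
the sketch assumes the input is UTF-8 or another ASCII-compatible encoding,
where bytes below 0x80 never occur inside a multi-byte sequence. It only
handles the control bytes described above; it does not detect or repair
malformed multi-byte UTF-8 sequences.

```c
#include <stddef.h>

/* Hypothetical helper (not a libxml2 function): overwrite, in place, every
 * byte below 0x20 that XML 1.0 does not allow (anything other than tab 0x9,
 * line feed 0xA, and carriage return 0xD) with a space.
 * Returns the number of bytes that were replaced. */
size_t scrub_xml_control_bytes(char *buf, size_t len)
{
    size_t replaced = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];

        if (c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D) {
            buf[i] = ' ';
            replaced++;
        }
    }
    return replaced;
}
```

The scrubbed buffer can then be handed to the parser as usual, for example
through xmlParseMemory(). Replacing rather than deleting the bytes keeps the
buffer length and all offsets unchanged.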
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/