Re: [xml] Parsing invalid characters or entities ref



On Mon, Nov 03, 2003 at 07:13:22PM +0100, GARNIER Pierre wrote:
Hi all,

When I parse a UTF-8 XML document with the xmlTextReader API, parsing stops
as soon as the parser encounters the character whose code is 0x03.
Is there a way to make the parser ignore this character and continue
parsing?

  Hum, no. Your document is not XML, and the spec instructs the parser to stop
immediately; I totally support that behaviour. We don't want XML parsing
to become as unreliable as HTML parsing. On a fatal error the input must be
dropped, and the source must be fixed. There is an xmlRecoverFile()
interface which lets you fix a broken file by generating a tree which
you can then save, but it should really be limited to repairing broken files;
the notion of ignoring an error on the fly is really not suitable in
XML processing.
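
  To fix the source you first have to find the offending bytes. Below is a
minimal sketch of a pre-parse check, assuming the input is UTF-8 (so any byte
below 0x80 is a complete ASCII character). The name find_illegal_control is a
hypothetical helper, not part of libxml2, and it only covers the control
characters below 0x20 that XML 1.0 forbids (everything except tab, LF and CR);
it does not check the other excluded code points such as U+FFFE or U+FFFF.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not a libxml2 function.
 * Returns the offset of the first byte below 0x20 that is not allowed
 * in an XML 1.0 document (anything other than tab, LF, CR), or -1 if
 * the buffer contains no such byte.
 */
long find_illegal_control(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D)
            return (long)i;
    }
    return -1;
}
```

  A non-negative result tells you where the producer of the document emitted
something that is not XML, which is the place to fix; the check only reports
the problem and leaves the input untouched.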

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


