Re: [xml] utf-8 encoding and xmlSAXParseMemory
- From: Daniel Veillard <veillard redhat com>
- To: Olivier Sirven <olivier elma fr>
- Cc: xml gnome org
- Subject: Re: [xml] utf-8 encoding and xmlSAXParseMemory
- Date: Tue, 2 May 2006 08:46:04 -0400
On Tue, May 02, 2006 at 02:25:00PM +0200, Olivier Sirven wrote:
On Tuesday 2 May 2006 14:02, Daniel Veillard wrote:
Then you can't use an XML parser, by definition. You must reject those
feeds or stop pretending to be XML compliant.
Ok... just to make sure I understand you: software like Mozilla can
render an XML tree even if there's an encoding problem. So you are telling me
that is because it does not use an XML parser? And if I want to be able to do
the same, I should not rely on libxml?
1/ you dropped the encoding declarations from the header and forgot to
use one of the parser interfaces where it can be provided.
Read appendix F of the XML spec. Reread it. Reread it again.
Reread it tomorrow, then you will be able to start writing
code that handles encoding.
2/ it's possible the data ain't XML in the first place and they use a
3/ it's possible they fixed encoding ahead of the parser.
They use expat for XML parsing, and it will break like libxml2 on encoding
errors with a non-recoverable fatal error. Either they work around expat,
or they don't use it for the data, or you misunderstood something in the
processing. I'm not guessing, but some options are more likely than others.
But stating "I have a parser error, how can I work around it" will bring
back the same canned answer from me!
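
To illustrate point 1/ and the expat behaviour described above, here is a
minimal sketch using Python's standard expat binding rather than libxml2
itself. The sample feed bytes and the helper name parse_text are made up for
illustration. The first call shows the fatal error on undeclared ISO-8859-1
bytes; the second supplies the encoding through the parser interface, which
is roughly what the encoding argument of libxml2's xmlReadMemory() is for.

```python
from xml.parsers import expat


def parse_text(data, encoding=None):
    """Parse `data` with expat and return its character data as one string.

    `encoding`, when given, overrides whatever the document declares
    (or fails to declare). Raises expat.ExpatError on a fatal error.
    """
    chunks = []
    parser = expat.ParserCreate(encoding)
    # expat may deliver text in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


# Hypothetical feed: no encoding declaration, but the bytes are ISO-8859-1.
feed = b"<feed><title>caf\xe9</title></feed>"

# Without a declaration the parser must assume UTF-8, and 0xE9 followed
# by "<" is not valid UTF-8, so parsing stops with a fatal error.
try:
    parse_text(feed)
    strict_ok = True
except expat.ExpatError as err:
    strict_ok = False
    print("fatal error:", err)

# Providing the encoding out of band lets the same bytes parse.
text = parse_text(feed, encoding="iso-8859-1")
print(text)
```

This only helps when the real encoding is known from somewhere else (an HTTP
header, for instance); it is not a way to recover from arbitrary broken input.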
Daniel Veillard | Red Hat http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/