"Re: [xml] xmlParseChunk with UTF-16LE fails on special occasion"


On 11/28/2003 5:21 PM Kasimier Buchcik wrote:

Daniel Veillard wrote:

Can you restate cleanly what the problem is.

Yes: I have an application with an XML document held in a DOMString
encoded in UTF-16LE, and
I need it to be parsed with the push parser.

You have a UTF-16 entity. It's labelled with an encoding= which is
not compatible with UTF-16. That encoding could be ignored
if you passed the real encoding used, "UTF-16", at parser creation.

I seem to be confused here. How can I pass the real encoding (UTF-16LE) 
at parser creation time?

Yes, yes, yes :-) You mean I can give the context I get from
"xmlCreatePushParserCtxt" a UTF-16LE
encoding handler and I would get my lovely DOM, regardless of the
declared encoding? If that works, it would be the solution for my little
needs ;-)

I tried this but it seems not to be the correct way, since the parser 
context does not take an encoding handler (or did I miss something here?).

That is how HTTP encoding information in headers is used to override
the encoding= declaration. That should work. Still,
the internal format will be UTF-8 once parsed.

Does someone have an example at hand for doing this?

Finally: I don't necessarily have to use the push parser. I only
used it to easily fake the first 4 bytes of a UTF-16 encoded entity,
which libxml2 autodetects. If I could set the "real" encoding, I
would prefer to use "xmlCreateMemoryParserCtxt" and "xmlParseDocument".


