Re: [xml] Push-parsing Unicode with LibXML2



On Tue, Feb 14, 2006 at 02:56:40AM -0800, Eric Seidel wrote:
On Feb 14, 2006, at 2:32 AM, Daniel Veillard wrote:
 To me the most logical approach would be to do surgery on your input
stream: you are modifying it by changing its encoding, so you should also
change or remove the encoding declaration of the xmlDecl, if present.
 However, to follow appendix F.2 of the XML specification, the
user-provided encoding should override the detected one, so that could be
considered a libxml2 bug. I'm just really worried about breaking existing
code by changing this.
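
As a rough illustration of the "surgery" idea above, here is a minimal
sketch that removes the encoding declaration from a leading XML
declaration in place. The function name strip_encoding_decl is
hypothetical and not part of libxml2. It assumes a NUL-terminated,
ASCII-compatible buffer (so it would have to run before the text is
converted to UTF-16, or be ported to 16-bit code units), and it does not
try to handle every malformed declaration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libxml2 API): removes the encoding
 * pseudo-attribute from a leading XML declaration, in place.
 * Returns 1 if an encoding declaration was removed, 0 otherwise. */
int strip_encoding_decl(char *xml)
{
    /* Only act when the buffer starts with an XML declaration. */
    if (strncmp(xml, "<?xml", 5) != 0 || !isspace((unsigned char)xml[5]))
        return 0;
    char *end = strstr(xml, "?>");
    if (end == NULL)
        return 0;

    /* Look for the word "encoding", preceded by whitespace, and only
     * inside the declaration itself. */
    char *p = xml + 5;
    char *name = NULL;
    while ((p = strstr(p, "encoding")) != NULL && p < end) {
        if (isspace((unsigned char)p[-1])) {
            name = p;
            break;
        }
        p++;
    }
    if (name == NULL)
        return 0;

    /* Expect: optional spaces, '=', optional spaces, quoted value. */
    char *q = name + 8;
    while (q < end && isspace((unsigned char)*q))
        q++;
    if (q >= end || *q != '=')
        return 0;
    q++;
    while (q < end && isspace((unsigned char)*q))
        q++;
    if (q >= end || (*q != '"' && *q != '\''))
        return 0;
    char quote = *q++;
    char *close = memchr(q, quote, (size_t)(end - q));
    if (close == NULL)
        return 0;

    /* Also drop the whitespace that preceded "encoding", but keep the
     * single space that must follow "<?xml". */
    char *from = name;
    while (from > xml + 6 && isspace((unsigned char)from[-1]))
        from--;
    memmove(from, close + 1, strlen(close + 1) + 1);
    return 1;
}
```

Called on a writable buffer holding a declaration with version "1.0" and
encoding "ISO-8859-1", this would leave only the version (and any
standalone declaration) behind; buffers that do not start with an XML
declaration are left untouched.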

I've found a (hackish) solution to the problem. By calling
xmlSwitchEncoding before every chunk (and passing the proper UTF-16
variant), I'm able to make my existing code work:

  Hmm, it could make sense before the first and possibly the second
chunk; any further call should not modify anything and will just
add some processing overhead.
  From a library perspective, though, it's not really a satisfactory
solution. I will try to find time to explore the externally provided
encoding issue.
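
For readers wondering what "the proper UTF-16 variant" might mean in
practice, here is a small sketch, assuming the chunks are handed over as
native-endian 16-bit code units (an assumption, the thread does not say
so explicitly). The helper name host_utf16_name is hypothetical. In
libxml2 the corresponding values accepted by xmlSwitchEncoding are
XML_CHAR_ENCODING_UTF16LE and XML_CHAR_ENCODING_UTF16BE.

```c
#include <stdint.h>

/* Hypothetical helper: reports which UTF-16 variant matches the byte
 * order of this machine, by checking which byte of U+FEFF is stored
 * first in memory. */
const char *host_utf16_name(void)
{
    const uint16_t bom = 0xFEFF;
    const unsigned char first = *(const unsigned char *)&bom;

    /* Little-endian hosts store the low byte (0xFF) first. */
    return first == 0xFF ? "UTF-16LE" : "UTF-16BE";
}
```

The result would then be mapped to the matching libxml2 constant once,
before the first chunk is fed to the parser.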

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


