Re: [xml] encoding support with xmlParseInNodeContext



On Wed, Feb 03, 2010 at 08:34:09PM -0800, Aaron Patterson wrote:
I can't seem to pass an encoding to xmlParseInNodeContext.  This is
problematic when dealing with UTF-8 HTML documents.  I can tell
libxml2 what encoding to use when originally parsing the document, but
it looks like that is completely ignored when using
xmlParseInNodeContext.  When the reference node belongs to an HTML document,
the fragment is parsed as ISO-8859-1 regardless of the document's encoding.

Here is a sample program to illustrate the problem:

http://pastie.org/808860

I tried putting together a patch, but it didn't seem to work:

http://pastie.org/808862

Ideally, I would like a function similar to xmlParseInNodeContext, but
one that takes an encoding as a parameter.  Thanks!

  Rather than add Yet Another Entry Point, I think the most logical
approach is to parse using the encoding of the document, since this is
"in context" parsing, i.e. parsing as if the fragment came from that
document. The encoding switch is a bit harder than what you hoped for,
but it's not that hard. The enclosed patch seems to do it for me; please
give it a try.
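
For anyone following the thread, the symptom in the report can be shown without libxml2 at all. The sketch below is neither libxml2 code nor the attached patch; `latin1_to_utf8` is a hypothetical helper written only for illustration. It mimics what happens when a parser that stores text as UTF-8 internally (as libxml2 does) is handed UTF-8 input but believes it is ISO-8859-1: each byte is treated as one character and re-encoded.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libxml2: re-encode a buffer as UTF-8,
 * treating every input byte as one ISO-8859-1 character.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the output buffer is too small. */
static int latin1_to_utf8(const unsigned char *in, size_t len,
                          unsigned char *out, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            /* ASCII maps to itself; keep one byte free for the NUL. */
            if (n + 1 >= cap)
                return -1;
            out[n++] = in[i];
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            if (n + 2 >= cap)
                return -1;
            out[n++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[n++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out[n] = '\0';
    return (int)n;
}
```

Feeding it the UTF-8 bytes for "café" (63 61 66 C3 A9) yields 63 61 66 C3 83 C2 A9, i.e. "cafÃ©": the two bytes of "é" are each re-encoded as a separate character. Parsing the fragment with the document's own encoding avoids this double encoding.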

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/

Attachment: in_context_encoding.patch
Description: Text document


