On Thu, Feb 04, 2010 at 09:24:27AM -0800, Aaron Patterson wrote:
On Thu, Feb 4, 2010 at 12:53 AM, Daniel Veillard <veillard redhat com> wrote:
On Wed, Feb 03, 2010 at 08:34:09PM -0800, Aaron Patterson wrote:
I can't seem to pass an encoding to xmlParseInNodeContext.  This is
problematic when dealing with UTF-8 HTML documents.  I can tell
libxml2 what encoding to use when originally parsing the document, but
it looks like that is completely ignored when using
xmlParseInNodeContext.  Reference nodes in HTML documents completely
ignore the original document encoding and use ISO-8859-1.

Here is a sample program to illustrate the problem:

I tried putting together a patch, and it didn't seem to work:

Ideally, I would like a function similar to xmlParseInNodeContext, but
one that takes an encoding as a parameter.  Thanks!
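
The symptom described above (UTF-8 input being treated as ISO-8859-1) can be sketched in plain C without libxml2. This is only an illustration of the effect, not the library's code path: `latin1_to_utf8` is a hypothetical helper that re-encodes a buffer to UTF-8 on the assumption that every input byte is one ISO-8859-1 character. Feeding it bytes that were already UTF-8 produces the doubly encoded text one would expect from such a mismatch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2: re-encode `in` to UTF-8,
 * assuming every input byte is a single ISO-8859-1 character.
 * `out` must have room for 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t latin1_to_utf8(const char *in, char *out)
{
    assert(in != NULL && out != NULL);

    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)in; *p != '\0'; p++) {
        if (*p < 0x80) {
            out[n++] = (char)*p;                   /* ASCII passes through */
        } else {
            out[n++] = (char)(0xC0 | (*p >> 6));   /* lead byte: 0xC2 or 0xC3 */
            out[n++] = (char)(0x80 | (*p & 0x3F)); /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}
```

For example, the UTF-8 encoding of "é" is the two bytes C3 A9. Passed through this helper, each byte is treated as a separate Latin-1 character and the result is the four bytes C3 83 C2 A9, which display as "Ã©" instead of "é".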

 Rather than add Yet Another Entry Point, I think the most logical
approach is to parse using the encoding from the document, since it's an
"in context" parsing, i.e. parsing as if the fragment were coming from
that document. The encoding switch is a bit harder than what you hoped
for, but it's not that hard. The enclosed patch seems to do it for me;
please give it a try.

Perfect.  It works great for me!  Thank you very much!

  Okay, pushed to head.

Any suggestions for workarounds for older versions of libxml2?  I'm
tempted to copy this function to my C code, but I'd rather not if

  If the patch applies that should be fine,


