Greetings list!

I'm using libxml2 (specifically the HTMLparser and tree modules, plus the XPath library) to perform transformation operations on HTML input files, and I have run into a character encoding issue.

I have two HTML documents, one in ISO-8859-1 encoding and the other in UTF-8. First I parse both documents into DOM trees. Then I run an XPath query on the ISO-8859-1 document, clone the nodes in the result set using "xmlCopyNodeList", and use "xmlAddNextSibling" to add the cloned ISO-8859-1 content to the document that was originally UTF-8 encoded.

When I then serialize the UTF-8 document, the content that came from the ISO-8859-1 document is not output correctly: special (non-ASCII) characters are garbled.

Based on the libxml2 encodings page ( http://xmlsoft.org/encoding.html ), it seems that libxml2 converts all character encodings to UTF-8 internally. So unless I'm misunderstanding something, the ISO-8859-1 document should be in UTF-8 after parsing.

Is there any reason why this serialization problem should occur if both documents are converted to UTF-8 internally by libxml2? Shouldn't it "just work"? My impression is that you can freely copy cloned node sets between documents, since they are all UTF-8 internally. A careful review of the libxml2 encodings page seems to agree with this, so I'm quite stumped.

Any help on this is appreciated, thank you!

D. Platt
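
For reference, the assumption in the question (that ISO-8859-1 input ends up as UTF-8 inside the tree) can be made concrete with a small stand-alone C sketch. It does not use libxml2, and `latin1_to_utf8` and `looks_like_utf8` are hypothetical helper names written only for illustration. The first shows what the byte-level conversion does; the second is a rough structural check that could be applied to node content to see whether it holds UTF-8 or still holds raw ISO-8859-1 bytes.

```c
#include <stddef.h>

/* Convert ISO-8859-1 bytes to UTF-8.
 * `out` must have room for at least 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the trailing NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII is identical in both encodings. */
            out[n++] = c;
        } else {
            /* 0x80..0xFF map to U+0080..U+00FF: a two-byte UTF-8 sequence. */
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}

/* Rough structural UTF-8 check: every lead byte must be followed by the
 * right number of continuation bytes. It does not reject overlong forms
 * or surrogates, so it is a diagnostic aid, not a full validator.
 * Returns 1 if the buffer looks like UTF-8, 0 otherwise. */
int looks_like_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t extra;
        if (c < 0x80)                extra = 0;
        else if ((c & 0xE0) == 0xC0) extra = 1;
        else if ((c & 0xF0) == 0xE0) extra = 2;
        else if ((c & 0xF8) == 0xF0) extra = 3;
        else return 0;               /* stray continuation or invalid lead byte */

        if (extra > len - i - 1)
            return 0;                /* sequence is cut off at end of buffer */
        for (size_t k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;            /* expected a continuation byte */
        }
        i += extra + 1;
    }
    return 1;
}
```

As an example, the ISO-8859-1 bytes `63 61 66 E9` ("café") become `63 61 66 C3 A9` after conversion; the raw form fails the check and the converted form passes. libxml2 ships its own `xmlCheckUTF8` for this kind of test on `xmlChar` strings, which may be the more convenient way to inspect copied node content.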