[I am posting to the xml list rather than following up on the xslt one, as this is a libxml2 issue]

On 03/07/01 01:55:48, Raphael Hertzog wrote:
Strangely, the character '©' causes a problem when it is included in an XML file via XInclude, even though it is a perfectly legal ISO-8859-1 character.
More generally, the problem is encoding handling in «parse="text"» included entities.
In the meantime, I'll avoid © in the included files.
Just encode them as UTF-8 (as it is the internal encoding of libxml), or use XML files and XPointer...
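For what it's worth, here is a minimal sketch of what "encode them as UTF-8" means for a character such as '©'. The helper name latin1_to_utf8 is hypothetical (it is not part of libxml2), and the code assumes the input really is ISO-8859-1, where every byte maps directly to the Unicode code point of the same value:

```c
#include <stddef.h>

/*
 * Convert ISO-8859-1 bytes to UTF-8.
 * Hypothetical helper for illustration only, not a libxml2 function.
 * 'out' must have room for 2 * inlen + 1 bytes.
 * Returns the number of UTF-8 bytes written, not counting the
 * terminating NUL.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t inlen,
                      unsigned char *out)
{
    size_t o = 0;

    for (size_t i = 0; i < inlen; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            /* Plain ASCII is identical in both encodings. */
            out[o++] = c;
        } else {
            /* 0x80..0xFF becomes a two-byte UTF-8 sequence. */
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

With this, the single ISO-8859-1 byte 0xA9 ('©') becomes the two bytes 0xC2 0xA9, so a file holding '©' followed by a newline grows from 2 bytes to 3.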
The problem seems to be that when including a file with XInclude, libxml2 makes an assumption about the encoding (does it ignore the encoding attribute?). It looks like the issue is known (excerpt from xmlXIncludeLoadTxt in xinclude.c):

    /*
     * Load it.
     * Issue 62: how to detect the encoding
     */
    [...]
    /*
     * TODO: if the encoding issue is solved, scan UTF8 chars instead
     */
As a suggestion for solving the issue: what about adding an «xmlCharEncoding enc» argument to xmlXIncludeLoadTxt, based on the encoding attribute of the xi:include element? This value is computed with xmlParseCharEncoding(). The attached patch does this.

Unfortunately, this doesn't solve the problem: xmlParserInputBufferRead in xmlXIncludeLoadTxt returns 0 even though it has actually read 2 chars (compiled with -DDEBUG_INPUT -DDEBUG_ENCODING -DDEBUG_XINCLUDE):

    ptittom:~/gnome-xml$ ./xmllint --xinclude ../test.xml 2>&1
    ...
    Found registered handler for encoding ISO-8859-1
    converted 2 bytes to 3 bytes of input
    I/O: read 0 chars, buffer 3/4000
    closed the encoding handler
    ...

As I'm really not used to the xmlIO code, I can't do any further debugging...

Just a little note about the patch: I used a cast from xmlChar to const char for the encoding value, as it is supposed to contain only plain ASCII chars; maybe I should have used UTF8Toisolat1() or something like that (UTF8Toascii is declared static)...

HTH,
Tom.
Attachment:
xinclude-encoding.diff
Description: Text document