[xml] Re: Output not going to the end ?




[I'm posting to the xml list rather than following up on the xslt one, as
this is a libxml2 issue]

On 03/07/01 01:55:48, Raphael Hertzog wrote:
Strangely, the character '©' causes a problem when it's included
in an XML file via XInclude, even though it's a perfectly legal ISO-8859-1
character.

More generally, the problem is encoding handling in entities included
with «parse="text"».

In the meantime, I'll avoid © in the included files.

Just encode them as UTF-8 (as it is the internal encoding of libxml), or
use XML files and XPointer...

The problem seems to be that when including a file with XInclude, it
makes an assumption about the encoding (does it ignore the encoding
attribute?). It looks like the issue is known (excerpt from
xmlXIncludeLoadTxt in xinclude.c):
    /*
     * Load it.
     * Issue 62: how to detect the encoding
     */
[...]
            /*
             * TODO: if the encoding issue is solved, scan UTF8 chars instead
             */

As a suggestion for solving the issue: what about adding an
«xmlCharEncoding enc» argument to xmlXIncludeLoadTxt, based on the encoding
attribute of the xi:include element? The value can be computed with
xmlParseCharEncoding().

The attached patch does this. Unfortunately, it doesn't solve the
problem: xmlParserInputBufferRead in xmlXIncludeLoadTxt returns 0 although
it has actually read 2 chars (compiled with -DDEBUG_INPUT -DDEBUG_ENCODING
-DDEBUG_XINCLUDE):
ptittom:~/gnome-xml$ ./xmllint --xinclude ../test.xml 2>&1
...
Found registered handler for encoding ISO-8859-1
converted 2 bytes to 3 bytes of input
I/O: read 0 chars, buffer 3/4000
closed the encoding handler
...

As I'm not really familiar with the xmlIO code, I can't debug this any
further...

Just a little note about the patch: I used a cast from xmlChar * to const
char * for the encoding value, as it is supposed to contain only plain ASCII
chars; maybe I should have used UTF8Toisolat1() or something like that
(UTF8Toascii is declared static)...

HTH

Tom.

Attachment: xinclude-encoding.diff
Description: Text document


