"Re: [xml] DTD - external subset - encoding"
- From: Kasimier Buchcik <kbuchcik 4commerce de>
- To: <xml gnome org>
- Subject: "Re: [xml] DTD - external subset - encoding"
- Date: Mon, 23 Feb 2004 22:11:33 +0100
Hi,
on 2/23/2004 9:54 PM Daniel Veillard wrote:
> On Mon, Feb 23, 2004 at 08:21:35PM +0100, Kasimier Buchcik wrote:
>
>>Hihi,
>>
>>on 2/23/2004 8:08 PM Kasimier Buchcik wrote:
>>
>>
>>>Hi,
>>>
>>>on 2/23/2004 1:44 PM Kasimier Buchcik wrote:
>>>
>>>
>>>
>>>>Hi,
>>>>
>>>>I have an XML document that references an external subset. Both are
>>>>encoded in UTF-16. xmllint seems to choke on the external subset file.
>>>>I haven't found anything about encoding problems with external subsets
>>>>in the mail archives or the bug list.
>>>>
>>>>
>>>>C:\dev\libxml2\lib\xml-2-6-6-xslt-1-1-2>xmllint parament_test.xml
>>>>--valid --noent
>>>>parament_test.dtd:1: parser error : internal error
>>>>?<
>>>>^
>>>>parament_test.dtd:1: parser error : DOCTYPE improperly terminated
>>>>?<
>>>>^
>>>>parament_test.dtd:1: parser error : Input is not proper UTF-8, indicate
>>>>encoding !
>>>>?<
>>>>^
>>>>parament_test.dtd:1: error: Bytes: 0xFF 0xFE 0x3C 0x00
>>>>?<
>>>>^
>>>>parament_test.dtd:1: parser error : Start tag expected, '<' not found
>>>>?<
>>>> ^
>>>>
>>>>C:\dev\libxml2\lib\xml-2-6-6-xslt-1-1-2>xmllint --version
>>>>xmllint: using libxml version 20606
>>>> compiled with: DTDValid FTP HTTP HTML C14N Catalog XPath XPointer
>>>>XInclude Iconv Unicode Regexps Automata Schemas
>>>>
>>>>I'm working on a Windows 2000 (w2k) machine.
>>>>I have attached the test files I used.
>>>
>>>
>>>After some debugging I ended up in "xmlParserHandlePEReference", where
>>>the encoding could be switched; but the test fails because
>>>"entity->length" seems to be zero. I would also have expected the "input"
>>>to be tested for length >= 4 rather than the "entity", but that length is
>>>zero as well. Since I don't know which length to use or how to fix this,
>>>I can only point it out:
>>>
>>>
>>>parser.c (xmlParserHandlePEReference)
>>>
>>>/*
>>> * Get the 4 first bytes and decode the charset
>>> * if enc != XML_CHAR_ENCODING_NONE
>>> * plug some encoding conversion routines.
>>> */
>>> GROW
>>> if (entity->length >= 4) {    <<<----- HERE
>>>     start[0] = RAW;
>>>     start[1] = NXT(1);
>>>     start[2] = NXT(2);
>>>     start[3] = NXT(3);
>>>     enc = xmlDetectCharEncoding(start, 4);
>>>     if (enc != XML_CHAR_ENCODING_NONE) {
>>>         xmlSwitchEncoding(ctxt, enc);
>>>     }
>>> }
>>>
>>
>>Maybe the line should read "if ((input->end - input->cur) >= 4)"?
>>Actually, "input->length" does not seem to be computed.
>
>
> Looks like a bug. Can you file it in Bugzilla so it doesn't get lost?
Done. Bug #135229.
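
For reference, here is a minimal, self-contained sketch of the check
suggested above: test how many bytes are left in the input buffer
(end - cur) rather than a length field that may never have been set, and
only then look at the first four bytes. It deliberately does not use
libxml2. The names "input_window", "sketch_encoding", "bytes_available"
and "sniff_encoding" are made up for illustration, and the detection is a
simplified stand-in for what xmlDetectCharEncoding does, not a copy of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a parser input window; NOT libxml2's xmlParserInput. */
typedef struct {
    const unsigned char *cur; /* next unread byte */
    const unsigned char *end; /* one past the last buffered byte */
} input_window;

/* Illustrative result codes; libxml2 has its own xmlCharEncoding enum. */
typedef enum { ENC_NONE, ENC_UTF8, ENC_UTF16LE, ENC_UTF16BE } sketch_encoding;

/* Bytes that are buffered but not yet consumed. */
static size_t bytes_available(const input_window *in)
{
    assert(in != NULL && in->cur <= in->end);
    return (size_t)(in->end - in->cur);
}

/* Simplified sniffing: BOMs and "<?" in UTF-16 only, no UCS-4 or EBCDIC. */
static sketch_encoding sniff_encoding(const input_window *in)
{
    const unsigned char *p = in->cur;

    /* Guard on the remaining buffer, not on a separately stored length. */
    if (bytes_available(in) < 4)
        return ENC_NONE;
    if (p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return ENC_UTF8;      /* UTF-8 BOM */
    if (p[0] == 0xFF && p[1] == 0xFE)
        return ENC_UTF16LE;   /* UTF-16 little-endian BOM */
    if (p[0] == 0xFE && p[1] == 0xFF)
        return ENC_UTF16BE;   /* UTF-16 big-endian BOM */
    if (p[0] == 0x3C && p[1] == 0x00 && p[2] == 0x3F && p[3] == 0x00)
        return ENC_UTF16LE;   /* "<?" without a BOM, little-endian */
    if (p[0] == 0x00 && p[1] == 0x3C && p[2] == 0x00 && p[3] == 0x3F)
        return ENC_UTF16BE;   /* "<?" without a BOM, big-endian */
    return ENC_NONE;
}
```

Fed the four bytes from the error output (0xFF 0xFE 0x3C 0x00), this
sketch reports UTF-16 little-endian; with fewer than four bytes buffered
it reports nothing and leaves the encoding alone.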
Greetings,
Kasimier