Re: "Re: [xml] DTD - external subset - encoding"
- From: Daniel Veillard <veillard redhat com>
- To: Kasimier Buchcik <kbuchcik 4commerce de>
- Cc: xml gnome org
- Subject: Re: "Re: [xml] DTD - external subset - encoding"
- Date: Mon, 23 Feb 2004 15:54:01 -0500
On Mon, Feb 23, 2004 at 08:21:35PM +0100, Kasimier Buchcik wrote:
> Hihi,
>
> on 2/23/2004 8:08 PM Kasimier Buchcik wrote:
>
> > Hi,
> >
> > on 2/23/2004 1:44 PM Kasimier Buchcik wrote:
> >
> >
> >>Hi,
> >>
> >>I have an XML document that references an external subset. Both are
> >>encoded in UTF-16. Xmllint seems to choke on the external subset file.
> >>I haven't found anything about encoding problems with external subsets
> >>in the mail archives & bug list.
> >>
> >>
> >>C:\dev\libxml2\lib\xml-2-6-6-xslt-1-1-2>xmllint parament_test.xml
> >>--valid --noent
> >>parament_test.dtd:1: parser error : internal error
> >> ?<
> >>^
> >>parament_test.dtd:1: parser error : DOCTYPE improperly terminated
> >> ?<
> >>^
> >>parament_test.dtd:1: parser error : Input is not proper UTF-8, indicate
> >>encoding !
> >> ?<
> >>^
> >>parament_test.dtd:1: error: Bytes: 0xFF 0xFE 0x3C 0x00
> >> ?<
> >>^
> >>parament_test.dtd:1: parser error : Start tag expected, '<' not found
> >> ?<
> >> ^
> >>
> >>C:\dev\libxml2\lib\xml-2-6-6-xslt-1-1-2>xmllint --version
> >>xmllint: using libxml version 20606
> >> compiled with: DTDValid FTP HTTP HTML C14N Catalog XPath XPointer
> >>XInclude Iconv Unicode Regexps Automata Schemas
> >>
> >>I'm working on a w2k machine.
> >>I have attached the test used.
> >
> >
> > After some debugging I ended up in "xmlParserHandlePEReference", where a
> > switch of encoding could be done; but the test fails, since
> > "entity->length" seems to be zero. I would also have expected the "input"
> > to be tested for length >= 4 and not the "entity" - but that one is zero
> > as well. Since I don't know what length to use or how to fix this, I'm
> > only able to point this out:
> >
> >
> > parser.c (xmlParserHandlePEReference)
> >
> > /*
> >  * Get the 4 first bytes and decode the charset
> >  * if enc != XML_CHAR_ENCODING_NONE
> >  * plug some encoding conversion routines.
> >  */
> > GROW
> > if (entity->length >= 4) {    /* <<<----- HERE */
> >     start[0] = RAW;
> >     start[1] = NXT(1);
> >     start[2] = NXT(2);
> >     start[3] = NXT(3);
> >     enc = xmlDetectCharEncoding(start, 4);
> >     if (enc != XML_CHAR_ENCODING_NONE) {
> >         xmlSwitchEncoding(ctxt, enc);
> >     }
> > }
> >
>
> Maybe the line should read "if ((input->end - input->cur) >= 4)"?
> Actually "input->length" does not seem to be computed.
Looks like a bug. Can you file it in Bugzilla so it doesn't get lost?
Thanks,
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
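
The check suggested in the quoted message (test how many bytes are actually buffered, rather than a length field that may never have been filled in) can be sketched in isolation. This is only an illustration and not libxml2 code: `InputSketch`, `EncSketch` and `detect_encoding_sketch` are hypothetical names, and the detection is reduced to the UTF-8 and UTF-16 byte-order marks, which is much less than xmlDetectCharEncoding handles.

```c
#include <stddef.h>

/* Hypothetical stand-in for a parser input buffer:
 * the unread data lies between cur (inclusive) and end (exclusive). */
typedef struct {
    const unsigned char *cur;
    const unsigned char *end;
} InputSketch;

/* Hypothetical, reduced set of encodings for this sketch. */
typedef enum { ENC_NONE, ENC_UTF8, ENC_UTF16LE, ENC_UTF16BE } EncSketch;

/* Look at the first 4 buffered bytes, but only if 4 bytes are really
 * available. The availability test uses (end - cur), as proposed in the
 * thread, instead of a separately stored length.
 * Simplification: only byte-order marks are recognised; UTF-32 and
 * BOM-less detection are left out. */
EncSketch detect_encoding_sketch(const InputSketch *in)
{
    const unsigned char *p;

    if (in == NULL || in->cur == NULL || in->end == NULL)
        return ENC_NONE;
    if ((in->end - in->cur) < 4)
        return ENC_NONE;

    p = in->cur;
    if (p[0] == 0xFF && p[1] == 0xFE)
        return ENC_UTF16LE;
    if (p[0] == 0xFE && p[1] == 0xFF)
        return ENC_UTF16BE;
    if (p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return ENC_UTF8;
    return ENC_NONE;
}
```

With the four bytes reported in the error output (0xFF 0xFE 0x3C 0x00), this sketch reports little-endian UTF-16. A buffer holding fewer than four bytes yields ENC_NONE, so the caller would leave the encoding unchanged.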