Re: [xslt] Bug: standalone attribute not recognized in XML declaration



On Tue, Aug 25, 2009 at 06:00:23PM +0200, Michael Ludwig wrote:
> Alexander Pohoyda schrieb:
>
>> This perfectly correct XML file (according to xmllint):
>>
>>   <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
>>   <!DOCTYPE a PUBLIC "-//A//EN" "">
>>   <a />
>>
>> Cannot be processed by xsltproc with this output:
>>
>>   a.xml:1: parser error : parsing XML declaration: '?>' expected
>>    standalone="yes"?>
>>    ^
>
> I don't know whether or not this is a bug, but I can confirm that there
> is an error with the current version of LibXML2 and LibXSLT.
>
> milu colinux:~ > xsltproc --version
> Using libxml 20703, libxslt 10124 and libexslt 813
> xsltproc was compiled against libxml 20703, libxslt 10124 and libexslt 813
> libxslt 10124 was compiled against libxml 20703
> libexslt 813 was compiled against libxml 20703
>
> milu colinux:~ > xsltproc Werkstatt/xsl/identity.xsl /tmp/a.xml
> /tmp/a.xml:1: parser error : parsing XML declaration: '?>' expected
>  standalone="yes"?>
>  ^
> /tmp/a.xml:2: parser error : Content error in the external subset
> <!DOCTYPE a PUBLIC "-//A//EN" "">
> ^
> /tmp/a.xml:2: parser error : Content error in the external subset
> <!DOCTYPE a PUBLIC "-//A//EN" "">
> ^
> unable to parse /tmp/a.xml
>
>> If I remove the DOCTYPE line, no error happens.
>
> Same here.

  Haha, I nearly got fooled. You know what, it's perfectly normal.
Hint: try xmllint --valid tst.xml

  The DOCTYPE states: <!DOCTYPE a PUBLIC "-//A//EN" "">

it uses "" as the URI reference for the system identifier. Take the
good old RFC describing URIs and you will find that "" as a
URI reference means the current document.
So when one asks to load the DTD, which is the case for XSLT processing,
libxml2 loads that document as the DTD.
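
To make the URI resolution step concrete, here is a minimal sketch in Python using the standard library's urljoin. It is only an illustration of how an empty relative reference resolves against a base URI, not what libxml2 does internally; the file:///tmp/a.xml location and the helper name resolve_system_id are assumptions made for the example.

```python
from urllib.parse import urljoin

# Hypothetical location of the document from the report.
BASE = "file:///tmp/a.xml"


def resolve_system_id(base_uri, system_id):
    """Resolve a DOCTYPE system identifier against the document's base URI."""
    return urljoin(base_uri, system_id)


# An empty reference resolves to the document itself.
print(resolve_system_id(BASE, ""))
# A non-empty relative reference resolves to a sibling resource.
print(resolve_system_id(BASE, "a.dtd"))
```

With an empty system identifier the resolved location is the XML file itself, which is why the document ends up being read as its own external subset.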

  http://www.w3.org/TR/REC-xml/#dt-doctype

  Well-formedness constraint: External Subset
  The external subset, if any, MUST match the production for extSubset.

  [30]      extSubset      ::=       TextDecl? extSubsetDecl

  [77]      TextDecl       ::=      '<?xml' VersionInfo? EncodingDecl S? '?>'

et voila, libxml2 reports a fatal error when parsing the document as its
own DTD as the standalone information is forbidden there.
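
The difference between the two productions can be sketched with a pair of simplified regular expressions in Python. These patterns are an approximation written for this illustration (they are not taken from libxml2, and the names XML_DECL and TEXT_DECL are made up here): one follows the XMLDecl production, which allows standalone, the other follows TextDecl [77], which does not.

```python
import re

S = r"[ \t\r\n]+"              # mandatory white space
OPT_S = r"[ \t\r\n]*"          # optional white space
EQ = OPT_S + "=" + OPT_S

VERSION = S + "version" + EQ + r"""(?:"1\.[0-9]+"|'1\.[0-9]+')"""
ENC_NAME = r"[A-Za-z][A-Za-z0-9._-]*"
ENCODING = S + "encoding" + EQ + "(?:\"" + ENC_NAME + "\"|'" + ENC_NAME + "')"
STANDALONE = S + "standalone" + EQ + r"""(?:"(?:yes|no)"|'(?:yes|no)')"""

# XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
XML_DECL = re.compile(
    r"<\?xml" + VERSION + "(?:" + ENCODING + ")?(?:" + STANDALONE + ")?"
    + OPT_S + r"\?>"
)
# TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
TEXT_DECL = re.compile(
    r"<\?xml" + "(?:" + VERSION + ")?" + ENCODING + OPT_S + r"\?>"
)

decl = '<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>'

# Valid at the top of a document entity...
print(bool(XML_DECL.match(decl)))
# ...but not at the top of an external subset.
print(bool(TEXT_DECL.match(decl)))
```

The same first line is accepted as an XML declaration and rejected as a text declaration, which matches the "'?>' expected" error reported right before standalone="yes".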

  If you remove the DOCTYPE, of course the document becomes well-formed.

It's one case of a well-formedness error that non-validating parsers can't
catch because it affects the external subset, which they don't have to
load.
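
As an illustration of a parser that does not load the external subset, the sketch below feeds the same document to Python's xml.etree.ElementTree, which sits on top of expat and, as far as I know, does not fetch external DTDs by default. This is an assumption about that particular parser's default behaviour, shown only to contrast with the xsltproc run above.

```python
import xml.etree.ElementTree as ET

# The document from the original report, as bytes so that the
# encoding declaration is honoured by the parser.
DOC = (
    b'<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>\n'
    b'<!DOCTYPE a PUBLIC "-//A//EN" "">\n'
    b'<a />\n'
)

# The external subset is never loaded here, so the problem that
# only shows up inside it goes unnoticed and parsing succeeds.
root = ET.fromstring(DOC)
print(root.tag)
```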

  NOTABUG !

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/

