Re: Picky Parser and Doc Bugs
- From: Daniel Veillard <Daniel Veillard w3 org>
- To: gnome-doc-list gnome org, Daniel Veillard w3 org
- Subject: Re: Picky Parser and Doc Bugs
- Date: Fri, 21 Jul 2000 17:08:54 +0200
On Fri, Jul 21, 2000 at 01:41:38PM +0300, Ali Abdin wrote:
> Okay - while I was debugging gnome-db2html2, I discovered something.
> Apparently the parser (from libxml) is VERY picky :) If you find something
> like this:
> <!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN" [] >
>
> then please change it to the following:
> <!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN" []>
>
> (notice the extra space between ']' and '>' at the end)
>
> If you do not do this then xmlSAXUserParseFile returns an integer less than
> zero (an error) and thus gnome-db2html2 dies/aborts
>
> DV: Is the parser supposed to be this picky?
The parser is picky because it tries to be a conformant parser.
In that case it's the following production from the XML spec:
http://www.xml.com/axml/target.html#NT-doctypedecl
(for those who don't know, it's the annotated version of the XML
spec, a must if you're fighting with XML and don't understand why
and how a given production of the spec doesn't work as you
would expect).
[28] doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S?
('[' (markupdecl | PEReference | S)* ']' S?)? '>'
In that case libxml should derive the following:
'<!DOCTYPE' S Name (S ExternalID)? S? '[' ']' S? '>'
so spaces are allowed between ']' and '>', and it's a bug! You get a cookie!
------
*** parser.c 2000/07/14 14:11:40 1.208
--- parser.c 2000/07/21 15:07:18
*************** xmlParseInternalSubset(xmlParserCtxtPtr
*** 7344,7350 ****
break;
}
}
! if (RAW == ']') NEXT;
}
/*
--- 7344,7353 ----
break;
}
}
! if (RAW == ']') {
! NEXT;
! SKIP_BLANKS;
! }
}
/*
------
This will be committed to CVS within half an hour.
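
For readers who want to see the rule in isolation, here is a minimal
standalone sketch of the tail of production [28], i.e. ']' S? '>'.
It is not the libxml code: `is_xml_space` and `subset_tail_ok` are
made-up names, and the input is assumed to be a NUL-terminated string
that starts at the closing ']' of the internal subset.

```c
#include <assert.h>
#include <stddef.h>

/* S ::= (#x20 | #x9 | #xD | #xA)+   (XML 1.0, production [3]) */
int is_xml_space(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/*
 * Hypothetical helper, not part of libxml.
 * Returns 1 if `s` begins with  ']' S? '>'  and 0 otherwise.
 */
int subset_tail_ok(const char *s)
{
    assert(s != NULL);

    if (*s != ']')
        return 0;
    s++;                        /* plays the role of NEXT in the patch */

    while (is_xml_space(*s))    /* plays the role of SKIP_BLANKS */
        s++;

    return *s == '>';
}
```

With this, both "]>" and "] >" are accepted, which is what the
production asks for.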
> DV: I also get the following: XML-CRITICAL **: SystemLiteral " or ' expected -
> I have no idea why this happens - can you fill us in?
Yes, in that case I guess it's because if you use a PUBLIC
declaration, you also have to provide a URL for the
resource:
[75] ExternalID ::= 'SYSTEM' S SystemLiteral |
'PUBLIC' S PubidLiteral S SystemLiteral
That's one of the differences between XML and SGML: if you reference
an external DTD you have to provide a SystemLiteral pointing to
your DTD. I don't know its value for the XML DocBook version; you should
have something like:
<!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN"
"http://www.docbook.org/png/1.0/docbook.dtd" [] >
Also, if you don't need to put anything in the internal subset, just drop the
square brackets:
<!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN"
"http://www.docbook.org/png/1.0/docbook.dtd">
This looks like the right way (but fix the URL for docbook).
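
To make production [75] concrete, here is a rough standalone sketch
that checks whether a string starts with a complete ExternalID. It is
only an illustration under simplifying assumptions: the `ext_*` and
`external_id_end` names are invented, literals are treated as any
quoted string (the PubidChar restrictions are not checked), and none
of this is the libxml implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

int ext_is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Skip one or more S characters; NULL if there is none (S is mandatory). */
const char *ext_skip_s(const char *p)
{
    if (!ext_is_space(*p))
        return NULL;
    while (ext_is_space(*p))
        p++;
    return p;
}

/* Skip a "..." or '...' literal; NULL if it is missing or unterminated. */
const char *ext_skip_literal(const char *p)
{
    char quote = *p;
    const char *end;

    if (quote != '"' && quote != '\'')
        return NULL;            /* the "SystemLiteral \" or ' expected" case */
    end = strchr(p + 1, quote);
    return end ? end + 1 : NULL;
}

/*
 * Hypothetical helper: returns a pointer just past the ExternalID,
 * or NULL if `p` does not start with one.
 */
const char *external_id_end(const char *p)
{
    assert(p != NULL);

    if (strncmp(p, "SYSTEM", 6) == 0) {
        p = ext_skip_s(p + 6);
        return p ? ext_skip_literal(p) : NULL;
    }
    if (strncmp(p, "PUBLIC", 6) == 0) {
        p = ext_skip_s(p + 6);
        if (p) p = ext_skip_literal(p);          /* PubidLiteral */
        if (p) p = ext_skip_s(p);                /* S is required here */
        return p ? ext_skip_literal(p) : NULL;   /* SystemLiteral */
    }
    return NULL;
}
```

A PUBLIC identifier followed directly by `[]` or by `>` makes
`external_id_end` return NULL, which mirrors the "SystemLiteral
expected" complaint quoted above.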
> Also - for all documentation I get the following:
> XML-CRITICAL **: xmlParseExternalID: PUBLIC, no URI
> (note: this is a non-fatal error - libxml gracefully handles it)
>
> Should we have the DTD's online with a URI pointing to them? I think we're
> supposed to (as specified in the XML specifications) - so is it valid
> Docbook/XML? Shall we add it to the GDP webpage and update the docs?
Cf. the previous answer: a PUBLIC identifier needs a SystemLiteral after it.
> Another point - XML parser also complains if you have typo mismatches between
> tags - i.e. I found in my doc (glife.sgml) that I opened with <Book> and
> closed with </book>. So if you find something like that please correct it
> (Note: this is a non-fatal-error - libxml can gracefully handle it)
Huh, it should not... I'm surprised! This is a fatal error.
~/XML -> cat pubid.xml
<!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN"
"http://www.docbook.org/png/1.0/docbook.dtd" [] >
<Book>
</book>
~/XML -> ./xmllint pubid.xml
pubid.xml:4: error: Opening and ending tag mismatch: Book and book
</book>
^
~/XML ->
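
The reason is that XML names are case-sensitive, so the end tag has to
repeat the start tag's name byte for byte. A tiny sketch of that
comparison (the function name is made up for illustration and is not
a libxml API):

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the end-tag name matches the
 * start-tag name exactly (case-sensitive), 0 otherwise.
 */
int tag_names_match(const char *open_name, const char *close_name)
{
    assert(open_name != NULL && close_name != NULL);
    return strcmp(open_name, close_name) == 0;
}
```

So `tag_names_match("Book", "book")` is 0, while
`tag_names_match("Book", "Book")` is 1.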
> I found these bugs by turning on 'ERROR_OUTPUT' in gnome-db2html2 :)
>
> Uggh - also it appears that gnome-db2html2 does not support the 'menuchoice'
> or 'guisubmenu' tag - Uggh. I'll look into it.
thanks for the report,
Daniel
--
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes | Today's Bookmarks :
Tel : +33 476 615 257 | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207 | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
http://www.w3.org/People/all#veillard%40w3.org | RPM badminton Kaffe