Re: Picky Parser and Doc Bugs



On Fri, Jul 21, 2000 at 01:41:38PM +0300, Ali Abdin wrote:
> Okay - while I was debugging gnome-db2html2, I discovered something.
> Apparently the parser (from libxml) is VERY picky :) If you find something
> like this:
> <!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN" [] >
> 
> then please change it to the following:
> <!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN" []>
> 
> (notice the extra space between ']' and '>' at the end)
> 
> If you do not do this then xmlSAXUserParseFile returns an integer less than
> zero (an error) and thus gnome-db2html2 dies/aborts
> 
> DV: Is the parser supposed to be this picky? 


   The parser is picky because it tries to be a conformant parser.
In that case it's the following production from the XML spec:
    http://www.xml.com/axml/target.html#NT-doctypedecl
(for those who don't know, it's the annotated version of the XML
 spec, a must if you're fighting with XML and don't understand why
 and how a given production of the spec doesn't work as you
 would expect).

[28] doctypedecl ::= '<!DOCTYPE' S Name (S ExternalID)? S?
                      ('[' (markupdecl | PEReference | S)* ']' S?)? '>'

 In that case libxml should derive the following:
   '<!DOCTYPE' S Name (S ExternalID)? S? '[' ']' S? '>'

so spaces are allowed between ']' and '>' at the end, and it's a bug! You get a cookie!

------
*** parser.c	2000/07/14 14:11:40	1.208
--- parser.c	2000/07/21 15:07:18
*************** xmlParseInternalSubset(xmlParserCtxtPtr 
*** 7344,7350 ****
  		break;
  	    }
  	}
! 	if (RAW == ']') NEXT;
      }
  
      /*
--- 7344,7353 ----
  		break;
  	    }
  	}
! 	if (RAW == ']') { 
! 	    NEXT;
! 	    SKIP_BLANKS;
! 	}
      }
  
      /*
------

  The fix will be committed to CVS within half an hour.
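
   For those who don't read parser.c every day, here is a minimal
standalone sketch of what the patched tail of the production does:
accept ']', skip optional blanks (S?), then expect '>'. The function
names below are hypothetical and this is not the libxml code itself,
just an illustration of the rule.

```c
#include <assert.h>
#include <stddef.h>

/* XML "S" production: space, tab, carriage return, line feed. */
int is_xml_blank(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/*
 * Hypothetical helper, not a libxml function: given a pointer to the
 * end of an internal subset, accept  ']' S? '>'  from production [28].
 * Returns 1 if the declaration closes properly, 0 otherwise.
 */
int doctype_tail_ok(const char *p)
{
    assert(p != NULL);
    if (*p != ']')
        return 0;
    p++;                        /* plays the role of NEXT */
    while (is_xml_blank(*p))    /* plays the role of SKIP_BLANKS */
        p++;
    return *p == '>';
}
```

With this sketch both "]>" and "] >" are accepted, which is what the
production asks for; without the blank-skipping loop only "]>" would be.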

> DV: I also get the following: XML-CRITICAL **: SystemLiteral " or ' expected -
> I have no idea why this happens - can you fill us in?

   Yes, in that case I guess it's due to the fact that if you
do use a PUBLIC declaration, you also have to provide a URL for the
resource:

[75] ExternalID ::= 'SYSTEM' S SystemLiteral | 
                    'PUBLIC' S PubidLiteral S SystemLiteral 

  That's one of the differences between XML and SGML: if you reference
an external DTD you have to provide a SystemLiteral pointing to
your DTD. I don't know its value for the XML DocBook version, but you should
have something like:

<!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN" 
                      "http://www.docbook.org/png/1.0/docbook.dtd" [] >

Also, if you don't need to include anything in the internal subset, just drop the
square brackets:

<!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN" 
                      "http://www.docbook.org/png/1.0/docbook.dtd">

  This looks like the right way (but fix the URL for docbook).
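
  To make production [75] concrete, here is a rough standalone sketch
of a checker for ExternalID. It only verifies the overall shape (keyword,
mandatory blanks, quoted literals) and does not validate the characters
allowed inside a PubidLiteral. All function names are hypothetical; this
is not how libxml implements xmlParseExternalID.

```c
#include <assert.h>
#include <string.h>

/* Skip XML blanks (space, tab, CR, LF); return the new position. */
const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
        p++;
    return p;
}

/* Skip a "..." or '...' literal; return the position just past the
 * closing quote, or NULL if no well-formed literal starts at p. */
const char *skip_literal(const char *p)
{
    char quote = *p;
    if (quote != '"' && quote != '\'')
        return NULL;
    p++;
    while (*p != '\0' && *p != quote)
        p++;
    return (*p == quote) ? p + 1 : NULL;
}

/* Hypothetical checker for production [75]:
 *   'SYSTEM' S SystemLiteral | 'PUBLIC' S PubidLiteral S SystemLiteral
 * Returns 1 if p starts with a well-shaped ExternalID, 0 otherwise. */
int external_id_ok(const char *p)
{
    int is_public;
    const char *q;

    assert(p != NULL);
    if (strncmp(p, "PUBLIC", 6) == 0)
        is_public = 1;
    else if (strncmp(p, "SYSTEM", 6) == 0)
        is_public = 0;
    else
        return 0;
    p += 6;

    q = skip_blanks(p);          /* S is mandatory after the keyword */
    if (q == p)
        return 0;
    p = q;

    if (is_public) {
        p = skip_literal(p);     /* PubidLiteral */
        if (p == NULL)
            return 0;
        q = skip_blanks(p);      /* S is mandatory before SystemLiteral */
        if (q == p)
            return 0;
        p = q;
    }
    return skip_literal(p) != NULL;   /* SystemLiteral, the " or ' expected */
}
```

A PUBLIC identifier followed directly by "[]" fails at the last step,
which corresponds to the 'SystemLiteral " or \' expected' message above.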


> Also - for all documentation I get the following:
> XML-CRITICAL **: xmlParseExternalID: PUBLIC, no URI
> (note: this is a non-fatal error - libxml gracefully handles it)
> 
> Should we have the DTD's online with a URI pointing to them? I think we're
> supposed to (as specified in the XML specifications) - so is it valid
> Docbook/XML? Shall we add it to the GDP webpage and update the docs?

  Cf. my answer above: a PUBLIC identifier must be followed by a SystemLiteral.

> Another point - XML parser also complains if you have typo mismatches between
> tags - i.e. I found in my doc (glife.sgml) that I opened with <Book> and
> closed with </book>. So if you find something like that please correct it
> (Note: this is a non-fatal-error - libxml can gracefully handle it)

   Huh, it should not... I'm surprised! XML names are case-sensitive,
so this is a fatal error.

~/XML -> cat pubid.xml 
<!DOCTYPE Book PUBLIC "-//GNOME//DTD DocBook PNG Variant V1.0//EN" 
                      "http://www.docbook.org/png/1.0/docbook.dtd" [] >
<Book>
</book>
~/XML -> ./xmllint pubid.xml 
pubid.xml:4: error: Opening and ending tag mismatch: Book and book
</book>
      ^
~/XML -> 
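
   The reason is simply that XML names are compared exactly, case
included. A tiny sketch of that comparison, with a hypothetical helper
name (libxml does its own checking internally, this is only an
illustration):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not a libxml function: XML names are compared
 * byte for byte, so "Book" and "book" are different element names. */
int tag_names_match(const char *open_name, const char *close_name)
{
    assert(open_name != NULL && close_name != NULL);
    return strcmp(open_name, close_name) == 0;
}
```

Here tag_names_match("Book", "book") returns 0, hence the mismatch error.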

> I found these bugs by turning on 'ERROR_OUTPUT' in gnome-db2html2 :) 
> 
> Uggh - also it appears that gnome-db2html2 does not support the 'menuchoice'
> or 'guisubmenu' tag - Uggh. I'll look into it.

   thanks for the report,

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe



