Re: [xml] DTDs and null SystemID/ExternalID ?



On Fri, Jul 06, 2007 at 10:40:39PM -0400, Stefan Jeglinski wrote:
Setup for the question: it appears that there are 3 basic DOCTYPES 
for an xml document:

1:
<!DOCTYPE root_name SYSTEM "local_path_to_dtd">
<
...
(xml content)
...


  No, it's not a local path, it's a URI reference: the system identifier.

2:
<!DOCTYPE root_name PUBLIC "url_to_dtd">
<
...
(xml content)
...


  No, that's forbidden: with PUBLIC you MUST have a public identifier,
which is an identifier placed before the system identifier itself.
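
A minimal sketch of the two external-ID forms, using Python's standard
xml.dom.minidom (an expat-based parser, not libxml2) purely to illustrate
the grammar in the spec. The document strings, the example.org URI, the
public identifier and the helper names are made up for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# SYSTEM takes one literal: the system identifier (a URI reference).
SYSTEM_DOC = '<!DOCTYPE note SYSTEM "note.dtd"><note/>'

# PUBLIC takes two literals: the public identifier, then the system identifier.
PUBLIC_DOC = ('<!DOCTYPE note PUBLIC "-//Example//DTD Note 1.0//EN" '
              '"http://example.org/note.dtd"><note/>')

# PUBLIC followed by a single literal, as in form 2 of the question.
PUBLIC_ONLY_DOC = '<!DOCTYPE note PUBLIC "http://example.org/note.dtd"><note/>'


def external_ids(xml_text):
    """Return (publicId, systemId) taken from the document's DOCTYPE."""
    doctype = minidom.parseString(xml_text).doctype
    return doctype.publicId, doctype.systemId


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        minidom.parseString(xml_text)
        return True
    except ExpatError:
        return False
```

Here external_ids(SYSTEM_DOC) yields no public identifier, while
is_well_formed(PUBLIC_ONLY_DOC) is false because the system literal that
must follow the public identifier is missing.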

3:
<!DOCTYPE root_name [
...
(element definitions etc)
...
]>
<
...
(xml content)
...


  Yes, that's legal; it's called an internal subset. And no, it's not
a separate third case, since you can combine SYSTEM or PUBLIC identifiers
and extend their definitions with an internal subset too.
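
As a rough illustration of that point, the sketch below parses one document
with only an internal subset and one that combines a SYSTEM identifier with
an internal subset. It again uses Python's xml.dom.minidom rather than
libxml2; the document contents and the describe() helper are assumptions
made for the example.

```python
from xml.dom import minidom

# DOCTYPE with an internal subset only: no external identifier at all.
INTERNAL_ONLY = """<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
]>
<note>hello</note>"""

# DOCTYPE with a SYSTEM identifier AND an internal subset extending it.
COMBINED = """<!DOCTYPE note SYSTEM "note.dtd" [
  <!ATTLIST note lang CDATA #IMPLIED>
]>
<note>hello</note>"""


def describe(xml_text):
    """Summarize the identifiers and internal subset of the DOCTYPE."""
    doctype = minidom.parseString(xml_text).doctype
    return {
        "public": doctype.publicId,
        "system": doctype.systemId,
        "internal": doctype.internalSubset,
    }


def reserialize(xml_text):
    """Parse the text and write the resulting DOM back out as a string."""
    return minidom.parseString(xml_text).toxml()
```

For INTERNAL_ONLY both identifiers come back as None and the re-serialized
text carries the bracketed subset with no SYSTEM or PUBLIC keyword; for
COMBINED the system identifier and the internal subset are both present.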

  The spec is here:
    http://www.w3.org/TR/REC-xml/#NT-doctypedecl
Please don't paraphrase, especially when you get it wrong. Take the time
to read the spec; it will help you be understood and avoid this kind of
error.


I'm trying to do the 3rd one, where the dtd is self-contained within 
the xml file. I am doing it in memory, using 
xmlParserInputBufferCreate etc. I create a doc using xmlNewDoc, then 
the dtd from the buffer using xmlIOParseDTD, and then blow it out to 
disk with xmlSaveFormatFile. Works like a champ, except that...

  The ParseDTD functions are there to load an external subset, not to
tweak things like that. What you're attempting may work, though, if you
understand the internal representation of libxml2 and are careful.

I've boxed myself into a corner. Snooping the libxml2 source 
(xmlsave.c, xmlDtdDumpOutput), I see that I can create the 3rd type 
only if SystemID and ExternalID are NULL.

  Sorry, that premise sounds wrong to me. Please explain! SystemID
and ExternalID where?

But this is directly at 
odds with xmlIOParseDTD, which ultimately allocates both and fills 
them in with "none".

  I don't think I ever use "none". It may be NULL, and that's normal:
it's not part of the file being parsed.

You can't naively set them to NULL after the fact, because you 
will die in xmlFreeDoc.

  I'm not sure I understand. Please explain!

Is there a way to create the dtd in memory so that both SystemID and 
ExternalID are NULL, giving me DOCTYPE #3? Or am I out in left field 
here? (I'm working my way through learning this stuff bit by bit, and 
frankly don't know any better...)

  First, read the spec; it will help you understand the code and the
terminology. Second, I don't understand what you're trying to achieve.
"Create the dtd in memory" how? The DOM tree for the DTD? The
serialization of a DTD to a string?

  Very confusing ...

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/


