Re: [xml] Tools to look inside a DTD allows?



Hi Michael, Stuart and Bjoern.

On 02/ 6/11 12:32 PM, Michael Ludwig wrote:
Jack Schwartz wrote on 03.12.2010 at 16:48 (-0800):

I'm looking for a documented way to look into a DTD [...], but haven't
seen one.
I haven't seen any either. Googling for the following didn't turn up
anything either:

   programmatic access dtd -html

MSDN said back in 2002:

   Most XML 1.0 processors support DTD validation, but they don't support
   programmatic access to the information found in the DTD due to the
   complexity of the syntax.

 From the April 2002 issue of MSDN Magazine
http://msdn.microsoft.com/en-us/library/bb986126.aspx

The closest I've come up with is dumping the DTD with the following
python program:
Looks simple enough, but I understand your concerns.

The output shows what I need: the names, order and quantity of
different children per parent element.  However, I am reliant on an
undocumented output format, which could change and which would break
my code if it did.

Another approach would be to just look at the original DTD,
searching for "<!ELEMENT" lines.

However, it would be better if I used a tool to get this information
if one existed.  (In my simple case, at least it would help with
skipping comments in the DTD file, etc.)  Does such a tool or API
exist in libxml2, lxml or somewhere else?
I think XML editors with DTD support must have a way of getting at the
information. At least if they want to assist you in editing.
I ended up parsing the DTD's <!ELEMENT lines myself, plus the <!ENTITY lines so that I could read in DTDs referenced by a main DTD. The parser came to about 700 heavily commented lines of Python, using regular expressions and recursion to peel away the nested parenthesized expressions that describe child elements.
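
For readers who want a feel for the approach, here is a minimal sketch of regex-plus-recursion parsing of <!ELEMENT declarations. It is not the 700-line parser described above: the names (parse_dtd, parse_item, child_names) and the dict layout are invented for illustration, and it assumes well-formed declarations with no parameter entities (%name;) and no handling of <!ENTITY or external DTDs.

```python
import re

# Assumed simplifications: comments are stripped first, and each
# <!ELEMENT ...> declaration ends at the first ">" after its name.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
ELEMENT_RE = re.compile(r"<!ELEMENT\s+(\S+)\s+(.*?)>", re.DOTALL)
TOKEN_RE = re.compile(r"\s*(\(|\)|,|\||[?*+]|[^\s(),|?*+]+)")
OCCUR = {"?", "*", "+"}


def tokenize(model):
    """Split a content model such as "(b, c*, (d | e)+)" into tokens."""
    model = model.strip()
    tokens, pos = [], 0
    while pos < len(model):
        match = TOKEN_RE.match(model, pos)
        if match is None:
            raise ValueError("cannot tokenize: %r" % model[pos:])
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_item(tokens, i):
    """Parse one name or parenthesized group; return (node, next_index)."""
    if tokens[i] == "(":
        i += 1
        children, kind = [], "seq"
        while True:
            child, i = parse_item(tokens, i)  # recursion handles nesting
            children.append(child)
            if tokens[i] == ")":
                i += 1
                break
            kind = "choice" if tokens[i] == "|" else "seq"
            i += 1  # skip the "," or "|" separator
        node = {"kind": kind, "children": children}
    else:
        node = {"kind": "name", "name": tokens[i]}
        i += 1
    # An optional occurrence indicator may follow a name or a group.
    node["occur"] = ""
    if i < len(tokens) and tokens[i] in OCCUR:
        node["occur"] = tokens[i]
        i += 1
    return node, i


def parse_dtd(text):
    """Map each declared element name to its parsed content model."""
    text = COMMENT_RE.sub("", text)
    decls = {}
    for name, model in ELEMENT_RE.findall(text):
        model = model.strip()
        if model in ("EMPTY", "ANY"):
            decls[name] = model
        else:
            decls[name], _ = parse_item(tokenize(model), 0)
    return decls


def child_names(node):
    """Flatten a parsed model into child names in declaration order."""
    if isinstance(node, str):  # "EMPTY" or "ANY"
        return []
    if node["kind"] == "name":
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(child_names(child))
    return names
```

Calling child_names(parse_dtd(text)["a"]) on a declaration like <!ELEMENT a (b, c*, (d | e)+)> yields the child names in the order the DTD lists them, which is the information an overlay step needs.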

The problem I was solving was overlaying one XML tree onto another. I needed to know where to insert an overlaid tree into the main one, so I needed to know the layout of an element's children.

I think the reason why I could not find XML overlay functions is that it is hard to always give a correct answer as to where to place the new data. For example, if I am building a tree using overlays, an element A (possibly the root of a subtree) may need to be placed before an element B under a common parent, but B may not have been added yet, so what to do? I'm not sure there is a better solution, but for now, I add it to the end of the parent's list of children. That way, if elements are added in the proper order, they will end up in the right place in the list, no matter what.
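
As an illustration of that placement rule, here is a small sketch using the standard library's xml.etree.ElementTree. The function name insert_in_order and the "order" list (the child names in DTD order for one parent) are assumptions made for this example, not the code under review. It inserts the new child before the first existing sibling that the DTD places later, and falls back to appending at the end when no such sibling exists yet.

```python
import xml.etree.ElementTree as ET


def insert_in_order(parent, new_child, order):
    """Insert new_child under parent according to the given child order.

    "order" is a list of child element names in the order the DTD
    declares them. Tags missing from "order" are treated as sorting last.
    Returns the index at which the child was placed.
    """
    rank = {name: i for i, name in enumerate(order)}
    new_rank = rank.get(new_child.tag, len(order))
    for index, existing in enumerate(parent):
        if rank.get(existing.tag, len(order)) > new_rank:
            # A sibling that must come later is already present.
            parent.insert(index, new_child)
            return index
    # No later sibling exists yet: fall back to the end of the list.
    parent.append(new_child)
    return len(parent) - 1


# Usage: children arrive out of order but end up in DTD order.
root = ET.Element("a")
for tag in ("d", "b", "c"):
    insert_in_order(root, ET.Element(tag), ["b", "c", "d"])
tags = [child.tag for child in root]
```

With this rule, elements added in the proper order always land at the end, and elements added late are still slotted ahead of any later siblings that are already present.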

    Thanks,
    Jack

P.S. Code should be out for code review soon at
caiman-iscuss opensolaris org, or ping me if you would like more details.



