Re: [xml] Questions about xml:id

From: Daniel Veillard <veillard redhat com>
To: Rob Richards <rrichards ctindustries net>
Cc: xml gnome org, Bruce Miller <bruce miller nist gov>
Subject: Re: [xml] Questions about xml:id
Date: Mon, 21 Mar 2005 12:15:05 -0500

On Mon, Mar 21, 2005 at 11:21:52AM -0500, Rob Richards wrote:

Funny this topic came up as I just recently was looking at the id stuff 
for the PHP DOM extension. Have a few questions/comments as well as a 
few issues regarding this.

Daniel Veillard wrote:

It seems that libxml2 only gives an attribute that magic ID-ness
if the attribute's local-name is "xml:id"
[in perl:  $node->setAttribute('xml:id','someid'); ]
rather than an "id" attribute in xml namespace
[in perl: 
$node->setAttributeNS('http://www.w3.org/XML/1998/namespace','id','someid'); ]
Is this the intention?


Well, the focus so far has mostly been on parsing xml:id, i.e. handling
it only when it comes from a serialized document. I didn't really look at
it from an API perspective, but the behaviour described sounds like a bug.
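
To make the two cases concrete, here is a minimal sketch of a check that
treats both spellings as IDs. Everything in it (toy_attr, toy_attr_is_id,
TOY_XML_NS) is a hypothetical stand-in written for illustration, not
libxml2's structures or API. It assumes that a non-namespace-aware call
stores the raw name "xml:id" with no namespace, while a namespace-aware call
stores the local name "id" together with the XML namespace URI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT libxml2's structures. */
#define TOY_XML_NS "http://www.w3.org/XML/1998/namespace"

typedef struct {
    const char *name;     /* local name, or a raw name such as "xml:id" */
    const char *ns_href;  /* namespace URI, or NULL if there is none */
    int dtd_declares_id;  /* nonzero if a DTD declares the attribute as ID */
} toy_attr;

/* Returns 1 if the attribute should be entered in the ID table, else 0. */
static int toy_attr_is_id(const toy_attr *attr)
{
    assert(attr != NULL && attr->name != NULL);

    /* Case 1: a DTD associated with the document declares it as an ID. */
    if (attr->dtd_declares_id)
        return 1;

    /* Case 2: "id" in the XML namespace (namespace-aware setter). */
    if (attr->ns_href != NULL &&
        strcmp(attr->ns_href, TOY_XML_NS) == 0 &&
        strcmp(attr->name, "id") == 0)
        return 1;

    /* Case 3: the literal name "xml:id" with no namespace attached. */
    if (attr->ns_href == NULL && strcmp(attr->name, "xml:id") == 0)
        return 1;

    return 0;
}
```

With this check, an attribute built as {"id", TOY_XML_NS, 0} and one built
as {"xml:id", NULL, 0} are both reported as IDs, while a plain "id"
attribute with no namespace and no DTD declaration is not.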

Should the ID really be set with the setAttribute/NS functions, or should
it only be set (when manually building a tree, that is) using one of
the setIdAttribute methods from DOM Level 3 Core, meaning it would be up
to the developer to call xmlAddID?


  I think the xml*Prop* API should take care of maintaining the ID table
integrity when a DTD is associated with the document or when the
attribute is xml:id, i.e. the current behaviour is largely buggy.

It's probably not complete. Removal of the attribute triggers removal
from the ID table, but modifications are apparently not tracked.
Sounds like a bug too.
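
A minimal sketch of what tracking a modification could look like. This is a
toy model, not libxml2 code: toy_attr, toy_id_table and all toy_* functions
are hypothetical names, and the model assumes the table keeps its own copy
of the ID value as the lookup key, so a changed attribute value leaves a
stale key behind unless the setter re-registers the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_IDS 16
#define TOY_MAX_KEY 32

/* Hypothetical stand-ins for illustration; NOT libxml2's structures. */
typedef struct { char value[TOY_MAX_KEY]; } toy_attr;

typedef struct {
    char key[TOY_MAX_KEY];  /* copy of the value taken at registration */
    toy_attr *attr;
} toy_id_entry;

typedef struct {
    toy_id_entry entries[TOY_MAX_IDS];
    size_t count;
} toy_id_table;

/* Look an attribute up by ID value; returns NULL if the key is unknown. */
static toy_attr *toy_lookup(const toy_id_table *t, const char *key)
{
    size_t i;
    assert(t != NULL && key != NULL);
    for (i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].key, key) == 0)
            return t->entries[i].attr;
    return NULL;
}

/* Register attr under its current value; returns -1 if the table is full. */
static int toy_register(toy_id_table *t, toy_attr *attr)
{
    assert(t != NULL && attr != NULL);
    if (t->count >= TOY_MAX_IDS)
        return -1;
    strcpy(t->entries[t->count].key, attr->value);
    t->entries[t->count].attr = attr;
    t->count++;
    return 0;
}

/* Drop the entry pointing at attr; returns 1 if one was removed, else 0. */
static int toy_unregister(toy_id_table *t, toy_attr *attr)
{
    size_t i;
    assert(t != NULL && attr != NULL);
    for (i = 0; i < t->count; i++) {
        if (t->entries[i].attr == attr) {
            t->entries[i] = t->entries[t->count - 1];
            t->count--;
            return 1;
        }
    }
    return 0;
}

/* Change the value and keep the table in sync with it. */
static int toy_set_value(toy_id_table *t, toy_attr *attr, const char *value)
{
    int was_id;
    assert(t != NULL && attr != NULL && value != NULL);
    if (strlen(value) >= TOY_MAX_KEY)
        return -1;                     /* value too long for the toy model */
    was_id = toy_unregister(t, attr);  /* drop the stale key, if any */
    strcpy(attr->value, value);
    return was_id ? toy_register(t, attr) : 0;
}
```

After registering an attribute with value "old" and calling
toy_set_value(&table, &attr, "new"), a lookup of "old" finds nothing and a
lookup of "new" returns the attribute.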

This is where I started running into issues. Both xmlIsID and
xmlFreeProp only handle IDs if the doc has an intSubset or extSubset. Using
xml:id or implementing setIdAttribute does not require subsets, so if
the attribute gets freed along the way, the ID never gets removed from
the table and calling xmlGetID returns invalid data.


  Good point, the check should really be about the presence of an
ID table associated with the document in case of mutation, and the test
for int/extSubset for creation/deletion should probably be changed to
other appropriate tests too.

I can understand wanting to do that type of check so as not to have to
check every attribute when it is being freed, for performance reasons, so I
was wondering whether the attribute atype could also be added, so that either
the doc has a subset or atype==XML_ATTRIBUTE_ID. This would also require


  checking the presence of doc->ids / doc->refs might just work.

resetting the atype in xmlRemoveID as it gets set in xmlAddID but never 
reset when removed.


  yes, atype should be maintained too.
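
A minimal sketch of the combined idea: decide on the presence of an ID table
plus the attribute's atype rather than on the DTD subsets, and reset atype
when the ID is removed. This is a toy model, not a patch: toy_doc, toy_attr
and the toy_* functions are hypothetical names, and a nonzero count stands in
for a non-NULL doc->ids.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; NOT libxml2's structures. */
enum { TOY_ATTR_PLAIN = 0, TOY_ATTR_ID = 1 };
#define TOY_MAX_IDS 16

typedef struct {
    const char *value;
    int atype;              /* TOY_ATTR_ID while registered as an ID */
} toy_attr;

typedef struct {
    toy_attr *entries[TOY_MAX_IDS];
    size_t count;           /* count > 0 models a non-NULL doc->ids */
} toy_doc;

/* Register attr as an ID and mark its atype; returns -1 if the table is full. */
static int toy_add_id(toy_doc *doc, toy_attr *attr)
{
    assert(doc != NULL && attr != NULL && attr->value != NULL);
    if (doc->count >= TOY_MAX_IDS)
        return -1;
    doc->entries[doc->count++] = attr;
    attr->atype = TOY_ATTR_ID;
    return 0;
}

/* Remove attr from the table and reset its atype; returns -1 if not found. */
static int toy_remove_id(toy_doc *doc, toy_attr *attr)
{
    size_t i;
    assert(doc != NULL && attr != NULL);
    for (i = 0; i < doc->count; i++) {
        if (doc->entries[i] == attr) {
            doc->entries[i] = doc->entries[doc->count - 1];
            doc->count--;
            attr->atype = TOY_ATTR_PLAIN;  /* the reset discussed above */
            return 0;
        }
    }
    return -1;
}

/* Find the attribute registered under value, or NULL. */
static toy_attr *toy_get_id(const toy_doc *doc, const char *value)
{
    size_t i;
    assert(doc != NULL && value != NULL);
    for (i = 0; i < doc->count; i++)
        if (strcmp(doc->entries[i]->value, value) == 0)
            return doc->entries[i];
    return NULL;
}

/* Called before an attribute goes away: no DTD subset is consulted, only
 * the presence of an ID table and the attribute's own atype. */
static void toy_release_attr(toy_doc *doc, toy_attr *attr)
{
    assert(doc != NULL && attr != NULL);
    if (doc->count > 0 && attr->atype == TOY_ATTR_ID)
        toy_remove_id(doc, attr);
}
```

In this model the per-attribute cost on release is two cheap comparisons, and
a lookup after release returns NULL instead of a pointer to a dead attribute.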

Next in xmlFreeProp is the check for the attribute's parent. The attribute
itself can get unlinked and freed, but xmlFreeProp only checks for an ID
if the attribute has a parent. I'm not sure whether the attribute
remains an ID if unlinked, but if it doesn't, then xmlUnlinkNode should
remove the ID. If it still remains an ID, then xmlFreeProp should not
require a parent node when checking whether the attribute is an ID to be
removed.


  We need to be extremely cautious about that part of the code because it
affects streaming DTD validation for ID/REF(S) checks, even when the nodes
have been removed.
  But in general the code needs fixing for correct ID/xml:id support when
mutating trees. Not sure I will have time to look at it now, but if you're
interested I would take patches, of course!

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
