Re: [xml] Potential wrong usage of xmlIsID() in tree.c

From: Kasimier Buchcik <K Buchcik 4commerce de>
To: Rob Richards <rrichards ctindustries net>
Cc: ML-libxml2 <xml gnome org>
Subject: Re: [xml] Potential wrong usage of xmlIsID() in tree.c
Date: Mon, 20 Feb 2006 18:43:34 +0100

Hi Rob,

On Sun, 2006-02-19 at 07:14 -0500, Rob Richards wrote:

Kasimier Buchcik wrote:

Hi,

Bugzilla:
http://bugzilla.gnome.org/show_bug.cgi?id=331273

Regards,

Kasimier

Kasimier,

To give you an idea of how I think the IDs should be modified, take a 
look at this diff:
http://www.ctindustries.net/libxml/tree.c.diff.txt

It's still a work in progress, as I am also fixing up some of the other 
sibling functions. These aren't DOM compliant functions (DOM only works 
on children with these functions), but the library in general has always 
allowed attributes to be added using these functions (as they are


Yeah :-) Looks heavy-metal to have the attribute-related code in those
functions.

generic and not DOM specific), so I don't think they should prevent it 
now and probably should deal with IDs as well. There are a few potential 
crashes dealing with attribute additions in these which is what I'm 
currently still fixing.

As far as copy/import/adopt goes. I have found a couple of conflicting 
opinions for this. One is that the importing doc should determine if an 
attribute is added as an ID. The other (which is also how it is 
implemented in Xerces) is that an attribute should be copied/imported 
with the same ID-ness as in the source document. The latter would be in


I agree with the latter mechanism. This looks like the only
sane/efficient mechanism to me currently (this is also the reason why
I'm not confident with automatic IDness detection).

line with how XInclude works and imo would make it easy for a user to


I took a look at the XInclude spec, but it seems - to me - to make no
explicit mention of what happens with ID attributes. It does say what
to do with IDREF attributes though.

In the "Fragment Inclusion Example" of this spec, the included
fragment has some ID attributes, which are converted to non-ID
attributes in the resulting infoset. So I wonder if this example
is just broken, or is preserving IDness from XIncluded fragments
just a myth? Did I miss something?
http://www.w3.org/TR/xinclude/#fragment-example

retrieve by ID in the importing doc.

I also found out that all attributes require an element to be considered 
an ID (even the manually set ones), so modified the code appropriately


For XML Schema it's not only the element, but the whole ancestor axis.
Currently we have no other way of querying if an attribute is of type ID
than validating the whole tree. So this mechanism would work for DTD
only.

for this. In the event the copy/import goes the way of the importing doc 
determining ID-ness, the manually set IDs do copy over as IDs regardless 
of what the importing doc says, which would add a bit more complexity on 
top of the XInclude handling.

Thoughts?

Rob


I tried to write down the changes and my observations of this patch:

1) xmlAddNextSibling():
  a) I don't know if denying to add an attribute-sibling if @cur
    is not attached to an element, will break something for someone.
  
  b) You noticed the potential chance here of freeing
    the attribute (@cur) to which a sibling attribute (@elem) should be
    attached if both have the same ns and local-name. We need to fix
    that.
  c) Automatic IDness detection.

2) xmlAddPrevSibling():
  a) Similar to xmlAddNextSibling().
  b) Tries to automatically convert the doc of @elem, if different.
    I fear that trying to handle moving-to-different-docs related
    semantics at the level of every tree-modifying function will
    result in much scattered and potentially hard-to-maintain code.   

3) xmlAddSibling():
  a) Same as 1)

4) xmlAddChild():
  a) Same as 1c)

5) xmlUnlinkNode():
  a) IDness removal on the basis of atype == XML_ATTRIBUTE_ID

6) xmlReplaceNode():
  a) Same as 1c)
  b) Same as 5a)

7) xmlSetProp() and xmlSetNsProp():
  a) Same as 5a)


About 2b):
  I would be happy if we could agree on using the "adopt" semantics
  prior to handing nodes over to tree-modyfing functions. Although
  the xmlDOMWrapAdoptNode() function looks a bit scary at first
  sight, it could constitute a centralized way of get rid of
  doc-moving issues in every tiny function.

About automatic ID detection:
  IIRC, for DOM, there's no such automatic behaviour defined.

  The definition for Attr.isID() states (for XML Schema):
  "If validation occurred using an XML Schema [XML Schema Part 1] while
   loading the document or while invoking Document.normalizeDocument(),
   the post-schema-validation infoset contributions (PSVI contributions)
   values are used to determine if this attribute is a schema-determined
   ID attribute using the schema-determined ID definition in
   [XPointer]."

  I read this as a way for the user to request IDness being evaluated by
  performing a validation over the whole document.
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Attr-isId

  If the user wants to add ID attributes then there are methods like
  Element.setIdAttribute().
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-ElSetIdAttr

  Additionally automatic IDness detection works for DTDs only, since
  using XML Schema, there's no way to say for a element/attribute
  combination if the attribute is of type xs:ID without validating the
  _whole_ document.

  If talking about efficiency wrt speed, then automatic IDness detection
  is clearly not nice; I hope we don't go this way until there are 
  convincing arguments (specs?) to do so.

  I hope we can still discuss the issue again, before implementing
  IDness detection.

Regards,

Kasimier

Follow-Ups:
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Daniel Veillard
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Rob Richards

References:
- [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Kasimier Buchcik
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Daniel Veillard
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Kasimier Buchcik
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Daniel Veillard
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Kasimier Buchcik
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Rob Richards
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Kasimier Buchcik
- Re: [xml] Potential wrong usage of xmlIsID() in tree.c
  - From: Rob Richards

[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]