Re: [xml] Potential wrong usage of xmlIsID() in tree.c



Kasimier Buchcik wrote:
As far as copy/import/adopt goes. I have found a couple of conflicting opinions for this. One is that the importing doc should determine if an attribute is added as an ID. The other (which is also how it is implemented in Xerces) is that an attribute should be copied/imported with the same ID-ness as in the source document. The latter would be in

I agree with the latter mechanism. This looks like the only
sane/efficient mechanism to me currently (this is also the reason why
I'm not confident with automatic IDness detection).
I agree on the latter method as well. I do see some potential issues with the automatic detection after thinking about it more. From a purely DTD view, if the document is parsed without validation or attribute handling enabled, the current implementation will still create IDs modiying the tree. There is no indication in the document that this was not desired. So the question comes to mind should IDs be created or not?

<snip />
retrieve by ID in the importing doc.

I also found out that all attributes require an element to be considered an ID (even the manually set ones), so modified the code appropriately

For XML Schema it's not only the element, but the whole ancestor axis.
Currently we have no other way of querying if an attribute is of type ID
than validating the whole tree. So this mechanism would work for DTD
only.
Right. I haven't looked at the state of using other schemas to determine ID-ness at this point so was just concentrating on DTDs where I know it is implemented.

I tried to write down the changes and my observations of this patch:

1) xmlAddNextSibling():
  a) I don't know if denying to add an attribute-sibling if @cur
    is not attached to an element, will break something for someone.
It never even crossed my mind that someone might be building a list of attributes without a corresponding element. Have no qualms about removing that check.
b) You noticed the potential chance here of freeing
    the attribute (@cur) to which a sibling attribute (@elem) should be
    attached if both have the same ns and local-name. We need to fix
    that.
Yup, ran across this issue in all the sibling functions.
  c) Automatic IDness detection.
Right now I was looking at consistency. It already exists creating and setting attributes, so however the discussion on automatic detection shakes out I think it should work in the same manner for related functionality.
2) xmlAddPrevSibling():
  a) Similar to xmlAddNextSibling().
  b) Tries to automatically convert the doc of @elem, if different.
    I fear that trying to handle moving-to-different-docs related
    semantics at the level of every tree-modifying function will
    result in much scattered and potentially hard-to-maintain code.
I agree with this and personally handle this myself prior to calling the functions, but the functionality for this has always been here so could possibly break apps if removed. All I did here was move it around a bit.
3) xmlAddSibling():
  a) Same as 1)

4) xmlAddChild():
  a) Same as 1c)

5) xmlUnlinkNode():
  a) IDness removal on the basis of atype == XML_ATTRIBUTE_ID
There is no need to check xmlIsID as the attr cannot be an ID without an element, so the extra check can be eliminated.
6) xmlReplaceNode():
  a) Same as 1c)
  b) Same as 5a)

7) xmlSetProp() and xmlSetNsProp():
  a) Same as 5a)


About 2b):
  I would be happy if we could agree on using the "adopt" semantics
  prior to handing nodes over to tree-modyfing functions. Although
  the xmlDOMWrapAdoptNode() function looks a bit scary at first
  sight, it could constitute a centralized way of get rid of
  doc-moving issues in every tiny function.
Agreed. This is one thing I have been meaning at looking at. The issue I hit was allowing a node without a doc to be passed to the setTreeDoc functionality with a doc using dictionaries, which resulted in the doc having mixed usage. Not sure what the result is if setTreeDoc is called where both docs use dicts. I do believe I saw some bigger problems with this when I first looked at it (has been a low priority for me since I don't allow doc adoption in these functions other than a node without a doc and doing so resulted in minimal impact).
About automatic ID detection:
  IIRC, for DOM, there's no such automatic behaviour defined.
This would probably fall under the category of:
"using mechanisms that are outside the scope of this specification, it is then an externally-determined ID attribute."

<snip />
  Additionally automatic IDness detection works for DTDs only, since
  using XML Schema, there's no way to say for a element/attribute
  combination if the attribute is of type xs:ID without validating the
  _whole_ document.

  If talking about efficiency wrt speed, then automatic IDness detection
is clearly not nice; I hope we don't go this way until there are convincing arguments (specs?) to do so.

  I hope we can still discuss the issue again, before implementing
  IDness detection.
Speed is definitely a concern as doing this will definitely slow things down a bit. I'm not looking to rush to get this implemented, but the cat is already out of the bag since the functionality already exists when creating, setting and copying props. Just glad that the proposed changes got things moving along because if the existing behavior is to change I would rather see it changed sooner than later.

Rob



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]