Re: [xml] Potential wrong usage of xmlIsID() in tree.c
- From: Kasimier Buchcik <K Buchcik 4commerce de>
- To: Rob Richards <rrichards ctindustries net>
- Cc: ML-libxml2 <xml gnome org>
- Subject: Re: [xml] Potential wrong usage of xmlIsID() in tree.c
- Date: Tue, 21 Feb 2006 16:06:26 +0100
Hi,
On Tue, 2006-02-21 at 08:56 -0500, Rob Richards wrote:
Kasimier Buchcik wrote:
As far as copy/import/adopt goes. I have found a couple of conflicting
opinions for this. One is that the importing doc should determine if an
attribute is added as an ID. The other (which is also how it is
implemented in Xerces) is that an attribute should be copied/imported
with the same ID-ness as in the source document. The latter would be in
I agree with the latter mechanism. This looks like the only
sane/efficient mechanism to me currently (this is also the reason why
I'm not confident with automatic IDness detection).
I agree on the latter method as well. I do see some potential issues
with the automatic detection after thinking about it more. From a purely
DTD view, if the document is parsed without validation or attribute
handling enabled, the current implementation will still create IDs
modiying the tree. There is no indication in the document that
this was not desired. So the question comes to mind should IDs be
created or not?
Ah, good point. I think it should not automatically create IDs in this
case, since we don't know if the attribute's element is valid, and thus
if the element/attribute is valid at that position of the tree. So we
could run into creating IDness for attributes which wouldn't become IDs
if processed by the validator.
<snip />
retrieve by ID in the importing doc.
I also found out that all attributes require an element to be considered
an ID (even the manually set ones), so modified the code appropriately
For XML Schema it's not only the element, but the whole ancestor axis.
Currently we have no other way of querying if an attribute is of type ID
than validating the whole tree. So this mechanism would work for DTD
only.
Right. I haven't looked at the state of using other schemas to determine
ID-ness at this point so was just concentrating on DTDs where I know it
is implemented.
I tried to write down the changes and my observations of this patch:
1) xmlAddNextSibling():
a) I don't know if denying to add an attribute-sibling if @cur
is not attached to an element, will break something for someone.
It never even crossed my mind that someone might be building a list of
attributes without a corresponding element. Have no qualms about
removing that check.
I really dunno; the attribute handling is this function looks weird
anyway, so I could think of that one being used for weird use-cases.
But I have no preferences here, since I don't use it.
b) You noticed the potential chance here of freeing
the attribute (@cur) to which a sibling attribute (@elem) should be
attached if both have the same ns and local-name. We need to fix
that.
Yup, ran across this issue in all the sibling functions.
OK. What do you think about compiling this semantic and
also the addition/removal of IDs into one function to be called by the
various append-something-to-something functions? Or would it slow down
things too much?
c) Automatic IDness detection.
Right now I was looking at consistency. It already exists creating and
setting attributes, so however the discussion on automatic detection
True. Do you know when the already existing behaviour was added? If
it was done recently, then one could still bring forward the argument
that it was a bug to add that behaviour - hmm, or is it too late?
shakes out I think it should work in the same manner for related
functionality.
Yes.
2) xmlAddPrevSibling():
a) Similar to xmlAddNextSibling().
b) Tries to automatically convert the doc of @elem, if different.
I fear that trying to handle moving-to-different-docs related
semantics at the level of every tree-modifying function will
result in much scattered and potentially hard-to-maintain code.
I agree with this and personally handle this myself prior to calling the
functions, but the functionality for this has always been here so could
possibly break apps if removed. All I did here was move it around a bit.
Ah, OK. So it even needs to take care of moving the value to the
destination doc's dict if it was in the source doc's dict.
3) xmlAddSibling():
a) Same as 1)
4) xmlAddChild():
a) Same as 1c)
5) xmlUnlinkNode():
a) IDness removal on the basis of atype == XML_ATTRIBUTE_ID
There is no need to check xmlIsID as the attr cannot be an ID without an
element, so the extra check can be eliminated.
6) xmlReplaceNode():
a) Same as 1c)
b) Same as 5a)
7) xmlSetProp() and xmlSetNsProp():
a) Same as 5a)
About 2b):
I would be happy if we could agree on using the "adopt" semantics
prior to handing nodes over to tree-modyfing functions. Although
the xmlDOMWrapAdoptNode() function looks a bit scary at first
sight, it could constitute a centralized way of get rid of
doc-moving issues in every tiny function.
Agreed. This is one thing I have been meaning at looking at. The issue I
hit was allowing a node without a doc to be passed to the setTreeDoc
functionality with a doc using dictionaries, which resulted in the doc
having mixed usage. Not sure what the result is if setTreeDoc is called
where both docs use dicts. I do believe I saw some bigger problems with
this when I first looked at it (has been a low priority for me since I
don't allow doc adoption in these functions other than a node without a
doc and doing so resulted in minimal impact).
If you could send a tiny example of what didn't work with it, I would
fix it :-)
About automatic ID detection:
IIRC, for DOM, there's no such automatic behaviour defined.
This would probably fall under the category of:
"using mechanisms that are outside the scope of this specification, it
is then an externally-determined ID attribute."
Hmm, I don't think so, since for DTD, XML Schema and setting IDs via the
API, this is defined (directly above that specific sentence in the
spec). I think this is intended to be a fall-back rule for other
mechanisms.
<snip />
Additionally automatic IDness detection works for DTDs only, since
using XML Schema, there's no way to say for a element/attribute
combination if the attribute is of type xs:ID without validating the
_whole_ document.
If talking about efficiency wrt speed, then automatic IDness detection
is clearly not nice; I hope we don't go this way until there are
convincing arguments (specs?) to do so.
I hope we can still discuss the issue again, before implementing
IDness detection.
Speed is definitely a concern as doing this will definitely slow things
down a bit. I'm not looking to rush to get this implemented, but the cat
is already out of the bag since the functionality already exists when
creating, setting and copying props. Just glad that the proposed changes
got things moving along because if the existing behavior is to change I
would rather see it changed sooner than later.
Rob
OK. The first important thing for me to know is, if we can revert the
already existent automatic detection of IDness via xmlIsID() or not.
Thoughts?
Regards,
Kasimier
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]