Re: [xml] redicting parts of trees

From: Martijn Faassen <faassen infrae com>
To: Kasimier Buchcik <kbuchcik 4commerce de>
Cc: "xml gnome org" <xml gnome org>
Subject: Re: [xml] redicting parts of trees
Date: Fri, 20 May 2005 16:59:36 +0200

Kasimier Buchcik wrote:

Hi,

On Thu, 2005-05-19 at 20:19 +0200, Martijn Faassen wrote:

Kasimier Buchcik wrote:

Hi,

On Thu, 2005-05-19 at 17:16 +0200, Martijn Faassen wrote:

Kasimier Buchcik wrote:



[...]

Anyway, anything I can do now to help? I will of course be testing this facility at some stage within lxml, and give feedback then if necessary.
You could describe how you intend to manage namespaces in your
wrapper. Will you try to go the W3C way or the Libxml2 namespace way?
I'm following the ElementTree way, which uses Clark notation. I.e. the wrapper shows namespace URIs directly as part of element names and such, like this:
{http://namespaces.somewhere.org/ns1}foo
and prefixes are, for now, completely ignored as not relevant to the XML infoset.
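
To make the notation concrete, here is a minimal sketch using the standard library's xml.etree.ElementTree rather than lxml itself. The two sample documents and the clark() helper are made up for illustration; only the namespace URI comes from the example above.

```python
import xml.etree.ElementTree as ET

NS1 = "http://namespaces.somewhere.org/ns1"

# Two documents that bind different prefixes to the same namespace URI.
doc_a = ET.fromstring('<a:foo xmlns:a="%s"><a:bar/></a:foo>' % NS1)
doc_b = ET.fromstring('<b:foo xmlns:b="%s"><b:bar/></b:foo>' % NS1)


def clark(uri, local):
    """Hypothetical helper: build a '{uri}local' name."""
    return "{%s}%s" % (uri, local)


# The prefixes 'a' and 'b' are not part of the tag; only the URI is.
same_names = doc_a.tag == doc_b.tag == clark(NS1, "foo")

# Lookups use the same notation.
child = doc_a.find(clark(NS1, "bar"))
```

Both roots end up with the same tag even though the source text used different prefixes, which is the sense in which prefixes are ignored.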

Ah.

Both have pros and cons. The relevant drawback of the Libxml2 way
is that it's hard, if not impossible, to implement a DOM wrapper
in a programming language where the time of an object's destruction
is not under the programmer's control.

Thanks, this is interesting as this is exactly what I'm trying to do with lxml.



Yeah, I read some of the messages on your lxml list about your mechanism
to keep detached nodes alive if they are referenced by multiple wrapper
proxies. We took a sometimes memory-consuming but simple approach: we
never free any removed Libxml2 nodes from the document; they are moved
into an internal list of "garbage" nodes in the document wrapper and
freed when the document is freed. A "flush" method can be used to
clean up such "garbage" nodes, if the user is sure that it's safe.
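
The deferred-free approach described above could be sketched roughly as follows. This is a pure-Python model, not the actual wrapper: Node, DocumentWrapper, and the freed flag are hypothetical stand-ins for Libxml2 nodes and a call such as xmlFreeNode().

```python
class Node:
    """Hypothetical stand-in for a Libxml2 node."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []
        self.freed = False  # models whether the C node was released

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


class DocumentWrapper:
    """Keeps removed nodes alive until flush() or close()."""

    def __init__(self, root):
        self.root = root
        self._garbage = []

    def remove(self, node):
        # Unlink from the tree, but do not free: proxies may still refer to it.
        node.parent.children.remove(node)
        node.parent = None
        self._garbage.append(node)

    def flush(self):
        # Only safe if the caller knows no removed node is still in use.
        count = len(self._garbage)
        for node in self._garbage:
            self._free(node)
        self._garbage.clear()
        return count

    def close(self):
        # Freeing the document also frees whatever garbage is left.
        self.flush()
        self._free(self.root)

    def _free(self, node):
        for child in node.children:
            self._free(child)
        node.freed = True


root = Node("root")
item = root.append(Node("item"))
doc = DocumentWrapper(root)

doc.remove(item)
still_usable = not item.freed  # removed, but not yet freed
flushed = doc.flush()          # now the removed node is released
```

The trade-off is the one named above: memory for removed nodes is held until flush() or close(), in exchange for never freeing a node that something still points at.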

Right, since lxml aims at being as "Pythonic" as possible I don't want the user to worry about these issues at all. I think I've accomplished this fairly well, though I'm still mopping up bugs here and there every once in a while (plus some fundamental stuff I hope to solve for good when we have an adoptNode()) and I'm sure some performance issues could still be improved somewhat.


[snip]

I suspect that adoptNode() recreating namespaces wherever necessary in the new document would indeed be sufficient to support Clark notation in ElementTree, even though the XML serialization would look ugly. Am I correct in that an adoptNode() would take care of this issue if prefixes are hidden from the API user's view?
Yes, in your case, if single attributes are not expected to be adopted,
and potentially many auto-created namespace declarations don't bother
you, the mechanism of xmlReconciliateNs seems best fitting: it just
re-creates the missing declarations on the adopted element. OK, good to
know that!

Yes, indeed. I am a bit concerned the namespace declarations will be polluted somewhat when serializing, but I can live with that for now, as long as the infoset is still okay.
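
As a rough illustration of the "ugly serialization, intact infoset" point, here is a sketch with the standard library's xml.etree.ElementTree. It does not use libxml2 or xmlReconciliateNs; it only shows that an element moved between trees keeps its Clark name while the serializer invents its own prefix. The documents are made up for the example.

```python
import xml.etree.ElementTree as ET

NS1 = "http://namespaces.somewhere.org/ns1"

source = ET.fromstring('<src xmlns:my="%s"><my:foo>text</my:foo></src>' % NS1)
target = ET.fromstring("<dst/>")

# Move the namespaced element from one tree to the other.
foo = source[0]
source.remove(foo)
target.append(foo)

# The original 'my' prefix is gone; the serializer declares one of its own.
serialized = ET.tostring(target, encoding="unicode")

# Round-tripping preserves the namespace URI and local name.
reparsed = ET.fromstring(serialized)
```

The serialized output carries a generated prefix instead of "my", but after reparsing the element still has the same {URI}name, so the infoset-level identity survives.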


Regards,

Martijn
