Re: [xml] redirecting parts of trees


On Fri, 2005-05-20 at 16:59 +0200, Martijn Faassen wrote:
Kasimier Buchcik wrote:


Yeah, I read some of the messages on your lxml list about your mechanism
to keep detached nodes alive if they are referenced by multiple wrapper
proxies. We took a simple, if sometimes memory-consuming, approach: we
never free Libxml2 nodes that are removed from the document; they are moved
into an internal list of "garbage" nodes in the document wrapper and
freed when the document is freed. A "flush" method can be used to
clean up such "garbage" nodes if the user is sure that it's safe.
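
A minimal sketch of that garbage-list idea, in Python rather than against
the real Libxml2 API. Node, DocumentWrapper, flush_garbage and the "freed"
flag are all hypothetical stand-ins invented for illustration; a real
wrapper would unlink and free actual Libxml2 nodes instead.

```python
class Node:
    """Hypothetical stand-in for a Libxml2 node (not the real API)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []
        self.freed = False  # simulates the underlying memory being released

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


def _free_subtree(node):
    """Mark a node and all its descendants as freed; return how many."""
    node.freed = True
    return 1 + sum(_free_subtree(child) for child in node.children)


class DocumentWrapper:
    """Hypothetical document wrapper that owns removed nodes until flushed."""

    def __init__(self, root):
        self.root = root
        self._garbage = []  # removed subtrees, kept alive for safety

    def remove(self, node):
        """Unlink a node from the tree, but do not free it yet."""
        if node.parent is not None:
            node.parent.children.remove(node)
            node.parent = None
        self._garbage.append(node)

    def flush_garbage(self):
        """Free all removed subtrees; only safe if nothing references them."""
        count = sum(_free_subtree(node) for node in self._garbage)
        self._garbage.clear()
        return count

    def close(self):
        """Freeing the document frees the garbage list and the tree itself."""
        self.flush_garbage()
        _free_subtree(self.root)
```

The trade-off is visible in remove(): it is cheap and never invalidates a
wrapper that still points at the node, at the cost of holding memory until
flush_garbage() or close() is called.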

Right, since lxml aims at being as "Pythonic" as possible I don't want
the user to worry about these issues at all. I think I've accomplished
this fairly well, though I'm still mopping up bugs here and there every
once in a while (plus some fundamental stuff I hope to solve for good when
we have an adoptNode()), and I'm sure some performance issues could still
be improved somewhat.

OK. We wanted it to be as quick as possible. "flushGarbage" is
normally not called; it is only needed after massive removals of nodes. Do
you handle XPath results as well? I.e., is the reference counter (if that's
what you do) increased for every node in such a result list as well?


Yes, in your case, if single attributes are not expected to be adopted,
and potentially many auto-created namespace declarations don't bother
you, the mechanism of xmlReconciliateNs seems the best fit: it just
re-creates the missing declarations on the adopted element. OK, good to
know that!
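
The effect described above can be sketched with a toy model. This is not
the Libxml2 implementation: Elem, in_scope_uri and reconcile_ns are
hypothetical names, and the sketch ignores prefix clashes, which a real
implementation has to resolve.

```python
class Elem:
    """Hypothetical toy element: a tag, one namespace binding, local decls."""

    def __init__(self, tag, prefix=None, uri=None, nsdecls=None):
        self.tag = tag
        self.prefix = prefix                # prefix used by this element
        self.uri = uri                      # namespace URI it must resolve to
        self.nsdecls = dict(nsdecls or {})  # declarations on this element
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


def in_scope_uri(elem, prefix):
    """Look up a prefix on elem and its ancestors; None if undeclared."""
    while elem is not None:
        if prefix in elem.nsdecls:
            return elem.nsdecls[prefix]
        elem = elem.parent
    return None


def reconcile_ns(adopted):
    """Re-create missing declarations on the adopted element itself.

    Simplification: a prefix already bound to a different URI is simply
    overwritten here instead of being renamed.
    """
    added = []
    stack = [adopted]
    while stack:
        elem = stack.pop()
        if elem.uri is not None and in_scope_uri(elem, elem.prefix) != elem.uri:
            adopted.nsdecls[elem.prefix] = elem.uri
            added.append(elem.prefix)
        stack.extend(elem.children)
    return added
```

Because every missing binding is declared on the adopted element rather
than somewhere higher up, adopting many small subtrees yields many
repeated declarations in the serialized output, even though the infoset
stays the same.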

Yes, indeed. I am a bit concerned the namespace declarations will be 
polluted somewhat when serializing, but I can live with that for now, as 
long as the infoset is still okay.

I'll try to store the ns-declarations on the doc, as I have Rob on this
side now. So we'll get fewer redundant ns-declarations.


