Re: [xml] setting URL for xmlRelaxNGParserCtxt?

From: Martijn Faassen <faassen infrae com>
To: veillard redhat com
Cc: xml gnome org
Subject: Re: [xml] setting URL for xmlRelaxNGParserCtxt?
Date: Wed, 26 Jan 2005 12:42:57 +0100

Hey Daniel,

I hope you'll still answer the other part of my mail (the Relax NG include processing errors). That's a bit more urgent right now. :)


Daniel Veillard wrote:

On Wed, Jan 26, 2005 at 11:35:52AM +0100, Martijn Faassen wrote:
One note though, I'm using dictionaries more and more intensively these
days, so keeping a single dictionary for everything, while it clearly
speeds up processing, may grow quite a bit, for example as you process random
documents. It might be a good idea to "sometimes" reset the dictionary
to avoid explosion, for example from all the ID strings ever processed by the program.
I don't use a single dictionary to speed up processing. I am using a single dictionary as I have absolutely no way to know which nodes from which nodes are going to be combined. I actually don't know what these dictionaries are used for (are elements stored in them? text node strings? both? and why? :), but having separate ones is just an impediment to moving nodes around freely. Moving nodes around freely is absolutely essential in my library's case, as the API I'm implementing allows this always.
  okay then just keep the same dictionary, but it may grow indefinitely
and may slow things down after a while.

So what is the dictionary used for right now? I don't want things to slow down after a while, obviously.

What do you mean by resetting for every ID string? You mean as soon as an ID is encountered in an XML document? This may never happen for many classes of XML document, right?
  ID values are stored in the dictionary; they are potentially random
strings, and hence may grow your dictionary indefinitely even if the
documents you manipulate have a fixed vocabulary.
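
The growth described above can be illustrated with a toy model. This is only a sketch under stated assumptions: InternDict, lookup, size and parse_document are hypothetical names made up for the illustration, and the class is a plain Python stand-in for a string-interning table, not libxml2's dictionary implementation. It assumes every element carries a fresh, effectively random ID value while tag and attribute names come from a fixed vocabulary.

```python
import uuid


class InternDict:
    """Toy string-interning table (a stand-in, not libxml2's dictionary)."""

    def __init__(self):
        self._strings = {}

    def lookup(self, s):
        # Return the single shared copy of s, adding it on first sight.
        return self._strings.setdefault(s, s)

    def size(self):
        return len(self._strings)


def parse_document(d, n_elements):
    """Pretend to parse a document: fixed names, one fresh ID per element."""
    for _ in range(n_elements):
        d.lookup("item")             # element name: same string every time
        d.lookup("id")               # attribute name: same string every time
        d.lookup(uuid.uuid4().hex)   # ID value: effectively random


shared = InternDict()
sizes = []
for _ in range(3):
    parse_document(shared, 100)
    sizes.append(shared.size())

print(sizes)  # [102, 202, 302]
```

In this model the two vocabulary strings are stored once, while each document adds 100 new ID strings to the shared table, so its size keeps climbing for as long as the same table is reused.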

  If it doesn't make sense for you, just forget about it.

Well, no, I'd rather understand it so I can fix my code and *then* forget about this. I also am interested in dictionaries to see whether there's any smart Python unicode object caching possible.

Dictionaries won't grow indefinitely if I throw away nodes once in a while, right? The refcount of dictionary values will go down, to 0 if it's a unique string, and the thing (tag name? text node value?) will go away.

Now I understand that there are "ID"s which are potentially random strings, and will grow my dictionary. Will those IDs never be removed because they're not dereffed when a document goes away? Are these XML per-document unique IDs we're talking about, special attributes which typically get declared to be an ID in a DTD? I'm asking as I haven't done a lot with IDs yet.


Regards,

Martijn

