Re: [xml] setting URL for xmlRelaxNGParserCtxt?
- From: Daniel Veillard <veillard redhat com>
- To: Martijn Faassen <faassen infrae com>
- Cc: Kasimier Buchcik <kbuchcik 4commerce de>, xml gnome org
- Subject: Re: [xml] setting URL for xmlRelaxNGParserCtxt?
- Date: Wed, 26 Jan 2005 12:20:27 -0500
On Wed, Jan 26, 2005 at 05:39:56PM +0100, Martijn Faassen wrote:
Daniel Veillard wrote:
On Wed, Jan 26, 2005 at 03:12:52PM +0100, Martijn Faassen wrote:
So what *is* stored in these dictionaries? I still don't know. Tagnames?
Namespace strings? Text node content? IDs? All of them? I guess I'll
have to study the source to get the answer. :)
Markup tag names, very small text node values, ID/REFs, DTD attribute
default values, namespace names. With libxslt you also get stylesheet
names.
General text node content is not added; this would explode and be
unusable.
Okay, thanks. Even if that memory is never freed, it isn't too bad. I
think I also understand now why you mention IDs, as they may be globally
unique strings and there might be many of them. Do namespace names
mean their prefixes or the href, or both?
Both.
It might be interesting for me to try building something on top of the
dictionary that caches Python unicode strings so that they don't
need to be regenerated all the time. Basically, if I understand it
correctly, dictionaries guarantee that there is only a single char*
pointer to a piece of textual data, so I could use that pointer as a
Yes, uniqueness of the pointer returned by the API is the main guarantee.
(Note that ptr+1 may not be unique, as "boo" and "foo" will be stored at
different locations.)
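
The caching idea discussed above can be sketched as a toy model in Python. This is not libxml2 code: `ToyDict`, `lookup`, `value`, and `UnicodeCache` are hypothetical names, and integer handles stand in for the `char*` pointers a real dictionary would return. It only illustrates why one unique pointer per string makes a usable cache key.

```python
class ToyDict:
    """Toy stand-in for an interning dictionary (not the libxml2 API)."""

    def __init__(self):
        self._handles = {}   # bytes -> handle (models the unique char* pointer)
        self._strings = []   # handle -> bytes

    def lookup(self, data: bytes) -> int:
        """Intern data and return its handle; equal strings share one handle."""
        handle = self._handles.get(data)
        if handle is None:
            handle = len(self._strings)
            self._strings.append(data)
            self._handles[data] = handle
        return handle

    def value(self, handle: int) -> bytes:
        """Return the stored bytes for a handle issued by lookup()."""
        return self._strings[handle]


class UnicodeCache:
    """Caches decoded str objects, keyed by the dictionary handle."""

    def __init__(self, dictionary: ToyDict):
        self._dict = dictionary
        self._cache = {}     # handle -> decoded str

    def get(self, handle: int) -> str:
        text = self._cache.get(handle)
        if text is None:
            # Decode once; later requests for the same handle reuse the object.
            text = self._dict.value(handle).decode("utf-8")
            self._cache[handle] = text
        return text


d = ToyDict()
cache = UnicodeCache(d)
h1 = d.lookup(b"title")
h2 = d.lookup(b"title")
same_handle = h1 == h2                          # one handle per distinct string
same_object = cache.get(h1) is cache.get(h2)    # decoded only once
```

In a real binding such a cache would have to be discarded no later than the dictionary itself, since its keys would otherwise refer to freed memory.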
they will point to freed memory. So don't free the dictionary until
it is no longer in use. You can use another one, but you will lose the
uniqueness of strings.
Hm, that sounds tricky. If I have a bunch of documents that share the
same dictionary, how would I go ahead and clean a dictionary up? One way
would be to hunt all references to dictionaries and replace the
dictionary with another one. The other way would be to clean or shrink
the dictionary itself.
You can remove the dictionary only when no more documents reference it.
Trying to change the dictionary of a document dynamically would be expensive
and very tricky,
Both approaches have a problem I can't seem to figure my way out of:
So don't do it.
The strings in the original dictionary (or the strings not known to the
dictionary anymore if the dictionary has been 'shrunk') will still be
You can't 'shrink' a dictionary: there is no way to tell whether
a given string needs to be kept or discarded.
But you can ask a dictionary whether it owns a string pointer really fast.
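
The lifetime rule and the ownership query can be sketched as a toy Python model. All names here (`SharedDict`, `acquire`, `release`, `owns`) are hypothetical and chosen for illustration; libxml2 provides its own reference counting and an ownership test (`xmlDictOwns`), which this sketch does not call. Handles again stand in for string pointers.

```python
class SharedDict:
    """Toy reference-counted interning dictionary (not the libxml2 API)."""

    def __init__(self):
        self._refs = 1           # the creator holds the first reference
        self._token = object()   # makes handles distinct across dictionaries
        self._handles = {}       # bytes -> handle
        self._owned = set()      # every handle this dictionary has issued
        self.freed = False

    def acquire(self):
        """A document starts sharing this dictionary."""
        self._refs += 1
        return self

    def release(self):
        """A document (or the creator) drops its reference."""
        self._refs -= 1
        if self._refs == 0:
            # Only now is it safe to discard the stored strings.
            self._handles.clear()
            self._owned.clear()
            self.freed = True

    def lookup(self, data: bytes):
        """Intern data and return its handle."""
        if self.freed:
            raise RuntimeError("dictionary already freed")
        handle = self._handles.get(data)
        if handle is None:
            handle = (self._token, len(self._handles))
            self._handles[data] = handle
            self._owned.add(handle)
        return handle

    def owns(self, handle) -> bool:
        """Constant-time check: was this handle issued by this dictionary?"""
        return handle in self._owned


shared = SharedDict()
doc_a = shared.acquire()         # two documents share the dictionary
doc_b = shared.acquire()
name = shared.lookup(b"chapter")
other = SharedDict().lookup(b"chapter")

owned_before = shared.owns(name)     # issued by `shared`
foreign = shared.owns(other)         # same text, different dictionary

shared.release()                     # creator lets go
doc_a.release()
still_alive = not shared.freed       # doc_b still references it
doc_b.release()
freed_at_last = shared.freed         # last reference gone
```

The sketch never shrinks the table while it is shared; strings are discarded only when the last reference is released, which matches the advice above.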
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/