
Re: [xml] setting URL for xmlRelaxNGParserCtxt?



On Wed, Jan 26, 2005 at 05:39:56PM +0100, Martijn Faassen wrote:
> Daniel Veillard wrote:
> >On Wed, Jan 26, 2005 at 03:12:52PM +0100, Martijn Faassen wrote:
> >
> >>So what *is* stored in these dictionaries? I still don't know. Tagnames? 
> >>Namespace strings? Text node content? IDs? All of them? I guess I'll 
> >>have to study the source to get the answer. :)
> >
> >
> >  markup tag names, very small text node values, ID/REFs, DTD attribute
> >default values, namespace names. With libxslt you also get stylesheet
> >names.
> >  general text node content is not added, this would explode and be 
> >  unusable.
> 
> Okay, thanks. Even if that memory is not freed ever it isn't too bad. I 
> think I understand also now why you mention IDs, as they may be globally 
> unique strings and there might be many of them. Does "namespace names" 
> mean their prefixes or the href, or both?

  both,

> It might be interesting for me to try building something on top of the 
> dictionary that caches Python unicode strings so that they don't 
> need to be regenerated all the time. Basically, if I understand it 
> correctly, dictionaries guarantee that there is only a single char* 
> pointer to a piece of textual data, so I could use that pointer as a 

  Yes, uniqueness of the pointer returned by the API is the main guarantee.
(Note that ptr+1 may not be unique, as "boo" and "foo" will be stored at
different locations.)
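
  A minimal sketch of that guarantee, assuming a toy interning table rather
than libxml2's real implementation (in libxml2 the corresponding call is
xmlDictLookup(); ToyDict, toy_intern and toy_free are hypothetical names
made up for this illustration):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy interning table (hypothetical, NOT libxml2's implementation):
 * each distinct string is stored once, so equal strings share a pointer. */
typedef struct ToyEntry {
    struct ToyEntry *next;
    char *str;
} ToyEntry;

typedef struct {
    ToyEntry *head;
} ToyDict;

/* Return the unique stored copy of `s`, adding it on first sight. */
const char *toy_intern(ToyDict *d, const char *s) {
    for (ToyEntry *e = d->head; e != NULL; e = e->next) {
        if (strcmp(e->str, s) == 0)
            return e->str;            /* already known: same pointer */
    }
    ToyEntry *e = malloc(sizeof *e);
    assert(e != NULL);
    e->str = malloc(strlen(s) + 1);
    assert(e->str != NULL);
    strcpy(e->str, s);
    e->next = d->head;
    d->head = e;
    return e->str;
}

/* Release every stored string; all interned pointers become invalid. */
void toy_free(ToyDict *d) {
    while (d->head != NULL) {
        ToyEntry *e = d->head;
        d->head = e->next;
        free(e->str);
        free(e);
    }
}
```

  With this, two lookups of "foo" return the same address and can be
compared with ==, while the "oo" tails of "foo" and "boo" have equal
content but live at different addresses.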

> >  they will point to freed memory. So don't free the dictionary until
> >it is not in use anymore. Use another one, but you will lose uniqueness
> >of strings.
> 
> Hm, that sounds tricky. If I have a bunch of documents that share the 
> same dictionary, how would I go ahead and clean a dictionary up? One way 
> would be to hunt all references to dictionaries and replace the 
> dictionary with another one. The other way would be to clean or shrink 
> the dictionary itself.

  You can remove the dictionary only when no document references it anymore.
Trying to change the dictionary of a document dynamically would be expensive
and very tricky,

> Both approaches have a problem I can't seem to figure my way out of:

  So don't do it.
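
  The "free it only when nobody references it" rule amounts to reference
counting. A rough sketch, assuming a hypothetical SharedDict type with the
string storage left out (libxml2 offers xmlDictReference() and xmlDictFree()
for this purpose; the code below does not call them):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical shared dictionary handle; string storage is omitted. */
typedef struct {
    int refs;   /* number of holders (e.g. documents) sharing the dict */
} SharedDict;

/* Create a dictionary owned by one holder. */
SharedDict *shared_dict_new(void) {
    SharedDict *d = malloc(sizeof *d);
    assert(d != NULL);
    d->refs = 1;
    return d;
}

/* A new document starts using the dictionary. */
void shared_dict_ref(SharedDict *d) {
    d->refs++;
}

/* A holder lets go. Returns 1 if the dictionary was really freed,
 * 0 if other holders still depend on its strings. */
int shared_dict_unref(SharedDict *d) {
    if (--d->refs > 0)
        return 0;
    free(d);
    return 1;
}
```

  Each document takes one reference when it is created and drops it when
it is freed, so the strings stay valid as long as any document can still
point at them.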

> The strings in the original dictionary (or the strings not known to the 
> dictionary anymore if the dictionary has been 'shrunk') will still be 

  You can't shrink a dictionary: there is no way to tell whether
a given string needs to be kept or discarded.
  But you can ask a dictionary whether it owns a string pointer really fast.
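
  One way such an ownership test can be fast is an address range check
against the memory the strings were copied into. This is only a sketch
under that assumption, with a single fixed-size arena and hypothetical
names (Arena, arena_add, arena_owns); in libxml2 the call to use is
xmlDictOwns():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE 1024

/* Hypothetical storage: all strings are copied into one buffer. */
typedef struct {
    char buf[ARENA_SIZE];
    size_t used;
} Arena;

/* Copy `s` into the arena; returns NULL when it does not fit.
 * (No de-duplication here, this sketch only shows ownership.) */
const char *arena_add(Arena *a, const char *s) {
    size_t n = strlen(s) + 1;
    if (n > ARENA_SIZE - a->used)
        return NULL;
    char *p = a->buf + a->used;
    memcpy(p, s, n);
    a->used += n;
    return p;
}

/* Ownership test: does `p` point into the used part of the arena?
 * Two integer comparisons, independent of the string's length. */
int arena_owns(const Arena *a, const char *p) {
    uintptr_t lo = (uintptr_t)a->buf;
    uintptr_t x = (uintptr_t)p;
    return x >= lo && x < lo + a->used;
}
```

  A caller can use such a check to decide whether a string pointer must be
freed separately or belongs to the dictionary and must be left alone.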

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


