Re: [xml] Deep-copy of xsltStylesheetPtr?



On Sat, Sep 18, 2004 at 07:53:12AM -0700, Rasmus Lerdorf wrote:
On Sat, 18 Sep 2004, Daniel Veillard wrote:
The dictionary unification is also used for string interning of the
processed document, which allows pointer comparison to be used for a lot
of internal operations when the transformation occurs instead of
comparing string contents.
It seems you are using memory pools (the xmlMemSetup() trick). I think
this makes copying even harder: it becomes impossible to reuse the original
dictionary and just increment its use count. It seems that in such a
situation the best approach is to copy the initial document and recompile
the stylesheet.

A few more details on what I am doing.  By using xmlMemSetup()
and xmlCopyDoc() I am providing a way for PHP users to store an xmlDoc
struct in shared memory to be shared across Apache/PHP processes.  You get
some decent performance gains by only parsing XML files for an application
once instead of doing it on every single request.
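
As a rough illustration of the xmlMemSetup() approach described above, here is a minimal sketch of a bump allocator whose functions have the same shapes as the four callbacks xmlMemSetup() takes. The names pool_malloc, pool_free, pool_realloc, pool_strdup and POOL_SIZE are hypothetical, and a static array stands in for the shared memory segment; it is not the code used in PHP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pool; a real setup would place this in a shared segment. */
#define POOL_SIZE (64 * 1024)

typedef union {
    size_t size;        /* payload size of the block that follows */
    max_align_t align;  /* pads the header so payloads stay aligned */
} pool_header;

static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used = 0;

/* Same shape as xmlMallocFunc: void *(*)(size_t). */
static void *pool_malloc(size_t size)
{
    const size_t unit = sizeof(pool_header);
    if (pool_used + unit > POOL_SIZE)
        return NULL;
    size_t avail = POOL_SIZE - pool_used - unit;
    if (size > avail)
        return NULL;
    size_t padded = (size + unit - 1) / unit * unit;  /* keep alignment */
    if (padded > avail)
        return NULL;
    pool_header *h = (pool_header *)(pool + pool_used);
    h->size = size;
    pool_used += unit + padded;
    return h + 1;
}

/* Same shape as xmlFreeFunc. A bump allocator never reclaims blocks. */
static void pool_free(void *mem)
{
    (void)mem;
}

/* Same shape as xmlReallocFunc: allocate a new block and copy. */
static void *pool_realloc(void *mem, size_t size)
{
    void *fresh = pool_malloc(size);
    if (fresh == NULL || mem == NULL)
        return fresh;
    size_t old = ((pool_header *)mem - 1)->size;
    assert(old <= POOL_SIZE);
    memcpy(fresh, mem, old < size ? old : size);
    return fresh;
}

/* Same shape as xmlStrdupFunc. */
static char *pool_strdup(const char *str)
{
    size_t len = strlen(str) + 1;
    char *copy = pool_malloc(len);
    if (copy != NULL)
        memcpy(copy, str, len);
    return copy;
}

/* With libxml2 available, the registration and copy would look like:
 *   xmlMemSetup(pool_free, pool_malloc, pool_realloc, pool_strdup);
 *   xmlDocPtr shared = xmlCopyDoc(doc, 1);   // recursive copy into the pool
 */
```

Because every allocation made after the xmlMemSetup() call goes through these functions, the tree produced by xmlCopyDoc() ends up entirely inside the pool.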

  Okay, I was expecting something related to sharing in Apache, but
I have a clearer idea now.

Now I am trying to do the same for XSL transforms.  A very common thing
people do is fetch XML from some source and apply a stylesheet.  The
stylesheet rarely changes so I figured having it cached and ready to apply
without having to recompile it from disk on every request would speed
things up.  I obviously have no idea if it would be faster yet, and if you
tell me it wouldn't be I guess I should believe you.  ;)

  Is your shared memory area permanent? I think if you can 1/ load/parse
the stylesheet XML in shared memory to a tree, 2/ compile the stylesheet
also in shared memory, and if that shared memory is addressable directly
by the client processes, then yes, this is feasible and is a gain. But trying
to copy the preparsed form from the shared memory to duplicate it in another
part of the client's memory would probably be too complex and hard to
achieve to provide any significant gain. Note however that once compiled,
stylesheets are read-only for all further instance transformation processing,
so if you can point directly to the shared memory, you can use the
stylesheet pointer even if that memory is read-only (the really fun part
of it is that instances to be transformed may reuse the stylesheet
dictionary to speed up processing further, but that dictionary stays
read-only because new strings are allocated in an intermediate dictionary).
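
The read-only dictionary with an intermediate dictionary on top can be sketched as follows. This is a simplified model, not libxml2's implementation (libxml2 exposes a comparable idea through xmlDictCreateSub()): the dict type, dict_find, dict_intern and DICT_MAX are hypothetical names, and a linear search replaces the hash table a real dictionary would use. It shows that a string already interned by the stylesheet keeps its canonical pointer, so pointer comparison works, while new strings land only in the intermediate dictionary.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DICT_MAX 256  /* hypothetical capacity, enough for a sketch */

typedef struct dict {
    const struct dict *parent;  /* read-only dictionary, never written */
    char *strings[DICT_MAX];
    size_t count;
} dict;

/* Look for an interned copy in d and its parents without modifying them. */
static const char *dict_find(const dict *d, const char *s)
{
    for (; d != NULL; d = d->parent)
        for (size_t i = 0; i < d->count; i++)
            if (strcmp(d->strings[i], s) == 0)
                return d->strings[i];
    return NULL;
}

/* Return the canonical pointer for s. New strings are stored in d only.
 * Returns NULL if d is full or memory runs out. */
static const char *dict_intern(dict *d, const char *s)
{
    const char *hit = dict_find(d, s);
    if (hit != NULL)
        return hit;
    if (d->count == DICT_MAX)
        return NULL;
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    d->strings[d->count++] = copy;
    return copy;
}
```

A transformation would create one intermediate dict per instance document, set its parent to the stylesheet dictionary, and intern every name through it; two names are then equal exactly when their pointers are equal.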

If the stylesheet compile really is fast then I can just cache the raw
stylesheet document in shared memory as you suggested and recompile it
when I need it.  Is there perhaps some middle stage I could cache though?
Like reduce it to a set of tokens and store those tokens to speed up the
recompile?  Or store any other sort of hint that could in some way speed
up the recompile?

  I think you should try to keep the compiled stylesheet in the shared
memory and, if possible, address it directly from the child processes;
that would be optimal, I think. What would prevent sharing that
way?
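
One condition that direct addressing depends on is that the region appears at the same address in every process, since a compiled structure is full of internal pointers. The sketch below assumes a POSIX system and a prefork-style server where the parent creates the mapping before forking. The shared_blob type and shared_pointer_check function are hypothetical stand-ins, not libxslt structures.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a compiled stylesheet: a structure that
 * holds a pointer into its own shared region. */
typedef struct {
    const char *name;  /* points at text[] below */
    char text[32];
} shared_blob;

/* Returns 0 when a forked child could follow the internal pointer,
 * 1 when it could not, and -1 on a system call failure. */
static int shared_pointer_check(void)
{
    shared_blob *blob = mmap(NULL, sizeof *blob, PROT_READ | PROT_WRITE,
                             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (blob == MAP_FAILED)
        return -1;

    strcpy(blob->text, "xsl:template");
    blob->name = blob->text;

    /* Freeze the region once it is built; readers only need PROT_READ. */
    if (mprotect(blob, sizeof *blob, PROT_READ) != 0) {
        munmap(blob, sizeof *blob);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(blob, sizeof *blob);
        return -1;
    }
    if (pid == 0) {
        /* Child: the mapping is inherited at the same address, so the
         * pointer stored by the parent is still valid here. */
        _exit(strcmp(blob->name, "xsl:template") == 0 ? 0 : 1);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(blob, sizeof *blob);
    if (waited < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

If unrelated processes attached the segment later at different addresses, the stored pointers would no longer be valid, which is one practical reason to map before forking.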

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


