Re: [xslt] libxslt processor dump/load



On Thu, Apr 26, 2007 at 06:36:19PM +0930, Andrew Mason wrote:
> Aside from reducing the  bulk of the XSLT, is there anything that can be done 
> from an XSL stylesheet POV or from a libxslt POV to reduce the  overhead  of 
> importing the stylesheet?

  Do you *really* need a 174 KByte stylesheet? I hope you understand
this is insanely large. The compiler cannot optimize based on the input
you are going to give it, so it has to compile everything even if only
2% is actually used by the tags embedded in your document.
  So really the #1 thing to do is reduce your stylesheet to match the
actual processing need. And remember, XSLT is really not a good
general-purpose programming language; it is only good at XML tree
transformation, and for other kinds of processing it is just not the
right tool.

> We've spent the last 2 weeks going over various solutions, axkit, mod_xslt, 
> xsltc etc. and I have posted to the xslt users list and none of them are 
> really optimal for our needs / setup for various reasons.

  I think sharing the compiled stylesheet between all your Apache processes
would be the #2 approach to follow. I have only a very old recollection of
the Apache memory model, but in a thread-based setup that should be doable.

> The best solution for us and indeed most small multi-site development 
> companies, would be to be able dump/save/cache in some way (disk would be 
> ideal), a complete xslt processor after it has loaded the stylesheet. On  a 
> request, get libxsl to 'import' this  pre-built xslt processor, at which we 
> can throw our xml to be transformed. 

  It's unlikely you would gain much, sorry. The compilation process is
actually relatively lightweight and the XML parser itself is extremely
fast. An ad-hoc parser for a dump format that has no standard, and hence
few users, would certainly be of poor quality and likely not much faster.

> First of all is this at all possible and secondly is it even a good idea?

  I don't think it's a good idea. It sounds feasible, but it would be an
awful lot of work. Really, go have a look at libxslt's internal data
structures: you would have to save the stylesheet tree too, since it is
still used during the transformation. So yes, it's really unlikely to be
faster.

> It's really not an overall performance thing, it's more about per request 
> latency. We aren't in a position to trade memory for latency due to how PHP 
> works, but we think with something like this, we could reach a more than 
> acceptable latency.

  You really should keep the compiled stylesheet in memory as a single
instance. libxslt itself allows this: the same compiled stylesheet can be
used by parallel threads, each running its own transformation.
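
A minimal sketch of that compile-once pattern, assuming Python with the
lxml binding to libxslt rather than the PHP setup discussed here. The tiny
stylesheet and the render() helper are made up for illustration, and the
sketch is single-threaded; it only shows the stylesheet being compiled
once and reused for every request.

```python
from lxml import etree

# Hypothetical stand-in for the real (much larger) stylesheet.
STYLESHEET = (
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:output method="text"/>'
    '<xsl:template match="/greeting">'
    'Hello, <xsl:value-of select="@name"/>!'
    '</xsl:template>'
    '</xsl:stylesheet>'
)

# Parse and compile exactly once, at start-up.
TRANSFORM = etree.XSLT(etree.XML(STYLESHEET))


def render(xml_text):
    """Apply the already-compiled stylesheet to one input document."""
    return str(TRANSFORM(etree.XML(xml_text)))


# Each "request" reuses TRANSFORM; nothing is recompiled.
outputs = [render(f'<greeting name="{name}"/>') for name in ("Andrew", "Daniel")]
for text in outputs:
    print(text)
```

The per-request cost is then only parsing the input document and running
the transformation. This only helps if the process holding TRANSFORM
outlives a single request.
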
  The last possibility, #3, would be to profile the compilation process
using xsltproc and a runtime profiler, to find out whether your XSLT code
triggers some non-linear behaviour. But again this would take time, and
learning your XSLT and seriously trimming it down still sounds like the
sanest approach.
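
Before reaching for a real profiler, a rough first check is to time the
parse-and-compile step at a few stylesheet sizes and see how it scales.
The sketch below assumes Python with the lxml binding to libxslt; the
make_stylesheet() generator and its trivial templates are hypothetical and
say nothing about how your actual stylesheet behaves.

```python
import time

from lxml import etree


def make_stylesheet(n_templates):
    """Build a synthetic stylesheet with n_templates trivial templates."""
    templates = "".join(
        f'<xsl:template match="item{i}"><b><xsl:apply-templates/></b>'
        f"</xsl:template>"
        for i in range(n_templates)
    )
    return (
        '<xsl:stylesheet version="1.0" '
        'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
        + templates
        + "</xsl:stylesheet>"
    )


def compile_seconds(stylesheet_text, runs=5):
    """Best-of-N wall-clock time to parse and compile the stylesheet."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        etree.XSLT(etree.XML(stylesheet_text))
        best = min(best, time.perf_counter() - start)
    return best


# If doubling the size much more than doubles the time, something in the
# compilation is behaving non-linearly for this kind of input.
timings = {n: compile_seconds(make_stylesheet(n)) for n in (100, 200, 400)}
for n, seconds in timings.items():
    print(f"{n} templates: {seconds * 1000:.2f} ms")
```

To measure the real thing, pass the text of your own stylesheet to
compile_seconds() instead of the generated one. xsltproc also has a
--timing option that reports the time spent in each phase.
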

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/

