Re: [xslt] speeding up xslt in gtk-doc



On Tue, May 03, 2011 at 06:14:44PM +0300, Stefan Kost wrote:
> hi,
> 
> I am still not happy with the time it takes for gtk-doc to generate the
> html. I did some profiling and tried various changes. The results are not
> too impressive. I am posting them here for feedback and comments on which
> directions would be acceptable. I technically have commit rights, so I
> can push things once they are ready. If patch 0002 looks like a good
> idea, I can extend it to other accessors.

  Heretical idea, but what about not using the standard DocBook
stylesheets for this, and instead using XSL tuned to what you actually
use and need?

> 
> The test case was generating the HTML for the gstreamer-core API docs.
> 
> real       user       sys
> ------------------------------ before
> 1m22.872s  1m17.317s  0m1.444s
> 1m19.261s  1m17.437s  0m1.384s
> 1m19.770s  1m17.417s  0m1.264s
> ------------------------------ 0001-xpath-annotate-branch-prediction.patch
> 1m18.123s  1m16.677s  0m1.256s
> 1m18.684s  1m17.181s  0m1.292s
> 1m18.887s  1m17.217s  0m1.364s

  Okay, this has a bit of an effect, at the expense of making the code
harder to read.
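
  For readers unfamiliar with this kind of annotation, here is a minimal
sketch of what branch-prediction hints usually look like in C. It assumes
GCC or Clang, which provide __builtin_expect. The LIKELY/UNLIKELY macros
and the count_nodes() helper are hypothetical names made up for
illustration; they are not taken from the patch or from libxml2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hint macros; they fall back to no-ops on other compilers. */
#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Illustrative helper (not libxml2 code): count the non-NULL entries
 * of a node table, hinting which way each test normally goes. */
static size_t count_nodes(void *const *nodes, size_t len)
{
    size_t i, n = 0;

    /* A missing table is assumed to be the rare case. */
    if (UNLIKELY(nodes == NULL))
        return 0;

    for (i = 0; i < len; i++) {
        /* Most entries are assumed to be populated. */
        if (LIKELY(nodes[i] != NULL))
            n++;
    }
    return n;
}
```

  The hints do not change the result, only the code layout the compiler
picks, which is why the gain tends to be small and every annotated test
becomes a little noisier to read.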

> ------------------------------
> 0002-xpath-split-traversal-into-init-and-next-functions.patch
> 1m18.537s  1m16.925s  0m1.388s
> 1m18.616s  1m16.913s  0m1.412s
> 1m18.249s  1m16.653s  0m1.328s

  This looks more harmful than anything else: it makes the code more
complex and increases the number of local variables in a function that
is already big, all to avoid a test against NULL which takes no time. NACK

> ------------------------------
> 0003-xpath-avoid-a-memcpy-on-the-expense-of-temporarily-w.patch
> 1m17.481s  1m16.137s  0m1.208s
> 1m17.977s  1m16.481s  0m1.328s
> 1m17.791s  1m16.413s  0m1.232s

  Why 100, then? Try adjusting it a bit.


  But really, I would not chase improvements there. Improving the
compilation of the XSLT stylesheet is far more likely to return actual
gains. For example, try to avoid dynamic lookup of variables at runtime:
most of those lookups can be resolved at compilation time.
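
  To make the idea concrete, here is a rough sketch of resolving variable
references once, when the stylesheet is compiled, so that evaluation only
indexes an array. All of the names here (Scope, VarRef, scope_declare,
scope_resolve, var_get) are hypothetical, and this is not how libxslt is
structured; it assumes a flat scope and numeric values purely to keep the
example short.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARS 8

/* Hypothetical compile-time scope: names declared so far, in order. */
typedef struct {
    const char *names[MAX_VARS];
    int count;
} Scope;

/* Hypothetical compiled reference: a slot index resolved up front. */
typedef struct {
    int slot;
} VarRef;

/* Compile time: declare a variable, return its slot or -1 if full. */
static int scope_declare(Scope *s, const char *name)
{
    if (s->count >= MAX_VARS)
        return -1;
    s->names[s->count] = name;
    return s->count++;
}

/* Compile time: the only place a string comparison happens.
 * The search runs backwards so the latest declaration wins. */
static int scope_resolve(const Scope *s, const char *name, VarRef *out)
{
    int i;

    for (i = s->count - 1; i >= 0; i--) {
        if (strcmp(s->names[i], name) == 0) {
            out->slot = i;
            return 0;
        }
    }
    return -1;
}

/* Run time: a plain array index, no name lookup at all. */
static double var_get(const double *frame, int nslots, VarRef ref)
{
    assert(ref.slot >= 0 && ref.slot < nslots);
    return frame[ref.slot];
}
```

  The string comparisons are paid once per reference at compile time
instead of once per evaluation, which matters when the same template is
applied to thousands of nodes.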

  But before optimizing, where is the time actually spent?
My attempts with gprof in the past were more or less useless. I used
callgrind/kcachegrind some 5-6 years ago to improve speed, and using
those and targeting the obvious candidates did indeed lead to
improvements.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/

