Hi,
I tried to run a number of XSLTs that were generated by mapping software. These XSLTs are huge, containing >150k templates, ~100k parameters, and ~10k variables. Despite the size, the execution time is fine. But the initial compilation of the stylesheet takes >35 s (Linux Xeon5690). That is when I got callgrind out to find the hotspot.
I think I have tracked it down to the following code in xslt.c, where xmlStrEqual is called in a very long-running loop:
https://github.com/GNOME/libxslt/blob/master/libxslt/xslt.c#L5410
I ran the measurements with 1.1.28, the repository on git.gnome.org, GNOME/libxslt on github, and the forks of one or two other people on github. In all cases the compilation of my stylesheet takes very long, and each time the profile points to the same or similar code.
At the moment I do not have a deep understanding of the libxslt code. My current guess is that this is the point at which the template name is validated to be unique in the stylesheet. The current algorithm seems to walk linearly through the list of all templates created so far and compare the new name with each of them. Since the compilation process visits every template, and each visit runs this loop, the check as a whole is O(n^2) (with n being the number of templates in the stylesheet to compile).
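To illustrate the pattern I believe I am seeing, here is a small self-contained sketch. It is not the libxslt code: `demo_template` and `demo_count_comparisons` are hypothetical names made up for this mail. It registers n distinct template names, checks each new name against all names registered so far, and counts the string comparisons.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a compiled template; not the libxslt struct. */
typedef struct demo_template {
    struct demo_template *next;
    char name[32];
} demo_template;

/* Registers n distinct names and returns the number of string
 * comparisons spent on the linear uniqueness check. */
static unsigned long long demo_count_comparisons(size_t n)
{
    demo_template *head = NULL;
    unsigned long long comparisons = 0;

    for (size_t i = 0; i < n; i++) {
        demo_template *tpl = malloc(sizeof(*tpl));
        if (tpl == NULL)
            break;
        snprintf(tpl->name, sizeof(tpl->name), "template-%zu", i);

        /* Uniqueness check: walk every template registered so far. */
        for (const demo_template *cur = head; cur != NULL; cur = cur->next) {
            comparisons++;
            if (strcmp(cur->name, tpl->name) == 0)
                break; /* duplicate found; cannot happen with these names */
        }

        /* Prepend the new template to the list. */
        tpl->next = head;
        head = tpl;
    }

    /* Release the list again. */
    while (head != NULL) {
        demo_template *next = head->next;
        free(head);
        head = next;
    }
    return comparisons;
}
```

With all names distinct, the count is n(n-1)/2, so `demo_count_comparisons(1000)` returns 499500. Extrapolated to n = 150,000 that is roughly 1.1 * 10^10 comparisons, which would fit the compile time I measured if my reading of the loop is right.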
With the huge number of templates in my XSLTs, this quadratic cost dominates and the compilation takes 35 s. I ran a test in which I skipped the loop. This reduced the compilation time to below 2.5 s.
Could anyone let me know whether I have understood the code correctly? What kind of change would have a good chance of being accepted and released? I am open to any suggestions on how to speed up this validation step, preferably with a data structure or mechanism that already exists.
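As a starting point for discussion, this is roughly the kind of lookup I have in mind: a hash set keyed by template name plus namespace URI, so that each uniqueness check touches only one bucket instead of the whole list. Everything prefixed `demo_` is hypothetical and written only for this mail. libxml2 already provides a hash table API in libxml/hash.h (for example xmlHashCreate and xmlHashAddEntry2), which might be the better fit inside libxslt; the sketch avoids it only to stay dependency-free.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_BUCKETS 4096u

/* Hypothetical types for illustration; none of these exist in libxslt. */
typedef struct demo_entry {
    struct demo_entry *next;
    char *name;
    char *ns_uri; /* NULL when the name has no namespace */
} demo_entry;

typedef struct demo_name_set {
    demo_entry *buckets[DEMO_BUCKETS];
} demo_name_set;

static char *demo_copy(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Two strings match if both are NULL or both have equal content. */
static int demo_same(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a, b) == 0;
}

/* djb2-style hash over the name and, if present, the namespace URI. */
static unsigned demo_hash(const char *name, const char *ns_uri)
{
    unsigned long h = 5381;
    for (; *name != '\0'; name++)
        h = h * 33 + (unsigned char)*name;
    if (ns_uri != NULL)
        for (; *ns_uri != '\0'; ns_uri++)
            h = h * 33 + (unsigned char)*ns_uri;
    return (unsigned)(h % DEMO_BUCKETS);
}

static demo_name_set *demo_name_set_new(void)
{
    return calloc(1, sizeof(demo_name_set));
}

/* Returns 1 if the name was added, 0 if it was already present,
 * and -1 if memory allocation failed. */
static int demo_name_set_add(demo_name_set *set, const char *name,
                             const char *ns_uri)
{
    unsigned slot = demo_hash(name, ns_uri);
    demo_entry *e;

    /* Only the entries in one bucket are compared. */
    for (e = set->buckets[slot]; e != NULL; e = e->next)
        if (demo_same(e->name, name) && demo_same(e->ns_uri, ns_uri))
            return 0;

    e = calloc(1, sizeof(*e));
    if (e == NULL)
        return -1;
    e->name = demo_copy(name);
    e->ns_uri = (ns_uri != NULL) ? demo_copy(ns_uri) : NULL;
    if (e->name == NULL || (ns_uri != NULL && e->ns_uri == NULL)) {
        free(e->name);
        free(e->ns_uri);
        free(e);
        return -1;
    }
    e->next = set->buckets[slot];
    set->buckets[slot] = e;
    return 1;
}

static void demo_name_set_free(demo_name_set *set)
{
    if (set == NULL)
        return;
    for (size_t i = 0; i < DEMO_BUCKETS; i++) {
        demo_entry *e = set->buckets[i];
        while (e != NULL) {
            demo_entry *next = e->next;
            free(e->name);
            free(e->ns_uri);
            free(e);
            e = next;
        }
    }
    free(set);
}
```

A return value of 0 from `demo_name_set_add` would correspond to the duplicate-name error that the current loop detects. The fixed bucket count is an arbitrary choice for the sketch; a real patch would need a table that grows with the number of templates.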
Which repository is currently the development (working) repository for libxslt?
I found the git repository on git.gnome.org, GNOME/libxslt on github, plus several forks which seem to be working repos. Could you please point me to the repo I should use for further investigation and patches?
Best regards,
Christian