Re: The fast version of gtk-doc.xsl

On Mon, 2004-01-05 at 10:45, Owen Taylor wrote:
> On Sun, 2003-12-28 at 18:31, Callum wrote:
> > First of all let me apologise for the binary attachment. Inside the
> > tarball there is a drop-in replacement for gtk-doc.xsl. It is a factor of
> > 20 faster than the docbook xsl and supports everything that glib, gdk and
> > gtk use. It isn't perfect yet, but most of its problems are in the HTML it
> > produces rather than its interpretation of docbook. It is also rather
> > messy.
> > 
> > It handles all the cross-references and links perfectly. It produces an
> > index, but since I'm not entirely sure what is meant to be indexed, this
> > doesn't really work. It doesn't handle the devhelp stuff either, but I
> > think that fixing that is just a matter of an include and a call-template.
> > 
> > I haven't yet written a DTD for what it supports, but there is a document
> > to tell humans what it supports.
> > 
> > Ultimately I'm not sure that using it as a general replacement for all
> > gtk-doc is appropriate but creating a --fast option for people who are
> > confident that their docbook conforms to this subset could be a good idea.
> Wouldn't it be a lot more useful to work on figuring out just what parts
> of the standard DocBook stylesheets make it slow?

I see this phrase tossed around quite a bit.  There is no such thing as
standard DocBook stylesheets.  The stylesheets you're referring to were
developed by Norm Walsh, and are now maintained by some developers on
SourceForge.  They are not developed or endorsed by the DocBook Technical
Committee or any other OASIS group.  Venerable DocBook wizard though he
is, Norm's involvement in a DocBook-related project does not make it a
standard.

Calling those stylesheets standard is like calling GTK+ the standard X11
toolkit or calling libxml2 the standard XML processor.

> There are two basic possibilities, as I see it:
>  - The slowness is completely due to the size of the stylesheets. 
>    Bigger stylesheets are slower. This would indicate most likely
>    fixable problems in the libxslt code.

This isn't really a problem.  The load time on the stylesheets is really
negligible compared to other problems.  We're talking fractions of a
second here.

>  - There is some particular feature in the DocBook stylesheets that
>    is causing slowness. If this feature isn't used in the gtk-doc
>    output, then again it is likely a bug in the libxslt code.
>    If the feature is "used" in some way, and can't be made faster,
>    then perhaps a parameter could be added to the XSLT stylesheets
>    to disable just that one feature.

This isn't really the problem either.  It's not top-level fluff features
that you can just turn off.  It's deeper than that.  The speed problems
start at the very core.

XSLT is not designed to be optimized.  To write fast stylesheets, you
have to develop with speed as your primary goal from the outset.

There are all sorts of problems that are very difficult to get rid of
with some customizations.  Result trees are passed through parameters
and copied multiple times.  Sometimes they're even re-processed using
exsl:node-set.  Selections are routinely made on the descendant axis. 
Section ordering and labelling is very non-optimal.
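
To make the descendant-axis point concrete, here is a rough sketch in
Python rather than XSLT.  It is not taken from any real stylesheet; the
tiny document and the function names are made up for illustration.
resolve_by_scan walks the whole tree for every cross-reference, much
like a select on //*[@id = $linkend] would, while build_id_index walks
it once and then answers lookups from a table, which is roughly what
xsl:key and key() give you in XSLT.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document; element names and ids are invented.
DOC = ET.fromstring(
    "<book>"
    "<chapter id='intro'><section id='s1'>"
    "<para>See <xref linkend='s2'/>.</para>"
    "</section></chapter>"
    "<chapter id='api'><section id='s2'>"
    "<para>See <xref linkend='intro'/>.</para>"
    "</section></chapter>"
    "</book>"
)


def resolve_by_scan(root, target_id):
    """Walk every descendant on each lookup (the slow pattern)."""
    for node in root.iter():
        if node.get("id") == target_id:
            return node
    return None


def build_id_index(root):
    """Walk the tree once and remember where each id lives."""
    return {
        node.get("id"): node
        for node in root.iter()
        if node.get("id") is not None
    }


def resolve_all(root, lookup):
    """Resolve every xref with the given lookup function."""
    return [
        (xref.get("linkend"), lookup(xref.get("linkend")).tag)
        for xref in root.iter("xref")
    ]


index = build_id_index(DOC)
slow = resolve_all(DOC, lambda target: resolve_by_scan(DOC, target))
fast = resolve_all(DOC, index.get)
```

Both give the same answers.  The difference is that the scan costs one
full tree walk per cross-reference, so the total work grows with
(number of xrefs) x (size of document), while the index costs one walk
up front.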

> But in general, if you can write a new stylesheet that is 20x faster,
> then that's some evidence that it should be pretty simple to make the 
> existing stylesheets 5x faster.

By the time you finish making those stylesheets sufficiently fast,
you'll have replaced 3/4 of the XSLT.  You'll have dug into templates
that aren't intended to be public API, so there will be a good chance
your customizations will break on the next version.

> I don't think we really want to be in the position of maintaining
> a DocBook subset and a new set of stylesheets for that subset.

I don't think you want to be in that position either.  I'm not all that
fond of being in that position myself.  DocBook is huge.  I also suspect
that by the time you get these stylesheets complete enough, they'll slow
down significantly.  Simple stylesheets that just turn <para> into <p>
and the like are of course going to be fast.

You'll start to get bogged down with stuff like section numbering, finding
the appropriate chunk and fragment for cross-references, table-handling,
footnotes, indexes, glossaries, and other fun stuff.  To get this stuff
fast, you have to be paying attention the whole time.  You have to know
how XSLT works and what things are likely to slow you down.  And you
have to profile.  Profile, profile, profile.
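
Section numbering is a reasonable example of what paying attention
looks like.  The sketch below is Python, not XSLT, and the document and
the label_sections name are invented for illustration.  It assumes
simple nested <section> elements and numbers them in a single top-down
pass, handing each child its parent's label, instead of having every
section count its ancestors and preceding siblings on its own.  In XSLT
the analogous move would be passing the label down with xsl:with-param.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested sections; titles are placeholders.
DOC = ET.fromstring(
    "<book>"
    "<section><title>A</title>"
    "<section><title>A1</title></section>"
    "<section><title>A2</title></section>"
    "</section>"
    "<section><title>B</title>"
    "<section><title>B1</title></section>"
    "</section>"
    "</book>"
)


def label_sections(parent, prefix=""):
    """Number sections in one pass, giving each child its full label."""
    labels = []
    for position, section in enumerate(parent.findall("section"), start=1):
        label = f"{prefix}{position}"
        labels.append((label, section.findtext("title")))
        # Children inherit the label computed here; nothing is recounted.
        labels.extend(label_sections(section, prefix=label + "."))
    return labels


labels = label_sections(DOC)
```

Each section is visited exactly once, so the cost stays linear in the
number of sections no matter how deep the nesting goes.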

All of that was the generic "you", of course.

