Re: The fast version of gtk-doc.xsl



On Mon, 2004-01-05 at 13:27, Shaun McCance wrote:
> On Mon, 2004-01-05 at 10:45, Owen Taylor wrote:
> > On Sun, 2003-12-28 at 18:31, Callum wrote:
> > > First of all let me apologise for the binary attachment. Inside the
> > > tarball there is a drop-in replacement for gtk-doc.xsl. It is a factor of
> > > 20 faster than the docbook xsl and supports everything that glib, gdk and
> > > gtk use. It isn't perfect yet, but most of its problems are in the HTML it
> > > produces rather than its interpretation of docbook. It is also rather
> > > messy.
> > > 
> > > It handles all the cross-references and links perfectly. It produces an
> > > index, but since I'm not entirely sure what is meant to be indexed, this
> > > doesn't really work. It doesn't handle the devhelp stuff either, but I
> > > think that fixing that is just a matter of an include and a call-template.
> > > 
> > > I haven't yet written a DTD for what it supports, but there is a document
> > > to tell humans what it supports.
> > > 
> > > Ultimately I'm not sure that using it as a general replacement for all
> > > gtk-doc is appropriate but creating a --fast option for people who are
> > > confident that their docbook conforms to this subset could be a good idea.
> > 
> > Wouldn't it be a lot more useful to work on figuring out just what parts
> > of the standard DocBook stylesheets make it slow.
> 
> I see this phrase tossed around quite a bit.  There is no such thing as
> standard DocBook stylesheets.  The stylesheets you're referring to were
> developed by Norm Walsh, and are now maintained by some developers on
> SourceForge.  They are not developed or endorsed by the DocBook Technical
> Committee or any other OASIS group.  Venerable DocBook wizard though he
> is, Norm's involvement on a DocBook-related project does not make it a
> standard.
> 
> Calling those stylesheets standard is like calling GTK+ the standard X11
> toolkit or calling libxml2 the standard XML processor.

If you have another set of stylesheets that handle DocBook (the whole
thing) and that we could use instead of Norman Walsh's set, then I might
consider changing my terminology... :-)

> > There are two basic possibilities, as I see it:
> > 
> >  - The slowness is completely due to the size of the stylesheets. 
> >    Bigger stylesheets are slower. This would indicate most likely
> >    fixable problems in the libxslt code.
> 
> This isn't really a problem.  The load time on the stylesheets is really
> negligible compared to other problems.  We're talking fractions of a
> second here.

Load time isn't necessarily the only reason that big stylesheets could
inherently be slow stylesheets, but that's good to hear.

> >  - There is some particular feature in the DocBook stylesheets that
> >    is causing slowness. If this feature isn't used in the gtk-doc
> >    output, then again it is likely a bug in the libxslt code.
> >  
> >    If the feature is "used" in some way, and can't be made faster,
> >    then perhaps a parameter could be added to the XSLT stylesheets
> >    to disable just that one feature.
> 
> This isn't really the problem either.  It's not top-level fluff features
> that you can just turn off.  It's deeper than that.  The speed problems
> start at the very core.
> 
> XSLT is not designed to be optimized.  To write fast stylesheets, you
> have to develop with speed as your primary goal from the outset.

I'm highly skeptical of that claim. Not with any good practical
basis, mind you, but languages aren't "designed to be optimized".
If something is really horribly slower than it could be (by rewriting
it), then it's very likely that one small part is responsible for 
that slowness.

> There are all sorts of problems that are very difficult to get rid of
> with some customizations.  Result trees are passed through parameters
> and copied multiple times.  Sometimes they're even re-processed using
> exsl:node-set.  Selections are routinely made on the descendant axis. 
> Section ordering and labelling is very non-optimal.

Well, probably one or two of these problems are responsible for the
slowness in the gtk-doc case. Let's get those fixed and put those
fixes upstream.
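
To make the descendant-axis point concrete, here is a rough sketch of
the difference between re-scanning the tree for every cross-reference
and building an id index once. It is a Python analogy, not XSLT, and the
element names, ids and helper functions are made up for illustration. In
XSLT 1.0 the indexed form would correspond to an xsl:key declaration
plus key() lookups; whether that is the hot spot for gtk-doc is an
assumption that only profiling could confirm.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a gtk-doc style document (hypothetical content).
DOC = """<book>
  <refentry id="GtkWidget"><refname>GtkWidget</refname></refentry>
  <refentry id="GtkButton"><refname>GtkButton</refname>
    <para>See <link linkend="GtkWidget"/> and <link linkend="GtkButton"/>.</para>
  </refentry>
</book>"""


def resolve_by_scan(root, linkend):
    # Analogue of //*[@id = $linkend]: walks the tree again for every link,
    # so the total cost grows with (number of links) x (size of document).
    for el in root.iter():
        if el.get("id") == linkend:
            return el
    return None


def build_id_index(root):
    # Analogue of declaring a key on @id: one pass over the document,
    # after which each lookup is a dictionary access.
    return {el.get("id"): el for el in root.iter() if el.get("id") is not None}


root = ET.fromstring(DOC)
index = build_id_index(root)
links = [link.get("linkend") for link in root.iter("link")]

scan_targets = [resolve_by_scan(root, target) for target in links]
key_targets = [index.get(target) for target in links]
```

Both approaches resolve the same targets; only the amount of tree
walking differs, which is why a fix of this kind could go upstream
without changing the output.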

> > But in general, if you can write a new stylesheet that is 20x faster,
> > then that's some evidence that it should be pretty simple to make the 
> > existing stylesheets 5x faster.
> 
> By the time you finished making those stylesheets sufficiently fast,
> you'll have replaced 3/4 of the XSLT.  You'll have dug into templates
> that aren't intended to be public API, so there will be a good chance
> your customizations will break on the next version.

I'm not talking about customizations. I'm talking about fixing
it upstream. In the good old open source fashion.

> > I don't think we really want to be in the position of maintaining
> > a DocBook subset and a new set of stylesheets for that subset.
> 
> I don't think you want to be in that position either.  I'm not all that
> fond of being in that position myself.  DocBook is huge.  I also suspect
> that by the time you get these stylesheets complete enough, they'll slow
> down significantly.  Simple stylesheets that just turn <para> into <p>
> and the like are of course going to be fast.
> 
> You'll start to get bogged with stuff like section numbering, finding
> the appropriate chunk and fragment for cross-references, table-handling,
> footnotes, indexes, glossaries, and other fun stuff.  To get this stuff
> fast, you have to be paying attention the whole time.  You have to know
> how XSLT works and what things are likely to slow you down.  And you
> have to profile.  Profile, profile, profile.
> 
> All of that was the generic "you", of course.

Well, for gtk-doc, we need at least cross-references, tables, and
indices... it's not clear to me that there is anything about the 
processing we need to do for gtk-doc that is essentially simpler than
the general case. After all, we are basically talking about writing
large, complex books here with huge amounts of cross-referencing.

Regards,
						Owen




