Plan for the YelpDocument branch



A long while ago (okay, a couple of weeks ago), I asked Don to
post an outline of his plan for the yelp-spoon branch, and said
I'd do the same for the yelp-document branch.  Don lived up to
his end, so here's mine.

YelpDocument will be replacing YelpPager as the API for getting
HTML pages out of documents.  YelpPager's API basically assumes
that we will process a given document once, and it fires off
signals every time it gets a page.  This has disadvantages:

1) Pre-paginated documents (e.g. Mallard) are forced into a
model that just doesn't make sense.  The price of this will
be performance penalties.

2) YelpWindow (or any other hypothetical pager consumer) has
to have a lot of logic to track which page it wants and when
it's available and all that crap.

3) There's no way for the pager to know which pages are being
viewed, or when pages were last viewed, or anything like that.
This makes it very difficult to clean up.

4) For the same reason, the pager can't push updated versions
of a page.  This means that reload logic has to be handled
inside YelpWindow, and it has no way of signaling to other
windows that they should also reload.

The YelpPager API was my brainchild from way back in 2003 when
I first took over Yelp.  It encapsulated the way Yelp already
worked into a single API, so that YelpWindow wouldn't have to
jump around for different kinds of documents.

Four years is a long time to learn from one's mistakes.  (Wow,
four years.  I didn't even realize it'd been that long until
I just looked at the SVN history.)

Basically, a YelpDocument represents a single document.  It
makes no assumptions about how or when we transform stuff.
YelpWindow requests a page by the page's ID, providing a
single callback function, and getting a request ID back.

YelpDocument calls that callback whenever it's got something
to say about that page.  That could be immediately, or after
we've run some XSLT, or after we've called Beagle or Tracker,
or after we've sent an automatic email to Joachim asking him
to write some documentation and he replies to the email with
complete documentation which is then loaded into Yelp. ;-)

When a YelpWindow moves on, it releases its page request using
the request ID it got when it made the request.  This tells the
YelpDocument to stop sending information about that page.  The
YelpDocument may or may not free the contents when there are
no more open requests.  If it does, it has to redo whatever
work is needed the next time the page is requested.  It might
implement some logic where it frees data after a certain amount
of time of not being accessed.  The point is, YelpWindow doesn't
care.
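
To make the request/callback/release cycle concrete, here is a
rough sketch in plain C.  Every name in it (SketchDocument,
sketch_document_request_page, and so on) is made up for
illustration and is not the API on the branch.  It also assumes
a small fixed-size request table rather than anything
GObject-based.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_REQUESTS 16

/* Hypothetical kinds of news a document can report about a page. */
typedef enum { SKETCH_SIGNAL_PAGE, SKETCH_SIGNAL_ERROR } SketchSignal;

/* The single callback a consumer supplies when requesting a page. */
typedef void (*SketchCallback)(int req_id, SketchSignal signal,
                               const char *contents, void *user_data);

typedef struct {
    int            id;            /* 0 means this slot is free */
    char           page_id[64];
    SketchCallback callback;
    void          *user_data;
} SketchRequest;

typedef struct {
    SketchRequest requests[SKETCH_MAX_REQUESTS];
    int           next_id;
} SketchDocument;

void sketch_document_init(SketchDocument *doc) {
    memset(doc, 0, sizeof *doc);
    doc->next_id = 1;
}

/* Register interest in a page.  Returns a request ID, or 0 if the
 * (assumed) fixed-size table is full. */
int sketch_document_request_page(SketchDocument *doc, const char *page_id,
                                 SketchCallback callback, void *user_data) {
    for (int i = 0; i < SKETCH_MAX_REQUESTS; i++) {
        SketchRequest *req = &doc->requests[i];
        if (req->id == 0) {
            req->id = doc->next_id++;
            snprintf(req->page_id, sizeof req->page_id, "%s", page_id);
            req->callback = callback;
            req->user_data = user_data;
            return req->id;
        }
    }
    return 0;
}

/* The document calls this whenever it has something to say about a
 * page.  Returns how many open requests were notified. */
int sketch_document_notify(SketchDocument *doc, const char *page_id,
                           SketchSignal signal, const char *contents) {
    int notified = 0;
    for (int i = 0; i < SKETCH_MAX_REQUESTS; i++) {
        SketchRequest *req = &doc->requests[i];
        if (req->id != 0 && strcmp(req->page_id, page_id) == 0) {
            req->callback(req->id, signal, contents, req->user_data);
            notified++;
        }
    }
    return notified;
}

/* Release a request by ID.  Returns the number of requests still open
 * on the same page (0 means the document is free to drop the page
 * contents), or -1 if the ID is unknown. */
int sketch_document_release(SketchDocument *doc, int req_id) {
    SketchRequest *found = NULL;
    int remaining = 0;
    if (req_id <= 0)
        return -1;
    for (int i = 0; i < SKETCH_MAX_REQUESTS; i++)
        if (doc->requests[i].id == req_id)
            found = &doc->requests[i];
    if (found == NULL)
        return -1;
    found->id = 0;  /* page_id stays readable for the count below */
    for (int i = 0; i < SKETCH_MAX_REQUESTS; i++)
        if (doc->requests[i].id != 0 &&
            strcmp(doc->requests[i].page_id, found->page_id) == 0)
            remaining++;
    return remaining;
}

/* Example consumer callback: counts notifications in *user_data. */
void sketch_count_callback(int req_id, SketchSignal signal,
                           const char *contents, void *user_data) {
    (void) req_id; (void) signal; (void) contents;
    (*(int *) user_data)++;
}
```

The consumer only ever holds a request ID.  Whether the document
keeps, frees, or times out the page contents once the count of
open requests hits zero is entirely the document's business.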

YelpWindow will, of course, have to be changed considerably to
use YelpDocument.  Preferably, we'd want a factory API where
we can hand it a URI/filename and just get a YelpDocument back.
Since a YelpDocument could hold whatever metadata it needs to,
there's really no reason for YelpDocInfo.  (At least not in the
header files.  A small struct might be convenient in the TOC.)
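
As a rough idea of what the dispatch half of such a factory could
look like, here is a sketch in plain C.  The function names and
the URI rules (a "man:" or "info:" scheme, an ".xml" suffix for
DocBook, a ".page" suffix for Mallard) are assumptions made up
for the example, not what the branch does.  A real factory would
go on to construct and return the document object; this only
picks the type.

```c
#include <string.h>

/* Hypothetical subset of the document types listed in this mail. */
typedef enum {
    SKETCH_DOC_UNKNOWN,
    SKETCH_DOC_DOCBOOK,
    SKETCH_DOC_INFO,
    SKETCH_DOC_MALLARD,
    SKETCH_DOC_MAN
} SketchDocType;

static int sketch_has_prefix(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

static int sketch_has_suffix(const char *s, const char *suffix) {
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Pick a document type from a URI or filename.  The rules here are
 * assumptions for illustration only. */
SketchDocType sketch_document_type_for_uri(const char *uri) {
    if (sketch_has_prefix(uri, "man:"))
        return SKETCH_DOC_MAN;
    if (sketch_has_prefix(uri, "info:"))
        return SKETCH_DOC_INFO;
    if (sketch_has_suffix(uri, ".xml"))
        return SKETCH_DOC_DOCBOOK;
    if (sketch_has_suffix(uri, ".page"))
        return SKETCH_DOC_MALLARD;
    return SKETCH_DOC_UNKNOWN;
}

/* Map a type back to the class name used in the list of types. */
const char *sketch_document_type_name(SketchDocType type) {
    switch (type) {
    case SKETCH_DOC_DOCBOOK: return "YelpDocbook";
    case SKETCH_DOC_INFO:    return "YelpInfo";
    case SKETCH_DOC_MALLARD: return "YelpMallard";
    case SKETCH_DOC_MAN:     return "YelpMan";
    default:                 return "(unknown)";
    }
}
```

The caller never names a concrete type; it hands over a string
and gets back whatever kind of document fits.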

I had envisioned the following document types:
  YelpDocbook: a single DocBook document
  YelpInfo:    a single info document
  YelpInfoToc: the TOC listings for info pages
  YelpMallard: a single Mallard document
  YelpMan:     a single man page
  YelpManToc:  the TOC listings for man pages
  YelpSearch:  a singleton document managing all searches
  YelpToc:     the main TOC listings

Clearly, the three bits that deal with TOCs need to be
coordinated with Don's Spoon work.  But the rest of them
should be largely independent of that work.

I have YelpDocbook basically working right now.  You can
check out the yelp-document branch and 'make test-document'
to see it in action.  Another goody is YelpTransform, which
wraps libxslt in a single callback-based API and happens to
be threaded internally.  I expect YelpTransform will reduce
the amount of code in most documents.

--
Shaun




