Re: Yelp document chunking



On Tue, 2003-05-20 at 03:45, Mikael Hallendal wrote:
> On Tue, 2003-05-20 at 04:17, John Fleck wrote:
> 
> > Is it reasonable to have the chunked atom be one level below the root
> > element of the doc? So an article would be chunked at the sect1 level,
> > while a book would be chunked at the chapter level? Is that the sort of
> > thing you're thinking?
> 
> This solution sounds very nice to me. 

Upon thinking about it, I've decided that I really like this "flat"
approach.  The sidebar would then be a simple list, rather than a tree. 
This would also make it easier to present the sidebar differently, which
could be useful for any UI redesign that comes about for bug #91610.  On
a related note, I would like the toc and front matter to be listed in
the sidebar.

However, the one problem I have with the flat approach is that Yelp
doesn't currently support scrolling to fragments.  So let's say I've
written an application which has a dialog with a Help button.  This Help
button points to Section 3.5.7 of my document.  With the flat approach,
you'll just get Section 3 loaded in Yelp, and you'll be looking at the
very beginning of it.  This is a problem.

We should have fragment identification.  So the URI

file:///path/to/file.xml?section3#section357

should load the chunk with id 'section3' and scroll to the node with id
'section357'.  However, this is only a partial solution.  App developers
shouldn't have to worry about how we're chunking.  They should be able
to have the Help button point to

file:///path/to/file.xml?section357

Then Yelp should be able to figure out that 'section3' is the relevant
chunk, and that 'section357' is a place to scroll to in that chunk.  If
we can implement scrolling to a fragment (which really, really ought to
work), then this lookup can be accomplished by setting up a link
translation table.

<translation_table file="file:///path/to/file.xml">
   <translation>
      <source chunk="section357"/>
      <destination chunk="section3" fragment="section357"/>
   </translation>
</translation_table>
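
To make the lookup concrete, here is a rough sketch in Python.  It is
only an illustration, not anything Yelp does today: the names
load_table and resolve are made up, and I'm assuming the table format
shown above.  A URI that already carries a fragment is left alone; a
bare id is looked up in the table.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# The translation table from above, inlined so the sketch is self-contained.
TABLE_XML = """\
<translation_table file="file:///path/to/file.xml">
   <translation>
      <source chunk="section357"/>
      <destination chunk="section3" fragment="section357"/>
   </translation>
</translation_table>
"""


def load_table(xml_text):
    """Map each source id to a (chunk, fragment) pair (hypothetical helper)."""
    table = {}
    for entry in ET.fromstring(xml_text).findall("translation"):
        source = entry.find("source").get("chunk")
        dest = entry.find("destination")
        table[source] = (dest.get("chunk"), dest.get("fragment"))
    return table


def resolve(uri, table):
    """Return (chunk, fragment) for a URI like file.xml?id or file.xml?id#frag."""
    parts = urlsplit(uri)
    requested = parts.query
    fragment = parts.fragment or None
    # Only translate when the caller did not name a fragment explicitly.
    if fragment is None and requested in table:
        return table[requested]
    return requested, fragment


table = load_table(TABLE_XML)
print(resolve("file:///path/to/file.xml?section357", table))
print(resolve("file:///path/to/file.xml?section3#section357", table))
```

Both calls should come out as the chunk 'section3' with the fragment
'section357', which is the whole point: the app developer never has to
know where the chunk boundaries are.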

Of course, it doesn't have to be an XML format.  It could just be a hash
table in memory.  However, generating an XML translation table would be
fairly simple with XSLT.  On the other hand, it's probably not too
difficult to do it in C with libxml2 either.
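
For what it's worth, building the in-memory version is only a few
lines.  The sketch below is in Python rather than XSLT or C, purely to
show the shape of it.  It assumes the flat approach (chunks are the
children of the root element), that every target carries an id
attribute, and the sample document and the name build_table are
invented for the example.

```python
import xml.etree.ElementTree as ET

# A made-up document: an article chunked at the sect1 level.
SAMPLE_DOC = """\
<article id="index">
  <sect1 id="section3">
    <sect2 id="section35">
      <sect3 id="section357"/>
    </sect2>
  </sect1>
  <sect1 id="section4"/>
</article>
"""


def build_table(doc_xml):
    """Map every id nested inside a chunk to (chunk id, that id)."""
    table = {}
    root = ET.fromstring(doc_xml)
    for chunk in root:  # one level below the root element
        chunk_id = chunk.get("id")
        if chunk_id is None:
            continue
        for node in chunk.iter():  # the chunk itself and all descendants
            node_id = node.get("id")
            if node is not chunk and node_id is not None:
                table[node_id] = (chunk_id, node_id)
    return table


for source, (chunk, fragment) in sorted(build_table(SAMPLE_DOC).items()):
    print(source, "->", chunk, "#", fragment)
```

The chunk ids themselves are left out of the table, since a request
for 'section3' already names a chunk and needs no translation.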

How easy this is to implement depends on how well gtkhtml supports it. 
I don't really know gtkhtml very well, so it would take me a little
while even to know if I can implement this.  On the other hand, there
were three others interested in hacking on Yelp.  So if anybody is just
looking for something to do, this would be good.

For those interested, this is an important feature regardless of how we
end up chunking the documents.  See, for instance, my rather half-assed
solution to bug #87595.

--
Shaun



