Re: [Scrollkeeper-devel] structure of extracted index page



hi Daniel

I am somewhat confused on this issue too. I sent a mail earlier today, which you 
may not have received, that relates to this (see attached).

The main point is the problem of having scrollkeeper generate ids relating to 
the original document. Specifically:
From what I understand, the convertor generates ids at run time where required 
(using generate-id() ).  However, I don't believe scrollkeeper 
can predict what these ids will be because:
1) scrollkeeper may use a different convertor from the 'run-time' convertor.
2) even if the same convertor is used, the w3.org description indicates that 
  "An implementation is under no obligation to generate the
   same identifiers each time a document is transformed."
   see http://www.w3.org/TR/xslt#function-generate-id

Is this a correct assumption, or have I misunderstood this (which is entirely 
possible :-) )?
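
To make the concern concrete, here is a minimal sketch in Python of the kind of
predictable scheme that would avoid relying on generate-id(): each element that
lacks an id gets one derived purely from its position in the document, so any
tool applying the same rule to the same document arrives at the same anchor.
The function name, the "auto" prefix and the TARGET_TAGS set are assumptions
made for illustration; nothing here is taken from scrollkeeper or
gnome-db2html.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of elements that need anchors; not taken from scrollkeeper.
TARGET_TAGS = {"sect1", "sect2", "indexterm"}


def assign_positional_ids(root, prefix="auto"):
    """Give each target element lacking an id one derived from its position.

    The id encodes the 1-based child index at every level from the root,
    so it depends only on the document structure, not on the processor.
    Note: this sketch does not check for clashes with author-supplied ids.
    """

    def walk(elem, path):
        if elem.tag in TARGET_TAGS and "id" not in elem.attrib:
            elem.set("id", prefix + "-" + "-".join(map(str, path)))
        for index, child in enumerate(elem, start=1):
            walk(child, path + (index,))

    walk(root, (1,))
    return root


DOC = """<article>
  <sect1 id="intro"><title>Intro</title></sect1>
  <sect1><title>Usage</title>
    <sect2><title>Details</title></sect2>
  </sect1>
</article>"""


def ids_for(source):
    """Parse the document independently and list the ids of target elements."""
    root = assign_positional_ids(ET.fromstring(source))
    return [e.get("id") for e in root.iter() if e.tag in TARGET_TAGS]


# Two independent runs stand in for scrollkeeper and the run-time convertor.
first = ids_for(DOC)
second = ids_for(DOC)
print(first)   # ['intro', 'auto-1-2', 'auto-1-2-2']
print(first == second)
```

Such a scheme only helps if both sides implement exactly the same rule and see
the same document, which is the same coordination problem raised in the
attached mail.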

cheers
Mary

> Date: Thu, 26 Apr 2001 08:32:35 -0400
> From: Daniel Veillard <veillard redhat com>
> To: László Kovács <laszlo kovacs Sun COM>
> Cc: veillard redhat com, Dan Mueth <dan eazel com>, Mary Dwyer <Mary Dwyer Sun COM>,
>     scrollkeeper-devel lists sourceforge net, gnome-doc-list gnome org
> Subject: Re: [Scrollkeeper-devel] structure of extracted index page
> 
> On Thu, Apr 26, 2001 at 12:13:25PM +0100, László Kovács wrote:
> > > > Yes, sections with no IDs are ignored  during TOC extraction.
> > > 
> > >   Hum, this should probably be improved ...
> 
>   That is sure.
> 
> > > C.f. my other mail for a possible technical solution.
> 
>   Now, whether XPointer can help w.r.t. addressability really
> depends on what needs to be addressed.
> 
> > Are you talking about the XPointer email? I don't understand how that
> > helps us here. Our problem is that if a section does not have a unique
> > id then Scrollkeeper and gnome-db2html[2|3] can jump there only if they
> > generate an id to this section which is the same in both Scrollkeeper
> > and the convertor.
> 
>   What do you mean by "jump there"?
>     - if "there" is an XML document, then I assume the document is
>       handled by libxml and hence XPointer can be used
>     - if "there" is an HTML document, then XSLT has a 
>       generate-id() function which can be used to generate a unique ID
>       for this element, and pointing is also possible using the existing
>       #name framework.
> 
> Did I miss something?
> 
> Daniel
> 
> -- 
> Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
> veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
> http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

~ I speak for myself, not for my employer ~
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Mary Dwyer
Desktop Applications & Middleware Grp
Sun Microsystems Ireland
Tel: +353-1-8199222 (xt 19222)
Fax: +353-1-8199078
email: mary dwyer ireland sun com
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
--- Begin Message ---
>From Mary Dwyer sun com  Thu Apr 26 10:20:11 2001
Message-Id: <200104260919 KAA11847 ireserver Ireland Sun COM>
From: Mary Dwyer <Mary Dwyer sun com>
Reply-To: Mary Dwyer <Mary Dwyer sun com>
Subject: Re: [Scrollkeeper-devel] structure of extracted index page
To: Mary Dwyer sun com, dan eazel com
Cc: scrollkeeper-devel lists sourceforge net
Date: Thu, 26 Apr 2001 10:16:20 +0100 (BST)

Hi Dan

Yes, it appears ids are optional for indexterms (and not used for "see" and "see 
also").
From what I understand, the convertor generates ids at run time where required 
(using generate-id() ).  However, as you point out, I don't believe scrollkeeper 
can predict what these ids will be because:
1) scrollkeeper may use a different convertor from the 'run-time' convertor.
2) even if the same convertor is used, the w3.org description indicates that 
  "An implementation is under no obligation to generate the
   same identifiers each time a document is transformed."
   see http://www.w3.org/TR/xslt#function-generate-id

Am I correct in these assumptions?


Your point about generating ids for use within the index doc (for "see" and "see 
also") makes sense and I will incorporate this.
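
As a rough illustration of how this might look, the Python sketch below builds
an <indexdoc> from a few indexterms and gives every entry an id that is local
to the index doc, so that a "see" or "see also" entry can point at another
entry rather than at an anchor in the source document. The "id", "see" and
"seealso" attributes on <indexentry>, the "ie-N" naming, the matching of a
reference against the target's <primary> text and the "Pippin" test entry are
all assumptions made for the example; only <indexdoc> and
<indexentry linkid="..."> come from the proposal quoted below.

```python
import xml.etree.ElementTree as ET

SOURCE = """<chapter>
  <indexterm id="idx-a1">
    <primary>Apple</primary><secondary>Big</secondary><tertiary>Green</tertiary>
  </indexterm>
  <indexterm zone="a1"><primary>Orange</primary><secondary>Medium</secondary></indexterm>
  <indexterm><primary>Pippin</primary><see>Apple</see></indexterm>
</chapter>"""


def term_text(term):
    """Join primary/secondary/tertiary into 'Apple, Big, Green' style text."""
    parts = [term.findtext(tag) for tag in ("primary", "secondary", "tertiary")]
    return ", ".join(p.strip() for p in parts if p)


def build_indexdoc(root):
    indexdoc = ET.Element("indexdoc")
    by_primary = {}  # primary text -> generated entry id inside the index doc
    pending = []     # (entry, "see" or "seealso", referenced primary text)
    counter = 0
    for term in root.iter("indexterm"):
        if term.get("startref"):  # end-of-range marker has no entry of its own
            continue
        counter += 1
        entry = ET.SubElement(indexdoc, "indexentry")
        entry.set("id", "ie-%d" % counter)  # hypothetical id, local to the index
        entry.text = term_text(term)
        target = term.get("id") or term.get("zone")
        if target:
            entry.set("linkid", target)
            primary = (term.findtext("primary") or "").strip()
            by_primary.setdefault(primary, entry.get("id"))
        for ref_tag in ("see", "seealso"):
            ref = term.findtext(ref_tag)
            if ref:
                pending.append((entry, ref_tag, ref.strip()))
    # Resolve references only after every entry has received its local id.
    for entry, ref_tag, ref in pending:
        if ref in by_primary:
            entry.set(ref_tag, by_primary[ref])
    return indexdoc


indexdoc = build_indexdoc(ET.fromstring(SOURCE))
print(ET.tostring(indexdoc, encoding="unicode"))
```

Because these ids live only inside the generated index, they never have to
match anything the convertor produces, which sidesteps the generate-id()
problem for this particular case.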


cheers
Mary


> 
> My understanding is that id's are optional on all the indexing tags of
> interest here: indexterm, see, seealso (and maybe others...)
> 
> Do you know what collateindex.pl does with indexterms which do not have
> id's?  I tried to make a small test document which generates a nice index
> but wasn't successful within just a few minutes.  Do you have a test
> document handy we can play with?  I am guessing that it can generate an
> index independent of whether the indexterms have id's.
> 
> It is certainly convenient if all the indexterms have id's, as it is
> easier to link to them from the index.  This is very similar to how the
> TOC links to section id's though.  Jade deals with sections without id's
> by assigning id's to them.  It can do this because it is generating both
> the anchors and the links in the same output.
> 
> Suppose we continued to use on-the-fly conversion from SGML to HTML:
> 
> The difficulty with ScrollKeeper creating an index off of an SGML
> document, or even a TOC off an SGML document, which doesn't have id's is
> that the generated index or TOC has to predict the id's (ie. anchors)
> which will be assigned to those sections or indexterms by the converter at
> run time.  If we know how this assignment will be done, we are ok.  
> Otherwise, we must require that all sections and indexterms (and <see> and
> <seealso>) have id's.  This is what we are doing now in the GDP, but is
> not really a great solution since we are making further restrictions on
> top of DocBook.  Thus SK would not work with just any DocBook doc, but a
> certain subset of all DocBook docs.  So the better solution is to come up
> with a scheme which will assign id's in a predictable way.  This method
> would be used by ScrollKeeper during index creation and by
> gnome-db2html2/gnome-db2html3 during display.  The downside to this is
> that ScrollKeeper would need to know in advance which display system will
> be used.  So long as GNOME and KDE follow the system used by
> collateindex.pl, we should not have any problems.
> 
> Laszlo - Does this all sound correct based on your experience?  How do you
> handle sections without id's in ScrollKeeper's TOC extraction?  Do you
> ignore those sections or id them in the same way as db2html?
> 
> The other possibility is that instead of trying to refer to an anchor in
> the generated HTML, we try to refer to the position in the XML document.  
> I really don't know how this would work exactly, since I am not very
> familiar with libxml, but it may be possible.  (DV?)
> 
> Dan
> 
> 
> 
> On Wed, 25 Apr 2001, Mary Dwyer wrote:
> 
> > hi
> > 
> > I'd appreciate some feedback/suggestions on the structure of the index 
> > scrollkeeper will create from a document.
> > 
> > To aid explanation, consider a document including the following index 
> > markup:
> > 
> > <indexterm id="idx-a1">
> >   <primary>Apple</primary><secondary>Big</secondary><tertiary>Green</tertiary>
> > </indexterm>
> > 
> > <indexterm zone="a1"><primary>Orange</primary><secondary>Medium</secondary>
> > </indexterm>
> > 
> > <indexterm id="idx-a2" class="startofrange">
> >   <primary>Banana</primary><secondary>Small</secondary>
> > </indexterm>
> > <indexterm startref="idx-a2" class="endofrange"/>
> > 
> > 
> > 
> > 
> > The example below is an excerpt from the extracted index.
> >  
> > 1. The tags <indexdoc> </indexdoc> indicate the beginning and end of the document.
> > 2. Each index entry is indicated by the tags <indexentry linkid="id"> </indexentry>.
> > 
> > I do not know how to handle See and See Also references (as they are not 
> > associated with an id) - any suggestions?
> > 
> > 
> > Example:
> > 
> > <indexdoc>
> >    <indexentry linkid="idx-a1">Apple, Big, Green
> >    </indexentry>
> >    <indexentry linkid="idx-a2">Banana, Small
> >    </indexentry>
> >    <indexentry linkid="a1">Orange, Medium
> >    </indexentry>
> >       
> >    etc. .......
> >    
> > </indexdoc>   
> > 
> > 
> > 
> > TIA
> > Mary
> > 
> > 
> > 
> > 
> > 
> > 
> > _______________________________________________
> > Scrollkeeper-devel mailing list
> > Scrollkeeper-devel lists sourceforge net
> > http://lists.sourceforge.net/lists/listinfo/scrollkeeper-devel
> > 
> 
> 



--- End Message ---

