Re: Open Document Environment (ODE)




Hi Kim & world,

Well, here are my thoughts in no particular order:

0) I think a global mailing list for those projects interested in this is
in order.  I know I'll subscribe.  I'm sure I can get Roger from SEUL to
sponsor the list on his mail server unless someone feels strongly another
way.  odo@seul.org perhaps?  Or lodo for Linux Open Doc Org.?  

1) Having one specific format (Docbook for example) is great from an
indexing and searching perspective.  As the person in charge of the
LKBP's search engine, I've given a lot of thought to how I can make the
searches more meaningful to the user and more accurate.  One of the
biggest improvements in this area comes when you have a standard format
such as Docbook and then give your search engine the intelligence to
understand it.  It's also a lot of work :)
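
To make "intelligence" a bit more concrete, here is a minimal sketch of
structure-aware scoring.  It is not the LKBP's code.  It assumes DocBook in
XML form with no namespace, and the weights and the function name are made
up for illustration: a hit inside a <title> simply counts for more than a
hit in body text.

```python
import xml.etree.ElementTree as ET

# Assumed weights, chosen only for illustration.
TITLE_WEIGHT = 5
BODY_WEIGHT = 1


def score(docbook_xml, term):
    """Score a document for one search term, weighting <title> hits higher.

    Only the text directly inside each element (elem.text) is counted;
    text that follows a child element (elem.tail) is ignored here.
    """
    root = ET.fromstring(docbook_xml)
    term = term.lower()
    total = 0
    for elem in root.iter():
        text = (elem.text or "").lower()
        weight = TITLE_WEIGHT if elem.tag == "title" else BODY_WEIGHT
        total += weight * text.count(term)
    return total


SAMPLE = (
    "<article><title>Moving a filesystem</title>"
    "<para>Copy the filesystem, then update fstab.</para></article>"
)
```

With this sample, a term that appears in the title outranks one that only
appears in a paragraph, which is the whole point of teaching the engine
about the markup.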

2) Having one specific format is less than ideal from an author's
perspective.  Telling someone how to move a filesystem from one
partition to another takes less than a page, and forcing them to use a
relatively complicated markup such as Docbook is a good way to get the
author to decide it's not worth the effort.  And of course there are a lot
of people who don't know things like Docbook and they don't really have
the time.  There are ways to get around this, such as having people
manually convert documents for the original author.

As it stands now, the LKBP is more of a generalized search engine- that is
to say, we don't care what the format of the document is, as long as it
can be represented on a web page.

Also, if you're storing the content in a (SQL?) database it may be
preferable to use the fields of the database for structure rather than
using something like docbook for specifying the Title, Author, Date, etc
info as there are very few XML aware DB's out there.
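
As a rough sketch of what I mean, using SQLite through Python's sqlite3
module purely as a stand-in for whatever SQL server gets picked: the table
and column names are my own invention, not an existing schema.  The
metadata lives in ordinary columns and the body is stored as-is, in
whatever format the author wrote it.

```python
import sqlite3


def create_store(conn):
    """Create a hypothetical documents table; metadata lives in columns."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS documents (
            id      INTEGER PRIMARY KEY,
            title   TEXT NOT NULL,
            author  TEXT,
            written TEXT,          -- date as text, e.g. YYYY-MM-DD
            body    TEXT NOT NULL  -- any format the author used
        )
        """
    )


def add_document(conn, title, author, written, body):
    """Insert one document and return its row id."""
    cur = conn.execute(
        "INSERT INTO documents (title, author, written, body) "
        "VALUES (?, ?, ?, ?)",
        (title, author, written, body),
    )
    return cur.lastrowid


def titles_by_author(conn, author):
    """Look up titles by author using a plain column query, no XML parsing."""
    rows = conn.execute(
        "SELECT title FROM documents WHERE author = ? ORDER BY title",
        (author,),
    )
    return [row[0] for row in rows]
```

The trade-off is that the database knows nothing about structure inside
the body, so anything finer than Title/Author/Date still needs markup.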

Basically, we need to think this out really really well.

3) Licenses are a kicker.  The LKBP has a license based on the Netscape
Open Directory License.  I think a license such as this is more applicable
to such a project, since it's the categorization of the content that makes
it valuable.  There are tens of thousands of docs for Linux around the web;
the reason they're not of much use is that people can't find them.
Generally allowing a number of pre-approved licenses for the actual
content should be good enough.  Do we really want to read each person's
license (which may or may not be a well written license in the first
place) to make sure it meets our requirements? Not me.  That and license
creep (the creation of more and more licenses) is generally a bad thing.

4) Creating a new umbrella group is good.

5) Keeping a Linux slant on things IMHO is a good thing.  I assume that if
the FreeBSD users want something like the LKBP or what you suggest they
can download the source code to the site, download the database, weed out
what they don't want and add in FreeBSD specific stuff.  This makes things
more meaningful for the end user.

6) The Bookshelf analogy has some merit, but I feel (at least right now,
subject to change without notice) that the basic categorization ala Yahoo
is the right way to go.  Yahoo has been successful for a number of reasons,
one of which is that it's simple to find what you're looking for.
Complicated analogies which have color coded sections and things like that
are confusing, and finding a solution to a problem should be simple and
streamlined, almost wizard-like.  Not to say a more powerful/advanced
search system isn't useful, but you need to provide a system that makes it
1-2-3 easy for "newbies".  Or maybe I'm the only person who can never find
what I'm looking for at the library. :)

7a) Local/LAN/WAN retrieval is going to be hard.  To build a scalable
infrastructure like this is a huge effort.  I've talked to people about
things like this before, and looked at all sorts of ideas for doing it
(everything from doing it via DNS, LDAP, and a cluster of SQL servers).  
It also makes version control, re-categorization, and data replication
many times more difficult since the content is distributed.  Realize
that 99.99% of the people will never install it locally and probably about
98% will not install it on their LAN.  So why increase the difficulty of
the project many times over for about 2% of the users?  I think there are
a number of more useful features that will be easier to implement and will
help more people.  Keeping all the data on the server makes life much
easier. 

7b) One thing I do think is really important though is allowing software
developers to write programs that query such a site.  Either general help
systems or application help systems.  This is pretty easy actually- you
write an open API running over HTTP that queries the site and leave the
formatting up to the app.  This way people can continue to distribute
documents with software, but the user isn't limited to accessing only
those documents.  This is one of the projects for the LKBP, but our
developer on this has had to stop development for personal reasons.
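
Since that code never got finished, here is only a sketch of the client
side, under assumptions of my own: the endpoint URL, the "q" and "cat"
parameter names, and the one-result-per-line "title<TAB>url" response
format are all hypothetical.  The point is that the server returns bare
data and the application decides how to display it.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; no such service is defined anywhere yet.
BASE_URL = "http://example.org/api/search"


def build_query_url(terms, category=None):
    """Build the GET URL for a search, optionally limited to a category."""
    params = {"q": terms}
    if category:
        params["cat"] = category
    return BASE_URL + "?" + urlencode(params)


def parse_results(text):
    """Parse the assumed 'title<TAB>url' lines into a list of dicts."""
    results = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        title, url = line.split("\t", 1)
        results.append({"title": title, "url": url})
    return results


def search(terms, category=None):
    """Query the site over HTTP and return parsed results (needs network)."""
    with urlopen(build_query_url(terms, category)) as resp:
        return parse_results(resp.read().decode("utf-8"))
```

A help system would call search() and render the titles however it likes,
whether that is a GTK list, a curses menu, or plain text.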

8) Knowing what software is installed on a user's computer and then showing
what documentation relates to that is rather interesting. Very interesting
actually.  It also has a ton of privacy vs. usability issues. Things
like RPM make this a real possibility though.  One of the things you could
do is create user accounts on the server, and allow people to upload their
RPM databases, then do an xref search based on that.  At least that way you
can allow people to opt in to something like this.  We could also provide
summary usage stats back to the authors/interested persons for each piece
of software- again, opt-in of course.  It would be very cool to pull this
off though.
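
A minimal sketch of the cross-reference step, with some simplifications of
mine: instead of the whole RPM database the user uploads a list of package
names, one per line (for example the output of
rpm -qa --queryformat '%{NAME}\n'), and the server keeps a hypothetical
index mapping package names to document titles.

```python
def parse_package_list(text):
    """Turn uploaded text (one package name per line) into a set of names."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def docs_for_packages(installed, doc_index):
    """Return {package: [doc titles]} for installed packages that have docs.

    doc_index is a hypothetical mapping of package name -> list of titles.
    """
    matches = {}
    for pkg in sorted(installed):
        if pkg in doc_index:
            matches[pkg] = doc_index[pkg]
    return matches
```

Sending only package names rather than the full database also helps a bit
on the privacy side, since less information leaves the user's machine.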

9) Allowing the same document to live in two or more categories is key as
what makes sense to you or me might be different from someone else.
Each document should live wherever it makes sense.  One thing you have to
do though is make sure if someone does a search from a point in the tree
where the same document is multiple places below it, you *MUST* weed out
duplicates.
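
Roughly like this, as a sketch only.  The data layout is assumed: one dict
maps each category to its child categories, another maps each category to
the (doc_id, title) pairs filed there.  The search walks everything below
the starting category and remembers which document ids it has already
returned.

```python
def search_subtree(children, docs_in, start, term):
    """Search titles in 'start' and every category below it, without dupes.

    children: category -> list of child categories (assumed layout)
    docs_in:  category -> list of (doc_id, title) filed in that category
    """
    term = term.lower()
    seen = set()      # doc ids already returned
    results = []
    stack = [start]   # categories still to visit
    while stack:
        cat = stack.pop()
        for doc_id, title in docs_in.get(cat, []):
            if term in title.lower() and doc_id not in seen:
                seen.add(doc_id)
                results.append(doc_id)
        stack.extend(children.get(cat, []))
    return results
```

A document filed under both "Networking" and "Administration" then shows
up once when you search from their common parent.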

10) I haven't ever seen SGI's bookshelf thingy or heard anything about it.
Maybe you could take a few screenshots for those like me who have no
access to SGI/Irix??  Maybe after I see it I'll sing a different tune.

11) Having a common look and feel for all documents is good, but it really
depends on what you want to do though.  People who spend 10 minutes to
write a simple FAQ aren't going to concern themselves too much with look
and feel since it's not that important for a one page explanation.
Sometimes standardization is taken too far at the cost of ease of use.
Also do you not want to accept documentation just because it's not in the
"approved format/structure"?  man pages come to mind as one example.

12) It has to be scalable.  No point in doing this if you can't scale the
system/backend to 100,000+ documents of 1-1,000K in size.  Simple static
indexed sites such as the LDP or the Red Hat Linux Users FAQ
(http://www.pobox.com/~aturner/RedHat-FAQ/) won't cut it.  It's gotta have
a database for the backend, preferably SQL or possibly LDAP.

13) Should have an internal means of keeping revision history.  Not
required, but nice.

14) Document Browser should be a web browser initially. Once you start
requiring end users to install special software that has other
requirements (such as a viewer using the GTK or QT libraries), your
potential userbase shrinks.  Should be Lynx friendly.

15) I think overall this is a great idea.  I've thought about this for
some time myself, but have been so distracted in trying to get the LKBP
out the door in at least an alpha state that I haven't had the time to be
an ambassador to create the interest in something of this magnitude.
Kinda the "If you build it they will come." mentality.  I'm more than
happy to jump on the bandwagon though.

Anyways, those are my ideas- sorry if they're schizophrenic, but I've had
to write this email in 3 different sessions over a 24 hour period.  

--
Aaron Turner, Core Developer       http://vodka.linuxkb.org/~aturner/
Linux Knowledge Base Organization  http://linuxkb.org/
Because world domination requires quality open documentation.
aka: aturner@vicinity.com, aturner@pobox.com, ion_beam_head@ashtech.net
The difference between `Unstable' and `Usable' is only two characters: NT



