Re: Imp/Exp filters and BFD: was (Decoupling Gnome Office)



On Sat, Sep 02, 2000 at 01:51:33AM +1100, David T. Bath wrote:
> Perhaps the most difficult part of developing OSS
> office suites for standard hackers is the ability to
> deal with foreign formats, such as MS-Word or WordPerfect.
It is certainly nontrivial.

> The StarOffice guys obviously did a brilliant job of
> grokking the format, and perhaps only companies with the
> resources of StarOffice or Sun can keep track of what
> the evil empire is doing with file formats.
A couple of dedicated people with lots of test cases can do it.
Gnumeric has made a pretty reasonable attempt at this for MS XL
files.  We've had to do a lot of guessing about things that I'll
charitably refer to as 'underdocumented', but it is feasible.

> What is the possibility of the Import/Export filters
> from StarOffice being excised as separate executables
> or libraries that can be called by WHATEVER program
> wants to read/write them.
Like many projects, this works nicely for simple things and gets
exponentially more difficult as you attempt to produce something
that will be fully compatible.  The main difficulties are:
    1) The file format implies a certain implementation/perception
       of what features and capabilities are available to the display
       engine.
    2) MS Office files embed sub-streams of other components.

Simple examples of (1) are 'array formulas' and 'natural language
expressions' in MS XL.  Both are somewhat obscure features that are
definitely used by power users but are not available in other
spreadsheets.  Another instance would be XL's handling of default
row format.  Importing it correctly requires the application to
behave in a very specific way that is not always obvious.
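
To make the array formula case concrete, here is a rough Python
sketch.  It is not Gnumeric or XL code, and the function name is made
up for illustration.  It only shows the kind of evaluation an XL array
formula such as {=SUM(A1:A3*B1:B3)} expects from the engine: the two
ranges are multiplied element by element into an intermediate array,
which is then summed.  An engine that only evaluates scalar
expressions has nowhere to put that intermediate result, so a filter
alone cannot supply the feature.

```python
# Illustrative sketch only; the name below is hypothetical and does not
# correspond to any Gnumeric or XL function.

def array_formula_sum_product(range_a, range_b):
    """Evaluate something like {=SUM(A1:A3*B1:B3)}.

    The ranges are multiplied pairwise into an intermediate array,
    and that array is then reduced with SUM.
    """
    if len(range_a) != len(range_b):
        raise ValueError("ranges must have the same shape")
    # The intermediate array that a scalar-only engine cannot represent.
    products = [a * b for a, b in zip(range_a, range_b)]
    return sum(products)


col_a = [1, 2, 3]   # stands in for A1:A3
col_b = [4, 5, 6]   # stands in for B1:B3
print(array_formula_sum_product(col_a, col_b))
```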

The main issue with (2) is the so-called 'escher' streams.  These
are the minimally documented file segments that handle drawing and
placing objects.  Every image, line segment, and chart background
is stored using this format.  It is certainly not something that
belongs in a spreadsheet directly.  The project would require a full
'office file format' DTD.

Our experience with Gnumeric has been that it takes more time to
implement the features than to import them, and that export is a
much larger problem.

Such a project would certainly be interesting, but seems doomed to
be too heavyweight for simple usage, and too simple for real MS
Office compatibility.




