Re: [guppi-list] Re: use of GSL in guppi

On Tue, Sep 15, 1998 at 02:24:46PM +0200, Asger K. Alstrup Nielsen wrote:
> I don't know if this discussion is off topic.  If you feel it is,
> please tell me, and we'll take it to private mail.

It doesn't seem off-topic to me.  It is worthwhile to think about how
we can further the general goal of better free scientific software, as
well as the specific goal of improving Goose.

> > Yes and no. But if there is no direction we all walk to, we never meet
> > in the end. The problem I found with the existing implementations of
> > mathematical libraries is, that everybody introduced a new set of basic
> > datastructures and every library dictates a different programming
> > philosophy to the user. You will seldom mix different libraries.
> I agree that there is a need for a standard.  I argue that it is hard
> to obtain such using the monolithic approach unless you have many
> resources.
> > My goal is to provide a home for scientific related libraries and
> > applications. I want to give a set of design patterns so that we are
> > able to use all the different libraries, data formats and whatever in the
> > long run. And I want to provide a set of basic structures and
> > functions used in every application. But these functions are almost
> > there (in glib, gtk+, ...) 

I don't think that this is really possible right now, because there
is a very good reason why the scientific software world is
so fragmented.  And it is *very* fragmented, despite a tradition of
"open source" and redistributed code in scientific and numerical
computing that goes way back before Stallman and the FSF.

The dilemma is that the needs of the scientific computing community
are too diverse to be well-served by a monolithic, one-size-fits-all
library.  Scientific computing is very performance-intensive, and
solving large, difficult numerical problems requires special
optimizations and different techniques.  Writing a
nice matrix library isn't too hard.  Writing one that is useful both
to someone who needs to invert a 5x5 matrix for solving a small
polynomial regression (which is about the extent of Goose's uses of
linear algebra right now) and to someone who needs to find eigenvalues
for ten thousand different 6000x6000 sparse matrices of a certain
class seems pretty much impossible.
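
To make the small end of that range concrete, here is a minimal sketch
of a polynomial fit through the normal equations, solved with textbook
Gaussian elimination.  It is not Goose's code: solve_dense and polyfit
are hypothetical names made up for this illustration, and a dense
vector-of-vectors matrix is assumed.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Goose: solve A x = b for a small dense
// system by Gaussian elimination with partial pivoting.
std::vector<double> solve_dense(std::vector<std::vector<double>> a,
                                std::vector<double> b)
{
    const std::size_t n = b.size();
    for (std::size_t col = 0; col < n; ++col) {
        // Use the row with the largest entry in this column as the pivot.
        std::size_t pivot = col;
        for (std::size_t r = col + 1; r < n; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col]))
                pivot = r;
        if (a[pivot][col] == 0.0)
            throw std::runtime_error("singular matrix");
        std::swap(a[col], a[pivot]);
        std::swap(b[col], b[pivot]);

        // Eliminate the entries below the pivot.
        for (std::size_t r = col + 1; r < n; ++r) {
            const double f = a[r][col] / a[col][col];
            for (std::size_t c = col; c < n; ++c)
                a[r][c] -= f * a[col][c];
            b[r] -= f * b[col];
        }
    }

    // Back substitution on the upper-triangular system.
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0;) {
        double s = b[i];
        for (std::size_t c = i + 1; c < n; ++c)
            s -= a[i][c] * x[c];
        x[i] = s / a[i][i];
    }
    return x;
}

// Hypothetical helper: least-squares polynomial of the given degree through
// the points (xs[i], ys[i]), via the normal equations (X^T X) beta = X^T y.
// Returns the coefficients of beta[0] + beta[1]*x + beta[2]*x^2 + ...
std::vector<double> polyfit(const std::vector<double>& xs,
                            const std::vector<double>& ys,
                            std::size_t degree)
{
    const std::size_t n = degree + 1;
    std::vector<std::vector<double>> xtx(n, std::vector<double>(n, 0.0));
    std::vector<double> xty(n, 0.0);
    for (std::size_t i = 0; i < xs.size(); ++i) {
        std::vector<double> pw(n, 1.0);  // powers 1, x, x^2, ...
        for (std::size_t j = 1; j < n; ++j)
            pw[j] = pw[j - 1] * xs[i];
        for (std::size_t r = 0; r < n; ++r) {
            for (std::size_t c = 0; c < n; ++c)
                xtx[r][c] += pw[r] * pw[c];
            xty[r] += pw[r] * ys[i];
        }
    }
    return solve_dense(xtx, xty);
}
```

For a 5x5 system this is perfectly adequate.  By contrast, a single
dense 6000x6000 matrix of 8-byte doubles takes 6000 * 6000 * 8 =
288,000,000 bytes, so neither this storage scheme nor the O(n^3)
elimination carries over to the large sparse case.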

Until either we get a lot smarter (more on this later) or processing
power reaches a point where using the naive, textbook algorithms will
allow you to solve the above-mentioned eigenvalue problem on your
PalmPilot in 3.2 seconds, the prospects for a Grand Unified Scientific
Library look bleak.

The best thing for us to do is to all develop tools that are useful
for ourselves.  If they are also useful to others, other people will
help improve the tools and we might just end up with some really nice
free software.  And we should all certainly be mindful of engaging in
as much code-reuse as possible.

But the fragmentation and the lack of common data structures didn't
come about because people in the scientific community are foolish or
short-sighted.  (Well, maybe they are, but no more so than the rest of
us.)  The problem is just that providing even a "generic" vector and
matrix class that is well-suited for both large and small numerical
problems is *hard*.  And most scientists, I think, would rather spend
time solving their problems than coming up with general solutions to
this problem.

Now I do think that we are making progress when it comes to being
smarter.  A very interesting project that I think shows a lot of
potential is Blitz++.  Generic
programming (a la templates) shows a lot more promise for constructing
the kind of universal data structures you are talking about than other
techniques.  But Blitz++, while very cool, seems to still be very much
in the R&D phase.
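
As a small illustration of the generic-programming idea (and only
that: this is not Blitz++ code, which relies on considerably more
elaborate template techniques), a single function template can stand
in for a family of per-type routines.  The name dot and the use of
std::vector are assumptions made for the sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration, not Blitz++ code: one template covers every
// element type that supports +=, * and value-initialization to zero.
template <typename T>
T dot(const std::vector<T>& a, const std::vector<T>& b)
{
    T sum = T();
    // Only the common prefix is used if the lengths differ.
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

The same source serves int, float and double vectors without a
separately named function for each element type.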

In any case, the data structures in glib are completely unsuitable for
scientific work.  That doesn't mean that they are bad... they were
just designed with other purposes in mind.

I think that a problem that is just as serious for scientific
computing is that much of what is out there is just *ugly*.  The
legacy of Fortran libraries from the sixties still haunts packages
like GSL.  Macro-magic to try to simulate templates.  Function names
like gsl_blas_dtrsv and gsl_blas_ctbmv.

One of my main design goals in Goose was to create something that was
simple and had a clean interface, and that spared you from worrying
about certain details.  (Hence the two copies of the data, one in
"natural order" and one sorted, stored in the DataSet.  You shouldn't
need to worry about whether your data is sorted or not.  Likewise, the
library should help you out by minimizing the number of sorts that you
have to do....)   Because of this, Goose is precisely *not* a
universal solution.  It seeks to serve a particular "market segment"
very well, rather than to try to work for both me and the Census

