Our (real) problems



Reading the flame-wars and discussions currently raging on the gnome
lists makes me sad. Disagreement on technical decisions always exists
in projects, but the level of the discussion is way too low and the
topics are not what I perceive as the actually important ones.

Take for instance the discussion on whether gnome-libs should use the
GConf API or the Bonobo-conf API to access the configuration
database. The arguments are heated on both sides, but let me clue you
in on a little secret. It doesn't really matter!

The whole problem is in a category I would like to call "convenience
problems". If I were to flip a coin today and decide on GConf or
Bonobo-conf the only actual effect of the decision would be some
convenience/inconvenience for the current and future maintainers of
gnome-libs (depending on which API the maintainer favors). The fact
that gnome-libs uses either GConf or Bonobo-conf will not make our
users and app developers laugh at Gnome or leave Gnome in disgust.

I spend my time these days doing mostly two things. I try to take the
current stable gnome release and turn it into a stable production
quality desktop for the next version of Red Hat, and I work on the
unstable version of Gtk+ which is gonna be the base for the gnome 2
release. This gives me a certain insight into the problems the gnome
development platform currently has.

Here is a list of problems that I think are serious enough to make our
users fall on their ass laughing if we do not fix them. I don't think
any of them are unsolvable, but I see far too little work being done on
them, and far too much time spent on "interesting" peripheral projects.

* Gtk+ 2.0 is way too late

We have time and time again promised that Gtk+ will be finished soon,
and each time we fail to deliver. We didn't even make our last API
freeze, instead having an outstanding list of API bugs that just
recently had more than 30 bugs in it.

Gtk+ 2 is a core part of the Gnome 2 platform, and the longer it is
delayed the more people are gonna be disappointed in us and go
somewhere else.

* GObject has severe performance problems

At the core of Gtk+ 2 sits the GObject object/class/type-system. This
is the code that handles the important things needed by everyone such
as types, class derivation, interfaces and signals.

The problem is that GObject is very complex and quite slow. Signal
emission costs are up by a factor of 10 from gtk+ 1.2, and when
profiling test applications I've seen as much as 40% of the time spent
in gtype.c.

This is very bad since every application in the Gnome 2 platform will
use this as the base of everything. Imagine the performance of such a
desktop... 

* Object activation is unstable

OAF / Bonobo Activation has problems with the lifetime of servers. If
oafd starts a server and then dies/quits and then later starts again,
it has forgotten the fact that the server is already running. This
leads to severe problems with servers that are singletons, such as
gconf, and generally leads to wasted resources.

One common example of this is nautilus. Whenever nautilus dies,
due to a crash or due to the user killing it, there are a bunch of
processes left running. Then when nautilus starts again, duplicates of
these are started, creating much havoc. I often get a bunch of
nautilus-news processes running that make the sidebars crash when I
start nautilus.

The first step toward fixing this is to make sure oafd records the
IORs of the servers it starts on stable storage so that when it is
restarted it knows about them. The second step is making oafd so damn
stable that it never crashes. Even this does not help all the way 
though, because there is no good way for a restarting oafd to know if
the servers it has previously started are still alive and
working. This problem sort of ties in with the next issue.
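
The first step could be sketched roughly like this. It is Python
rather than oafd code, and the ServerRegistry class, the JSON file
format and the id/IOR strings are all made up for illustration; the
only point is that whatever the daemon knows about started servers
survives a restart.

```python
import json
import os
import tempfile


class ServerRegistry:
    """Hypothetical stand-in for the table of servers an activation daemon has started."""

    def __init__(self, path):
        self.path = path
        self.servers = {}  # server id -> IOR string
        if os.path.exists(path):
            # A restarted daemon reloads what the previous instance knew.
            with open(path) as f:
                self.servers = json.load(f)

    def register(self, iid, ior):
        self.servers[iid] = ior
        self._save()

    def lookup(self, iid):
        return self.servers.get(iid)

    def _save(self):
        # Write to a temporary file and rename it into place, so a crash
        # in the middle of a write never leaves a truncated registry.
        directory = os.path.dirname(self.path) or "."
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump(self.servers, f)
        os.replace(tmp, self.path)
```

A second ServerRegistry opened on the same path sees the entries the
first one registered. It still has no way of telling whether those
servers are alive, which is the remaining problem described above.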

* Bonobo distributed refcounting does not work

The memory management model of Bonobo objects is the well-known
ref-counting model. To signal that you are accessing an object you
ref() it, and when you're done with it you unref() it. When there are
no more references to an object it gets freed. This model works well
in gtk+, because if you fail to unref() an object this will only
result in a memory leak, and that memory (and other resources such
as open files) will be freed by the operating system when the
application exits (normally or due to a crash).

When you add out-of-process bonobo components to this, however, things
look different. Suddenly the owner of a reference to an object lives
in another process, and when that process terminates due to a crash,
or just forgets to unref() the object, the component process is
suddenly "leaked". I.e. it will never terminate, because it still
believes someone needs it. The operating system can never clean this up.

Even in a perfect world where programs never crash due to bugs this is
a problem. People randomly kill apps, log out using ctrl-alt-backspace
etc. The current system just cannot work.
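
A toy model of the failure, as a sketch only: the Component and
Client classes below are hypothetical and everything runs in one
Python process, whereas the real thing involves separate processes
talking over Corba. It just shows why a reference held by a dead
client is never given back.

```python
class Component:
    """Hypothetical out-of-process component kept alive by a reference count."""

    def __init__(self):
        self.refcount = 0
        self.running = True

    def ref(self):
        self.refcount += 1

    def unref(self):
        self.refcount -= 1
        if self.refcount == 0:
            # Last reference dropped: the component process can exit.
            self.running = False


class Client:
    """Hypothetical client process that holds one reference to a component."""

    def __init__(self, component):
        self.component = component
        component.ref()

    def quit_cleanly(self):
        self.component.unref()

    def crash(self):
        # A crashed or killed process runs no cleanup code, so the
        # unref() never reaches the component.
        pass


well_behaved = Component()
Client(well_behaved).quit_cleanly()

leaked = Component()
Client(leaked).crash()

print(well_behaved.running)  # False
print(leaked.running)        # True: still waiting for an unref() that never comes
```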

The current "solution" for this is that each application ships with a
gruesome hack of a script that you have to run now and then to clean
up (examples: killev, nautilus-clean.sh and oaf-slay). This sort of
works right now, because you typically only run one major application
using bonobo (nautilus or evolution), but if you were to run a desktop
that used bonobo pervasively this would lead to chaos. When
gnome-calculator crashes you have to run calc-slay, thereby hosing
your entire system. This gets even worse due to the previous issue,
since this normally includes killing oafd, therefore fucking up all
singleton servers running (including gconf). 

Darin did some early work on using leases instead of ref-counting,
which may be one solution to this problem. But no work has been done
on this since then.
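
For readers who have not seen leases before, here is a generic sketch
of the idea. It is not Darin's design (which is not described here);
the LeasedComponent class and its method names are invented, and time
is passed in explicitly to keep the example deterministic. The point
is that a client has to keep renewing its claim, so a client that
dies simply stops counting after a while.

```python
class LeasedComponent:
    """Hypothetical component kept alive by time-limited leases."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.expiry = {}  # client id -> time at which its lease runs out

    def renew(self, client_id, now):
        # Clients must call this periodically for as long as they
        # need the component.
        self.expiry[client_id] = now + self.lease_seconds

    def release(self, client_id):
        # A polite client can still give up its lease early.
        self.expiry.pop(client_id, None)

    def should_exit(self, now):
        # Forget clients whose leases have run out, then exit if
        # nobody is left.
        self.expiry = {c: t for c, t in self.expiry.items() if t > now}
        return not self.expiry


component = LeasedComponent(lease_seconds=30)
component.renew("client-a", now=0)
component.renew("client-b", now=0)

component.renew("client-a", now=20)   # client-a is alive and renews
# client-b has crashed and never renews again

print(component.should_exit(now=25))  # False: both leases still valid
print(component.should_exit(now=40))  # False: client-a is valid until 50
component.release("client-a")
print(component.should_exit(now=41))  # True: nobody left
```

The cost of this approach is the periodic renewal traffic, and a
component may linger for up to one lease period after its last client
has died.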

* Bonobo UI handler is slow and uses much memory

The bonobo ui handler works by sending xml data over Corba to
specify the application window ui (menus, toolbars etc). This is a
very flexible and easy way to handle user interface creation and ui
merging. Unfortunately it does currently have a performance
problem. The result of this is that opening new windows and switching
"components" (e.g. in evolution) is slow, which gives a very bad
performance impression. When people do something they need to get
instant response, or they will feel that the application is slow and
get irritated.
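
To make "ui merging" concrete, here is a small sketch of merging a
component's menu description into a container's. The element and
attribute names are invented for the example and are not the actual
Bonobo UI XML format, and everything happens in one process instead
of being sent over Corba.

```python
import xml.etree.ElementTree as ET


def merge_ui(base, extra):
    """Merge the children of `extra` into `base`, matching elements by tag and name."""
    for child in extra:
        match = None
        for candidate in base:
            if candidate.tag == child.tag and candidate.get("name") == child.get("name"):
                match = candidate
                break
        if match is None:
            # Nothing with this tag and name yet: add the whole subtree.
            base.append(child)
        else:
            # Same menu exists already: merge its contents recursively.
            merge_ui(match, child)
    return base


container = ET.fromstring(
    '<ui><menu name="File"><item name="Quit"/></menu></ui>'
)
component = ET.fromstring(
    '<ui><menu name="File"><item name="Print"/></menu>'
    '<menu name="Edit"><item name="Copy"/></menu></ui>'
)

merged = merge_ui(container, component)
print(ET.tostring(merged, encoding="unicode"))
```

The merged tree has a File menu containing both Quit and Print, plus
the component's Edit menu. Every such merge means parsing and walking
XML, which hints at where the time can go when it happens on each
window open or component switch.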

As an example of this we tested GMC on a Pentium II 300 MHz machine vs
Nautilus on a dual Pentium III 1 GHz machine. Opening a new window in
Nautilus was pretty fast, but opening a new window in GMC was
*instant*. The comparison is not entirely fair, and clearly not all of
the blame lies with the UI handler, but it demonstrates that this is an
important issue for end users, and may make all our apps look slow if
not fixed.

Some work has been done on this recently by Michael Meeks and me, but
I still think there is lots of work to be done here.

* Too many libraries

Our current development platform really has too many
libraries. Applications on the higher levels in the dependency chain
drag in a ridiculous number of libraries. This causes performance
problems with symbol lookup and increases startup time. Part of the
startup time can be fixed by using ELF prelinking, but part of it is
inevitable.

Jakub Jelinek, one of the Red Hat toolchain people and an ELF guru,
said this:
> > > So how long will it take till an average GNOME or KDE program will need half
> > > the number of shared libraries than now?
> > 
> > /me points at gnucash and laughs again.
>
> Oh crap, 59. Shouldn't we start talking about efficiency to those folks? 
> I'll check tomorrow how many shlibs in the distribution are used just once,
> I think it will be more than 50%.
> I mean IHVs are very happy about this, when XYZ app is slow, just buy a faster
> CPU/more memory.
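
The check mentioned in the quote (how many shared libraries are used
by just one program) is easy to sketch. The function name below is
made up, the input is assumed to be a mapping from program to the
libraries it links against (for example collected from ldd output),
and the data is invented, not a measurement of any distribution.

```python
from collections import Counter


def single_use_fraction(deps):
    """Return the fraction of libraries that only one program links against.

    `deps` maps a program name to the list of shared libraries it needs.
    """
    users = Counter()
    for libs in deps.values():
        # set() so a library listed twice for one program counts once.
        users.update(set(libs))
    if not users:
        return 0.0
    once = sum(1 for count in users.values() if count == 1)
    return once / len(users)


# Made-up data, purely for illustration.
example = {
    "app-one": ["libc.so.6", "libfoo.so.1", "libbar.so.2"],
    "app-two": ["libc.so.6", "libbaz.so.0"],
    "app-three": ["libc.so.6", "libbar.so.2"],
}
print(single_use_fraction(example))  # 0.5
```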


End of rant.

Note, if you are currently working on any of these issues, don't see
this as a flame, but rather as constructive criticism of the current
state.

/ Alex





