On GIMP and CORBA



This is a discussion of the current drawbacks of GIMP's extension
system and why and how it should be replaced by a CORBA-based
framework.

A couple of notes about terminology:

When I say "extension" I usually mean a remote process that
communicates with the main app in general. Thus it includes plug-ins.

When I say "PDB" I mean the whole system of Gimp's current extension
mechanism, from the "PDB proper", the table of procedure information in 
the core, to the gimpwire, the pipes via which the communication takes 
place.


PDB

First, a brief description of PDB's workings for those who don't know
it too well:

The basis of PDB is "Procedures". A procedure is an opaque object that 
can be called anywhere from the core process or from extensions, and
the PDB relays the invocation to the correct destination.

A procedure is identified by its name, a string. Its attributes are
its "type", which defines how it is called, a parameter list, a return
value list, some documentation strings, and a couple of gimpy
parameters specifying where and how it can be called.

The type of a procedure specifies whether it's an "internal"
procedure, ie. in the address space of the gimp core, a "plug-in" or
"extension", which are invoked by forking and execing a specific
program and then relaying the call to it, or "temporary" which means
that the procedure call will be relayed to an already existing process.

The parameter and return value types of a procedure can be chosen from
a brief list of types, which include basic integers, strings, arrays
of these, and representations of several of gimp's data types, usually
as simple opaque enumerations. 

These parameters are represented as straightforward type+union
structs. They are passed between processes via the gimpwire, being
marshalled rather straightforwardly into the pipes.
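
To make the "type+union struct" idea concrete, here is a minimal sketch
in C. All the names in it (Arg, ArgType, arg_int32 and so on) are made
up for illustration and are not the actual declarations from the gimp
sources; the real list of types is longer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; not the real PDB declarations. */
typedef enum { ARG_INT32, ARG_STRING, ARG_INT32ARRAY, ARG_IMAGE } ArgType;

typedef struct {
    ArgType type;                 /* tag: says which union member is valid */
    union {
        int         v_int32;
        const char *v_string;
        const int  *v_int32array; /* note: carries no length of its own */
        int         v_image;      /* opaque id standing in for an image */
    } value;
} Arg;

/* Build an integer argument. */
Arg arg_int32(int n)
{
    Arg a;
    a.type = ARG_INT32;
    a.value.v_int32 = n;
    return a;
}

/* Build a string argument; the string is borrowed, not copied. */
Arg arg_string(const char *s)
{
    Arg a;
    a.type = ARG_STRING;
    a.value.v_string = s;
    return a;
}

/* Read an integer argument back; the tag must match. */
int arg_get_int32(const Arg *a)
{
    assert(a != NULL && a->type == ARG_INT32);
    return a->value.v_int32;
}
```

Marshalling such a struct into a pipe then amounts to writing the tag
followed by whatever the tag says the union holds.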

When the PDB receives a procedure call, it looks up the procedure
definition from its internal hash table (which is the PDB proper) and
checks the parameters for correctness.
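
The lookup-and-check step could be sketched roughly like this. The names
(ProcDef, proc_lookup, proc_check_args, the "example_*" procedures) are
hypothetical, and a small linear table stands in for the hash table just
to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; a linear table stands in for the hash table. */
typedef enum { ARG_INT32, ARG_STRING } ArgType;

typedef struct {
    const char    *name;         /* procedures are identified by name */
    int            n_params;
    const ArgType *param_types;  /* declared parameter list */
} ProcDef;

static const ArgType scale_params[] = { ARG_INT32, ARG_INT32 };
static const ArgType load_params[]  = { ARG_STRING };

static const ProcDef proc_table[] = {
    { "example_scale", 2, scale_params },
    { "example_load",  1, load_params  },
};

/* Find a procedure definition by name, or return NULL. */
const ProcDef *proc_lookup(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof proc_table / sizeof proc_table[0]; i++)
        if (strcmp(proc_table[i].name, name) == 0)
            return &proc_table[i];
    return NULL;
}

/* Return 1 if the passed argument types match the definition, else 0. */
int proc_check_args(const ProcDef *def, const ArgType *types, int n)
{
    int i;

    assert(def != NULL);
    if (n != def->n_params)
        return 0;
    for (i = 0; i < n; i++)
        if (types[i] != def->param_types[i])
            return 0;
    return 1;
}
```

Only after both steps succeed would the call be dispatched according to
the procedure's type.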

If the procedure is internal, the parameters are simply passed to the
implementation of the procedure (there is no automated PDB parameter
-> C parameter marshalling like in, say, GTK). Since the core is not
multithreaded (well, actually, nowadays it is, but only to smooth
out tile swapping), everything blocks until the implementation is
finished.

If the procedure is temporary, it has an existing gimpwire connection
associated with it. The PDB parameters are marshalled via the gimpwire 
pipe to the implementing procedure, and the libgimp routines at the
other end relay the call to the appropriate C procedure at that end.

If the procedure is a plug-in or an extension, it has a filename
associated with it. That file is fork+execed, and it receives two
pipes from the parent via which it communicates. The plug-in's libgimp
routines establish a proper gimpprotocol connection over these
pipes. Then the invocation is sent via the newly-established wire, and
again libgimp makes sure the proper C implementation is called. (Well,
not exactly.. although libgimp maintains a name-function hash table
for temp procs, all plug-in and extension procs are relayed to the
same function, usually named "run", and so in an extension
implementing multiple procedures you are likely to see a big set of
name switches in the start... another of libgimp's inconsistencies...)
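
The "big set of name switches" looks roughly like the following sketch.
The procedure names, the Status type and the simplified signature of
run are assumptions made for illustration; a real run function receives
a full parameter list rather than a single int.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes and procedure names, for illustration only. */
typedef enum { STATUS_SUCCESS, STATUS_CALLING_ERROR } Status;

/* Two procedures living in the same extension. */
static int invert_value(int v)    { return 255 - v; }
static int threshold_value(int v) { return v >= 128 ? 255 : 0; }

/* The single entry point: every call arrives here along with the
   procedure name, and the extension has to work out which of its
   procedures was actually meant. */
Status run(const char *name, int value, int *result)
{
    assert(name != NULL && result != NULL);

    if (strcmp(name, "example_invert") == 0)
        *result = invert_value(value);
    else if (strcmp(name, "example_threshold") == 0)
        *result = threshold_value(value);
    else
        return STATUS_CALLING_ERROR;

    return STATUS_SUCCESS;
}
```

Every new procedure adds another branch to the chain, which is exactly
the work a name-function table would do for you.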

Now, when the connection has been established and the remote
implementation is processing a PDB procedure call, it can use the
gimpwire to communicate back. It can make PDB calls back, and also
transfer image data, which is either marshalled via the wire, that is,
the pipes, or, where supported, image tiles can be passed in shared
memory.

The PDB procedure table gets its initial internal procedures when the
gimp app starts up. It gains the initial plug-in and extension
procedures by querying the programs in the plugin path at startup.
Some procedures are marked as start-up extensions, and they are
invoked immediately. The script-fu script server is one of these.


There are several disadvantages to the pdb, most of which stem from
its very design.

Firstly, the actual programming interface is very confusing. Some
examples:

* An implementation always must take an implicit additional parameter
  that comes before those that the caller passed: a "run mode", that
  tells how the procedure was called.

* An exception to the previous: start-up extensions get truly zero
  parameters.

* If an implementation is called "interactively", some of the
  parameters it receives are defined, some are not. Specifically, the
  run mode is always defined, and if the procedure is declared to
  operate upon an image, the two succeeding (compulsory) arguments,
  image and drawable, are also defined. The rest of the parameters,
  though passed, have no meaningful values.

* An array parameter does not hold size information. Instead, the
  preceding parameter is implicitly assumed to be (read: must be) an
  int that holds the size, even if the array is always of the same
  size. And the caller must always pass this additional parameter.

* The implementation must always return one additional return value
  that comes before those that it declared as returning: a "status"
  which indicates whether the procedure was run successfully or not.
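
Put together, these conventions mean that a procedure declared as
"takes an int array, returns its sum" ends up looking something like the
sketch below. The names are hypothetical and the Arg struct is a
cut-down stand-in, not the real libgimp interface.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical, not taken from libgimp. */
typedef enum { RUN_INTERACTIVE, RUN_NONINTERACTIVE } RunMode;
typedef enum { STATUS_SUCCESS, STATUS_CALLING_ERROR } Status;
typedef enum { ARG_INT32, ARG_INT32ARRAY, ARG_STATUS } ArgType;

typedef struct {
    ArgType type;
    union {
        int        v_int32;
        const int *v_int32array;
        Status     v_status;
    } value;
} Arg;

/* Declared as (array) -> (sum), but what actually arrives is
   (run mode, length, array) and what must go back is (status, sum).
   args[0] holds a RunMode value, args[1] the array length. */
void sum_proc(const Arg *args, int n_args, Arg *retvals)
{
    int i, len, sum = 0;

    assert(args != NULL && retvals != NULL);

    /* The implicit status always comes first; assume failure for now. */
    retvals[0].type = ARG_STATUS;
    retvals[0].value.v_status = STATUS_CALLING_ERROR;
    retvals[1].type = ARG_INT32;
    retvals[1].value.v_int32 = 0;

    if (n_args != 3 ||
        args[0].type != ARG_INT32 ||       /* implicit run mode */
        args[1].type != ARG_INT32 ||       /* implicit array length */
        args[2].type != ARG_INT32ARRAY)
        return;

    len = args[1].value.v_int32;
    if (len < 0)
        return;

    for (i = 0; i < len; i++)
        sum += args[2].value.v_int32array[i];

    retvals[0].value.v_status = STATUS_SUCCESS;
    retvals[1].value.v_int32 = sum;
}
```

One declared parameter and one declared return value have turned into
three and two, and nothing in the declaration tells you so.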

Second, the communication protocol itself leaves something to be
desired:

* All calls from the extension are synchronous. That is, it blocks
  until a call returns. This means that a single extension cannot
  serve multiple clients simultaneously. (Which forces me to rely on
  nasty hacks to get sensible startup times for gimple)

Libgimp has lots of deficiencies as well, but I'll not specify them
here, because they can at least be fixed without breaking anything.

In summary, I think PDB has too many faults to go on using it much
further. Some of them could be fixed without breaking the current
interface, no doubt, but that doesn't seem sensible, because there is
demand for a more sophisticated system and there indeed is a better
alternative already existing. (All right, ORBit isn't ready yet, but
it probably will be by the time this project is started..)



CORBA

CORBA is the answer to everything. Really. Upon looking at PDB and
gimpprotocol, I get the feeling that S&P hadn't heard of CORBA when
they designed the new PDB. Or perhaps they had, verily, but there just
weren't free C-based ORBs at the time. :)

Anyway, CORBA provides almost direct equivalents for everything that
PDB provides. Only in a much much more general fashion, and in a
standardized way.

CORBA provides parameter marshalling very similar to that of
gimpprotocol, only it offers many more types. Also, it takes care of
marshalling the parameters all the way to a C function (when the
interface is known at compile time), so it should be easier to program
in (no, really!). CORBA is object-oriented, so it is a better fit for
gimp's new gtk object system based design.

The PDB proper, which holds run-time dynamic information about
procedures, corresponds to the Interface Repository which, well, does
the same thing. Actually, I'm not sure if the IR provides
human-readable documentation about interfaces and methods, but I'm
sure that can be arranged.

CORBA even has facilities for the "run program on demand" style
invocation that is currently used with plug-ins and extensions.

The PDB as it stands currently could be just translated directly into
a "procedure interface", so it is possible to provide compatibility
wrappers for 1.* extensions in the new system. That, I think, won't be
the first priority, though.

The obvious, straightforward way of utilizing CORBA is to export
functions (well, methods, really) in the core to other processes
directly. I'm not really sure about extensions. I think they can be
made into objects which have a "run" method (and also, hopefully, a
"stop" method.. though IIRC CORBA doesn't dictate it, I think a
process should listen for and perform incoming calls even while it is
waiting for a call of its own to return).

There are performance considerations, of course. CORBA is likely to be
somewhat slower than gimpprotocol. However, I expect the exported
operations to be relatively high-level, so there won't be too many
calls.

One cannot do a global "change pointers to object references"
translation in the core code, though. I think that a sensible
compromise is to have separate wrappers for most of the high level
operations (such as those on images and layers). Each wrapper requires
and checks that all its object arguments are local to its process, then
retrieves the implementations (the actual object pointers), and passes
those to the actual routines. Currently, only the gimp core actually
implements images and layers, so this is not much of a restriction.
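
A rough sketch of such a wrapper, in plain C with no ORB involved. The
ImageRef struct is a hypothetical stand-in for a CORBA object reference,
and image_ref_get_local stands in for whatever the ORB offers for
finding the local implementation behind a reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a real ORB would supply the object reference
   type and the means of finding the local implementation behind it. */
typedef struct { int width, height; } Image;   /* the actual object */

typedef struct {
    int    is_local;  /* nonzero if implemented in this process */
    Image *impl;      /* valid only when is_local is set */
} ImageRef;

typedef enum { CALL_OK, CALL_NOT_LOCAL } CallResult;

/* The actual routine, working on plain pointers as the core does now. */
static void image_scale(Image *image, int factor)
{
    image->width  *= factor;
    image->height *= factor;
}

/* Retrieve the local implementation, or NULL for a remote object. */
Image *image_ref_get_local(const ImageRef *ref)
{
    assert(ref != NULL);
    return ref->is_local ? ref->impl : NULL;
}

/* The wrapper: check locality, unwrap, then call the real routine. */
CallResult image_scale_wrapper(const ImageRef *ref, int factor)
{
    Image *image = image_ref_get_local(ref);

    if (image == NULL)
        return CALL_NOT_LOCAL;

    image_scale(image, factor);
    return CALL_OK;
}
```

The core routines stay untouched, and a remote object shows up as a
clean error at the wrapper instead of a bad pointer deeper down.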

However, the idea is to make most of the non-UI code in gimp into a
library (or a couple), and this would allow extensions to actually
implement images and stuff in their own address space. I have no idea
whether this is useful, or how much trouble the "local objects only"
constraint will cause then. (Well, you can always copy the image data
completely..)

Which brings me to another point: mass data transfer. Extensions,
especially filter plug-ins, require the moving of lots of image data
back and forth. Currently, this is relatively fast, because shmem is
used. Although CORBA by default supplies only a socket-based
communication medium, I think we can either use shmem by hand, or
(preferably) write a shmem-based GIOP implementation for ORBit.

Alternatively, the extension could make a local copy of the entire
image it operates upon and use the gimp operations locally (from a
lib) on it.

The place where low data transfer latency is absolutely essential is
in brushes, because these have to be completely interactive. The idea is
to make these distributed as well. I'm not sure whether it's possible,
even with shmem, to achieve sufficient transfer speeds between address
spaces. There has also been an idea of dynlinking the brushes to the
core (here "core" probably meaning the process implementing the
GUI). This wouldn't be a very portable solution (though in systems
that don't provide dlopen, this would simply mean the set of brushes
is fixed at compile time, not a horrible restriction). Also, I don't
know how this would interface with CORBA. Brushes are supposed to be
scriptable, too, so they have to have some sort of external
interface...

Then there's the question of how to implement all this. I think we can
assume that (in the core at least) CORBA interfaces correspond to GTK+
classes. Creating automated wrappers would not be impossible. The idea
is also to autogenerate the gtk stubs (all right, skeletons), so these
might be combined somehow: from one source file we would generate the
IDL (and from that, the CORBA stubs and skeletons), the
implementation's C headers, some of the C source (there's lots in the
implementation of a gtk+ object that can be automated), and the
wrappers between the CORBA skeleton and the gtk-based C functions.

As for schedule: I think this is the thing that defines when we are at
2.0. Don't ask me when we should start thinking of that. Anyway, I
think that when this is started, the first thing to do would be to rip
out PDB and abolish libgimp. So plug-ins and extensions won't work at
that phase.

Given good autogeneration, the CORBA wrappers themselves won't need to 
be written by hand. What needs to be done is to make as much of gimp's 
core as possible modular and gtk-based first.

As for the actual CORBA-specific code that needs to be written, I have
no idea at this time. Estimates, anyone? Starting extensions on demand
needs a special AdapterActivator, I think, and providing
human-readable run-time documentation might need some code too. I have
no idea where to place the IR: in the core, in its own process, or in
a global ird or something? Then, for writing extensions, there needs
to be a replacement for the current libgimp, which has lots of
convenience functions for eg. contacting the core, and maybe a varargs 
function for easier dynamic invocation.. IIRC, the C mapping has no
easy varargs stuff, but dynamic invocation needs all the tedious
CORBA::Request stuff.. There's probably lots more, but I hope someone
with actual experience with CORBA will tell what...


That about sums up all I have to say. At least I'm too tired to get
anything else on my mind right now..

If I get no comments I'll become a solipsist!


Lauri Alanko
la@iki.fi


