FdL, some interesting questions from Rebecca

From: Wolfgang Mueller <Wolfgang Mueller cui unige ch>
To: gnome-kde-list gnome org
Subject: FdL, some interesting questions from Rebecca
Date: Mon, 19 Mar 2001 16:11:20 +0100 (CET)
Hi,

>>>>> "RS" == Rebecca Schulman <rebecka eazel com> writes:

RS> It would be cool if you could explain what technologies you
RS> already have developed, and some others I haven't heard of.  
RS> If it's not clear, I guess knowing more about your expectations
RS> of how technologies you've been talking about would fit 
RS> into/create the larger interface you've described.

OK, pretty big question.

RS> Specifically, what is the GIFT? What does it do, how does it work, why does it
RS> work?

The GIFT is the GNU Image Finding Tool. It automatically extracts
visual features from images and translates each image into a sequence of
integers (the IDs of the visual features) with associated floats
(their weights). The resulting documents are structurally
equivalent to text (I call them pseudo-words making up a pseudo-text). This
permits us to index them like text in an inverted file using
statistical weighting methods (for those of you who do not know what an
inverted file is, just remember that this is THE way of indexing text).

Now we can take any image, do the same feature extraction on it, and
use this as an inverted file query by example. Note: you give an
example image, you retrieve visually similar images, which are likely
to be semantically similar to the query. We can also give feedback by
marking some of the result images as positive or negative.
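
To make the pseudo-text idea concrete, here is a minimal Python sketch.
It is an illustration only, not the GIFT's code: the feature IDs, the
weights, the file names and the plain weighted-sum scoring are all
assumptions chosen for brevity, and the real statistical weighting is
more involved.

```python
from collections import defaultdict

def build_inverted_file(collection):
    """Map each feature ID to the (image, weight) pairs that contain it."""
    index = defaultdict(list)
    for image, features in collection.items():
        for feature_id, weight in features.items():
            index[feature_id].append((image, weight))
    return index

def query(index, positives, negatives=()):
    """Score images against positive and negative example images.

    Each example is a dict {feature_id: weight}. Positive examples add
    to the query weights, negative examples subtract from them.
    """
    query_weights = defaultdict(float)
    for features in positives:
        for feature_id, weight in features.items():
            query_weights[feature_id] += weight
    for features in negatives:
        for feature_id, weight in features.items():
            query_weights[feature_id] -= weight

    # Only images sharing at least one feature with the query are touched.
    scores = defaultdict(float)
    for feature_id, q_weight in query_weights.items():
        for image, weight in index.get(feature_id, ()):
            scores[image] += q_weight * weight
    # Best score first; ties are broken by file name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical collection: image name -> {feature ID: weight}
collection = {
    "sunset.jpg": {1: 0.9, 2: 0.4},
    "beach.jpg": {1: 0.7, 3: 0.5},
    "forest.jpg": {4: 0.8, 3: 0.2},
}
index = build_inverted_file(collection)

# Query by example, then refine by marking beach.jpg as negative.
first = query(index, [collection["sunset.jpg"]])
refined = query(index, [collection["sunset.jpg"]],
                negatives=[collection["beach.jpg"]])
```

In this toy collection the first query only touches the two images
that share a feature with the example, and the negative feedback pushes
beach.jpg down to the bottom of the refined ranking.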

This works pretty well for some stuff and not so well for other stuff.
In any case, it is more flexible than fixed hierarchies, and in many
(most?) cases better than sifting through the collection by hand.

[Just as an aside: credit for the features-to-inverted-file idea goes
to David Squire, my ex-boss]

RS> Also, what is MRML?

As content-based image retrieval (this is the term for what the GIFT
is doing) is an area of active research, the challenge is to
find a client-server communication protocol that can be extended
in unforeseen directions. MRML is an XML-based communication protocol
designed with this purpose in mind. MRML is an XML wrapper for just
about any kind of query language you might imagine.
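
As a rough illustration of why an XML wrapper extends well, here is a
small Python sketch. Note that the element and attribute names
("message", "query", "example", "experimental-hint") are made up for
this example and are NOT the MRML vocabulary. The point is only that a
reader which looks for the elements it knows keeps working when a
sender adds new ones.

```python
import xml.etree.ElementTree as ET

def wrap_query(session_id, example_url, relevance=1):
    """Wrap a query-by-example in a small XML message (hypothetical names)."""
    root = ET.Element("message", {"session": session_id})
    query = ET.SubElement(root, "query")
    ET.SubElement(query, "example",
                  {"location": example_url, "relevance": str(relevance)})
    return ET.tostring(root, encoding="unicode")

def read_query(xml_text):
    """Return (location, relevance) pairs, ignoring unknown elements."""
    root = ET.fromstring(xml_text)
    return [(e.get("location"), int(e.get("relevance")))
            for e in root.iter("example")]

message = wrap_query("s1", "file:///home/me/photo.jpg")

# A sender experimenting with a new feature adds an element that the
# reader has never heard of.
extended_root = ET.fromstring(message)
ET.SubElement(extended_root, "experimental-hint", {"colour-space": "hsv"})
extended = ET.tostring(extended_root, encoding="unicode")

plain_result = read_query(message)
extended_result = read_query(extended)
```

Both messages yield the same query for the reader, which is the kind of
forward compatibility a research protocol needs.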

In the future, MRML is likely to be used by quite a number of CBIRS
researchers. We are just starting to get partners, so I think things
are well under way.

The advantage for FdL in using MRML would be that if we make a really
good and flexible query engine, this engine could easily be compared to
other systems, and in the same way, it would be easy to incorporate
"foreign" query engines.

In the short term: the GIFT already uses MRML, so why not keep that?

RS> You are right to see that I haven't been very interested 
RS> in statistical searching methods. I have done some work with 
RS> them, and my undergraduate thesis was actually on using statistical
RS> language information from corpora to find relevant documents.  
RS> I'm not philosophically opposed to them, but I want to create 
RS> technology that is easy to use, and one of the criteria of this, 
RS> in my opinion, is that its behavior be something that you can
RS> become more familiar with over time.  I haven't really
RS> seen statistical methods that meet this criterion.  I'd be 
RS> glad to hear your feelings on the matter.

I think rule-based stuff is good for constrained areas, e.g. if you
restrict yourself to the English language, and maybe a certain subset of
it. (Or, to give an image processing example: if you limit yourself to
images of soccer games etc.) In these cases, trying to completely
understand the given image/text will get you very far. In other cases,
statistical information processing "cheats" in a way that is more
successful than "seriously" trying to understand.

In any case, true image understanding is still unheard of in this
world. So in this area we are limited to statistical methods.

I do not think it is feasible to deliver perfection at the current
state of the art. However, we can do things which are *far better* than
doing nothing at all, and we can do things which are better than
anything a desktop environment has seen so far.

--
To come back to the smaller question that is interesting for you and me:

Medusa/GIFT/FdL??

The GIFT is explicitly designed to have multiple query processors
evaluate the same query and then merge the results. This should work
with just about anything. So a paradigm mix might create some work, but
it would rather be a proof of concept of the architecture.
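
To give an idea of what merging could look like, here is a minimal
Python sketch. It assumes each query processor returns a list of
(document, score) pairs on its own scale. The normalisation by the top
score and the per-engine weights are my assumptions for this example,
not a description of how the GIFT actually merges.

```python
def merge_results(result_lists, weights=None):
    """Merge ranked lists from several query processors.

    Each list holds (document, score) pairs. Scores are divided by the
    best score of their own list so that engines with different score
    scales become comparable, then summed with a per-engine weight.
    """
    if weights is None:
        weights = [1.0] * len(result_lists)
    merged = {}
    for results, engine_weight in zip(result_lists, weights):
        if not results:
            continue
        top = max(score for _, score in results)
        for document, score in results:
            normalised = score / top if top > 0 else 0.0
            merged[document] = merged.get(document, 0.0) + engine_weight * normalised
    # Best combined score first; ties are broken by document name.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical outputs of an image engine and a text engine.
image_engine = [("a.jpg", 0.8), ("b.jpg", 0.4)]
text_engine = [("b.jpg", 10.0), ("c.txt", 5.0)]

merged = merge_results([image_engine, text_engine])
```

Here b.jpg ends up first because both engines returned it, although
neither the image engine nor the raw score scales would have put it
there on their own.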

--
The big question or: how do I imagine things:

Konqueror/Nautilus (in alphabetical order) are able to present
documents by their thumbnails etc. This is information we could
use.

What I would like to have in the beginning would be a:

"Find similar documents" 

popup menu item in the file manager. If this is chosen, the URL of the
image will be sent to the GIFT via MRML, the GIFT will retrieve
similar images (or texts given a text retrieval engine), the file
manager will show them as thumbnails, (possibly with scores), and it
will give the possibility to enhance the query using relevance
feedback (marking images positive and negative), leading to other
thumbnails with other scores.

Imagine the same stuff in the GNOME/KDE file-open dialogue, and in the
"recently used files" menus of applications and of the desktop.

Of course, it would be good for the search engine to be notified of
saved files, if we do not want to do a find every now and then.

This would create some work for the desktop people, but a large chunk
of the work would be to make the GIFT flexible enough to do queries on
any kind of multimedia data, and not only on images. A first step
might be to let the GIFT and a text retrieval engine live "side by
side" and merge the results, but I guess in the long term it would be
best to have a search engine which tries to combine the maximum amount
of cues in one search process. 

Cheers,
Wolfgang

-- 
Wolfgang Müller,
assistant-doctorant ==  PhD student (2001), teaching assistant
Personal page: http://cui.unige.ch/~vision/members/WolfgangMueller.html 
Maintainer, GNU Image Finding Tool (http://www.gnu.org/software/gift)