metadata update



Hi,

I'm quite busy right now preparing for my last exam before the diploma
thesis, but I'm still working on the metadata side of beagle. So I
thought I might share what I've been doing lately to hear what you think
about it.

I changed the whole SQLite part to use IDs referring to tables holding
the URIs, values, etc., mainly in order to save disk space.
Now a lot of the queries have turned into select statements on multiple
tables:
	SELECT ... FROM instances AS i, statements AS s ...
	WHERE i.id = s.instance_id AND i.uri = 'blablabla' ...

Most of this is done inside SQLite, and I think that has a huge
performance benefit over issuing separate select statements like this:

	uint id = ... "SELECT id FROM instances WHERE uri = 'blablabla'" ...

	"SELECT ... FROM statements ...
	WHERE instance_id = {0} ...", id ...

SQLite's internal query optimization should apply the restriction from
the i.uri = ... WHERE clause early, before joining the tables. So the
join itself should not cause performance problems, and we still save
time because we don't switch twice between the SQLite context and the
beagle context.
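
To make the comparison concrete, here is a minimal sketch of both
variants using Python's sqlite3 module rather than the actual beagle
code. Only the table names instances and statements and the columns id,
uri and instance_id come from the queries above; the predicate and value
columns, the sample rows and the function names are assumptions made for
illustration.

```python
import sqlite3


def build_store():
    # Hypothetical schema: predicate/value columns and the rows are assumed.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE instances (id INTEGER PRIMARY KEY, uri TEXT UNIQUE);
        CREATE TABLE statements (
            id INTEGER PRIMARY KEY,
            instance_id INTEGER REFERENCES instances(id),
            predicate TEXT,
            value TEXT
        );
    """)
    conn.executemany(
        "INSERT INTO instances (id, uri) VALUES (?, ?)",
        [(1, "mail:1"), (2, "mail:2")],
    )
    conn.executemany(
        "INSERT INTO statements (instance_id, predicate, value) VALUES (?, ?, ?)",
        [(1, "from", "alice"), (1, "subject", "hello"), (2, "from", "bob")],
    )
    return conn


def statements_joined(conn, uri):
    # One statement: the uri lookup and the join both happen inside SQLite.
    return conn.execute(
        "SELECT s.predicate, s.value "
        "FROM instances AS i, statements AS s "
        "WHERE i.id = s.instance_id AND i.uri = ? "
        "ORDER BY s.id",
        (uri,),
    ).fetchall()


def statements_two_step(conn, uri):
    # Two statements: the id travels back to the caller in between.
    row = conn.execute(
        "SELECT id FROM instances WHERE uri = ?", (uri,)
    ).fetchone()
    if row is None:
        return []
    return conn.execute(
        "SELECT predicate, value FROM statements "
        "WHERE instance_id = ? ORDER BY id",
        (row[0],),
    ).fetchall()
```

Both functions return the same rows; the difference is only in how often
control passes between the caller and SQLite. The sketch does not measure
anything, so how large the saving is in practice remains to be tested.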

Some of the selects could be avoided if the IDs of the different
entities were stored within the objects we use in beagle.
If, for example, beagle Properties also contained ID fields for their
values, fetching the ID from the instances table would not be necessary
when one wants to use that value later (for example in order to show
more properties of the contact that sent an email).
I still decided to keep the whole beagle - SQL interface string-only
for two reasons:
- encapsulation of the SQL implementation - maybe we will use different
backends at some point that don't use IDs like this one.
- problems with ID changes - so far all IDs autoincrement. So removed
objects lose their ID, but new ones don't reuse that old ID. We might
have to add some optimisation later on to reuse old IDs. This could
mean that IDs change, so we might run into consistency problems
if the IDs are also used in beagle itself.
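
The ID behaviour described in the second point can be reproduced with a
small sketch, again in Python's sqlite3 module. It assumes the tables
are declared with SQLite's AUTOINCREMENT keyword, which the mail does
not state explicitly; the function name and the sample rows are made up
for illustration.

```python
import sqlite3


def next_id_after_delete(autoincrement):
    # Assumption: a cut-down instances table with only id and uri columns.
    conn = sqlite3.connect(":memory:")
    keyword = " AUTOINCREMENT" if autoincrement else ""
    conn.execute(
        f"CREATE TABLE instances (id INTEGER PRIMARY KEY{keyword}, uri TEXT)"
    )
    conn.execute("INSERT INTO instances (uri) VALUES ('a')")  # gets id 1
    conn.execute("INSERT INTO instances (uri) VALUES ('b')")  # gets id 2
    conn.execute("DELETE FROM instances WHERE uri = 'b'")     # id 2 is now free
    cur = conn.execute("INSERT INTO instances (uri) VALUES ('c')")
    return cur.lastrowid


print(next_id_after_delete(True))   # 3: the deleted id is not handed out again
print(next_id_after_delete(False))  # 2: largest remaining id plus one
```

Without AUTOINCREMENT, SQLite picks the largest existing id plus one, so
an ID can come back after its object was removed. That is exactly the
case in which IDs cached inside beagle objects could end up pointing at
the wrong entry.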

As for the state of the project: I am advancing much more slowly than
during SoC, but I still see some things improving. Right now I am
working on the store side of things only, using unit tests to test it,
and I have not plugged it into beagle for some weeks. I do have some
code running that finds links between objects based on the metadata
model.
For example, indexing a mail from someone who is in the address book
would create a link from the mailfrom statement to the address book
entry at indexing time.
I hope to clean that up and test it with beagle in the next week.
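
The address book example could look roughly like the following sketch.
It is not the actual store code: the link_id column, the "email"
predicate, the URIs and the function names are all assumptions, and only
the instances/statements tables and the mailfrom statement are taken
from this mail.

```python
import sqlite3


def make_store():
    # Hypothetical schema; link_id points at the linked instance, if any.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE instances (id INTEGER PRIMARY KEY, uri TEXT UNIQUE);
        CREATE TABLE statements (
            id INTEGER PRIMARY KEY,
            instance_id INTEGER NOT NULL REFERENCES instances(id),
            predicate TEXT NOT NULL,
            value TEXT,
            link_id INTEGER REFERENCES instances(id)
        );
    """)
    return conn


def add_statement(conn, uri, predicate, value):
    """Store one statement and return the id of the linked instance, or None."""
    conn.execute("INSERT OR IGNORE INTO instances (uri) VALUES (?)", (uri,))
    instance_id = conn.execute(
        "SELECT id FROM instances WHERE uri = ?", (uri,)
    ).fetchone()[0]

    link_id = None
    if predicate == "mailfrom":
        # At indexing time, look for an entry that already carries
        # the sender's address under the assumed "email" predicate.
        row = conn.execute(
            "SELECT instance_id FROM statements "
            "WHERE predicate = 'email' AND value = ? LIMIT 1",
            (value,),
        ).fetchone()
        if row is not None:
            link_id = row[0]

    conn.execute(
        "INSERT INTO statements (instance_id, predicate, value, link_id) "
        "VALUES (?, ?, ?, ?)",
        (instance_id, predicate, value, link_id),
    )
    return link_id
```

Indexing the contact first and the mail afterwards stores the contact's
instance ID in the mail's mailfrom statement. A mail indexed before its
sender is known gets no link in this sketch; handling that ordering
would need an extra pass.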

How about the other SoC projects? I'd love to see networked search work.
We have some computers running Linux here in our flat, so a distributed
search would be really cool. And where can I get the current version of
dashboard? I don't think I will have time to really get into it, but I'd
love to just take a look at how it works so far...

Max





