Re: [Evolution-hackers] EBookBackendSqliteDB comments
- From: Chenthill <pchenthill gmail com>
- To: sean finney <seanius seanius net>
- Cc: evolution-hackers gnome org
- Subject: Re: [Evolution-hackers] EBookBackendSqliteDB comments
- Date: Mon, 09 May 2011 09:01:20 +0530
On Fri, 2011-05-06 at 10:01 +0200, sean finney wrote:
> Hi Milan,
>
> On Fri, May 06, 2011 at 08:56:10AM +0200, Milan Crha wrote:
> > > As I already told seanius on IRC, I will be evaluating the performance
> > > of storing vcards as files vs. storing them in the db, and then choose
> > > whichever works best. So the code for both will be there and we can
> > > choose between them after testing. I was also thinking of providing
> > > it as an option for the backends to choose once I complete the testing.
> > > So what we discussed stays the same :)
http://git.gnome.org/browse/evolution-ews/tree/src/addressbook/e-book-backend-sqlitedb.h
has the APIs. The metadata APIs are a work in progress.
> >
> > This is not only about performance, my main concerns are these:
> > a) if something fails with db file, user's data are safe
>
> > b) users can take their contacts anytime and import them on another
> > machine, in case of hard disk crash, partial backup or anything like
> > that
>
> I think we should stop and consider two different motivations for this
> API. (1) Local addressbook (2) Local cache of remote addressbook. For
> case (1), I agree that having the items split out could be useful and
> a good safeguard against any db corruption (though my experience thus far
> with sqlite is fairly positive).
>
> For case (2), I would say if there's a problem with the file just nuke
> it and reload it from the remote store. Since you can guarantee that
> you can get a "working copy" of the info, you can then rely on the existing
> UI (or sqlite, or the remote service, or whatever), for exporting the
> contacts. It is a *cache* after all :)
>
> So for something like GAL (or any cached-from-remote addressbooks),
> I think it makes a lot of sense to *not* split out the contacts, at
> least as long as performance doesn't suffer by having more items in
> the sqlitedb file.
I want to compare the performance of the two methods on address-books
that hold a large amount of data and choose whichever suits best.
If it turns out that there is a big difference between the two, I will
document that and let the backends choose how they want to store the
data.
>
> > c) folders.db files tend to grow "indefinitely". That's another point
> > why I do not like "one file per account".
>
> I'd like to clarify a detail of the API from having looked over it wrt
> evo-mapi: it's designed so that it can be used "one file per account", by
> creating a single db file and specifying the "folder" as an API parameter
> in all calls.
>
> But this means you could always create multiple db instances at different
> file locations, one per folder, and just use a junk "FOLDER" (or similar)
> name for the folder. Having looked over the current evo-mapi code, I
> think you'd want to do something like that.
>
> Of course if you think that there should *never* be a case where it's used
> one db per account, then rethinking the API would make sense, but otherwise
> nothing lost by keeping it, it gives you a way to do both.
I have made it configurable, so the clients can choose to save all the
address-books in one db or provide different paths so that they are
stored in separate db files.
>
> > An example: my evo-mapi account has 4 addressbooks (one is GAL). I would
> > really prefer to have them separated, not in one large file. Not talking
>
> And that should be possible, see above.
>
> > about possible (even unlikely) UID clashes between separate
> > addressbooks. Will it also mean that each local addressbook will be
> > stored in one large db? Please do not do that.
>
> The underlying db should deal with stuff like UID clashes, agreed. I
> think the current API does so, though I'm not convinced it's the best
> way. Currently, you have:
>
> const gchar *stmt = "CREATE TABLE IF NOT EXISTS folders \
> ( folder_id TEXT PRIMARY KEY, \
> folder_name TEXT, \
> sync_data TEXT, \
> bdata1 TEXT, bdata2 TEXT, \
> bdata3 TEXT)";
>
> stmt = sqlite3_mprintf ("CREATE TABLE IF NOT EXISTS %Q \
> ( uid TEXT PRIMARY KEY, \
> nickname TEXT, full_name TEXT, \
> given_name TEXT, family_name TEXT, \
> email_1 TEXT, email_2 TEXT, \
> email_3 TEXT, email_4 TEXT, \
> vcard TEXT)", folderid);
>
> which AIUI means a table named after every folder. Therefore the UIDs
> are already internally partitioned and will not conflict. WRT normalizing
> the database, I would suggest something more like:
>
> const gchar *stmt = "CREATE TABLE IF NOT EXISTS folders \
> ( folder_id TEXT PRIMARY KEY, \
> folder_name TEXT, \
> sync_data TEXT, \
> bdata1 TEXT, bdata2 TEXT, \
> bdata3 TEXT)";
>
> stmt = sqlite3_mprintf ("CREATE TABLE IF NOT EXISTS contacts \
> ( folder_id INT, \
> uid TEXT, \
> nickname TEXT, full_name TEXT, \
> given_name TEXT, family_name TEXT, \
> email_1 TEXT, email_2 TEXT, \
> email_3 TEXT, email_4 TEXT, \
> vcard TEXT, \
> PRIMARY KEY (folder_id, uid) )" );
>
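To make the UID-clash point concrete, here is a minimal sketch of the normalized layout. It uses Python's built-in sqlite3 module rather than the C API, purely to keep it short. The trimmed column list, the add_contact helper and the folder ids "personal" and "gal" are made up for illustration, and folder_id is declared TEXT here so that it matches the folders table, which is an assumption rather than what the proposal above says.

```python
import sqlite3

# In-memory database; the column list is trimmed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS contacts ("
    " folder_id TEXT, uid TEXT, full_name TEXT, vcard TEXT,"
    " PRIMARY KEY (folder_id, uid))"
)


def add_contact(folder_id, uid, full_name):
    """Insert one contact; return False if (folder_id, uid) already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO contacts (folder_id, uid, full_name)"
                " VALUES (?, ?, ?)",
                (folder_id, uid, full_name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The same UID in two different folders is accepted...
first = add_contact("personal", "uid-1", "Alice Example")
second = add_contact("gal", "uid-1", "Alice Example")
# ...but a duplicate within one folder is rejected by the composite key.
duplicate = add_contact("gal", "uid-1", "Someone Else")
```

So the composite key gives the same per-folder UID partitioning as the table-per-folder layout, just enforced by a constraint instead of by table names.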
On address-book deletion, dropping a table is far better than querying
and deleting all the contacts that match a folder id, although
address-books are probably not deleted very often.
So I did a quick search and found this,
http://stackoverflow.com/questions/784173/what-are-the-performance-characteristics-of-sqlite-with-very-large-database-files
which suggests that using multiple tables is better. I have not personally done
any tests regarding this.
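For comparison, a rough sketch of what address-book deletion looks like under each layout, again using Python's sqlite3 module for brevity. The folder ids, the quote_ident helper and the placeholder vcard text are invented for the example, and it only shows the shape of the two operations; it does not measure anything.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
folders = ("folder_a", "folder_b")


def quote_ident(name):
    """Quote a folder id so it can be used as a table name."""
    return '"' + name.replace('"', '""') + '"'


# Layout 1: one table per folder, as in the current code.
for folder in folders:
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {quote_ident(folder)} "
        "(uid TEXT PRIMARY KEY, vcard TEXT)"
    )
    conn.executemany(
        f"INSERT INTO {quote_ident(folder)} (uid, vcard) VALUES (?, ?)",
        [(f"uid-{i}", "placeholder vcard") for i in range(100)],
    )

# Removing an address book is a single DROP TABLE.
conn.execute(f"DROP TABLE {quote_ident('folder_a')}")

# Layout 2: one shared table keyed by (folder_id, uid).
conn.execute(
    "CREATE TABLE contacts (folder_id TEXT, uid TEXT, vcard TEXT,"
    " PRIMARY KEY (folder_id, uid))"
)
conn.executemany(
    "INSERT INTO contacts (folder_id, uid, vcard) VALUES (?, ?, ?)",
    [(f, f"uid-{i}", "placeholder vcard") for f in folders for i in range(100)],
)

# Removing an address book means deleting every row with that folder id.
deleted = conn.execute(
    "DELETE FROM contacts WHERE folder_id = ?", ("folder_a",)
).rowcount

remaining_tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
remaining_rows = conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
```

Timing both paths on a large address-book would be the way to confirm whether the difference matters in practice.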
> As an extra bonus that means you could do autocomplete type
> queries in a single SQL query.
AFAIK with the current design of eds, each address-book is queried
separately, so it would not benefit from this.
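For reference, the single-query autocomplete sean describes might look roughly like this with a shared contacts table. This is only a sketch in Python's sqlite3 module: the autocomplete function, the sample rows and the choice of matching on full_name and email_1 are all assumptions, and whether eds could use such a query depends on the per-address-book querying noted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (folder_id TEXT, uid TEXT, full_name TEXT,"
    " email_1 TEXT, PRIMARY KEY (folder_id, uid))"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?)",
    [
        ("personal", "uid-1", "Alice Example", "alice@example.org"),
        ("personal", "uid-2", "Bob Example", "bob@example.org"),
        ("gal", "uid-1", "Alan Example", "alan@example.org"),
    ],
)


def autocomplete(prefix):
    """Return (folder_id, full_name) matches from every folder in one query."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = (
        prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    pattern = escaped + "%"
    rows = conn.execute(
        "SELECT folder_id, full_name FROM contacts"
        " WHERE full_name LIKE ? ESCAPE '\\' OR email_1 LIKE ? ESCAPE '\\'"
        " ORDER BY full_name",
        (pattern, pattern),
    )
    return rows.fetchall()


matches = autocomplete("al")
```

With one table per folder, the same lookup needs one query per table, or a UNION built at run time from the list of folder ids.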
- Chenthill.
>
>
>
> sean
> _______________________________________________
> evolution-hackers mailing list
> evolution-hackers gnome org
> To change your list options or unsubscribe, visit ...
> http://mail.gnome.org/mailman/listinfo/evolution-hackers