[Evolution-hackers] 32 bit IDs in contact file backend



Hello!

As part of wrapping QtContacts around EDS [1] I ran into the same issue
that Nokia already encountered in their Maemo 5 [2] backend: EDS uses
strings as ID, QtContacts 32 bit integers.

Nokia solved that by setting up an in-memory hash which maps the UID
string to its CRC-16. I don't see any checks for the (possible) hash
collisions. IMHO a proper solution has to keep a permanent mapping on
disk, otherwise the 32 bit IDs won't be stable.

Overall not a nice solution. My attempt to make it nicer at least in
combination with the file backend (the main goal for QtContacts-EDS)
focused on simplifying the problem instead: with 32 bit IDs in the
storage, the mapping becomes easy.

Attached the resulting patch. Note that with the patch applied, all new
contacts in a Berkley DB get the simpler IDs, unconditionally. Older
contacts continue to use their existing IDs. Would something like this
be acceptable upstream?

Further work if you agree in principle:
      * let clients query whether all contacts have the simplified ID -
        could be done with the dynamic capabilities that I mentioned in
        the e-client API thread; avoids reading all contact (UIDs)
      * optional: migrate database to simplified IDs on request

I'm not sure whether the second point really needs to be in EDS.

Obviously it should only be done when the impact on the rest of the
system is understood. For example, in a database that is synchronized
with SyncEvolution, such a rewriting of IDs will result in removing and
re-adding all contacts in the synchronized peers because that is what
it'll look like to other EDS clients.

This is also how the migration can be implemented without changes in
EDS: just remove and re-add all contacts.

[1] https://meego.gitorious.org/meego-middleware/qtcontacts-eds
[2] https://www.qt.gitorious.org/qt-mobility/contacts/trees/master/plugins/contacts/maemo5

-- 
Bye, Patrick Ohly
--  
Patrick Ohly gmx de
http://www.estamos.de/

>From 65438bd3c4b706535f83304d94b3c018c8899a5d Mon Sep 17 00:00:00 2001
From: Patrick Ohly <patrick ohly intel com>
Date: Fri, 18 Mar 2011 16:07:23 +0100
Subject: [PATCH] addressbook file backend: use 32 bit integers as IDs

The motiviation for this change is primarily that it'll simplify
wrapping EDS in QtContacts, which only supports 32 bit integers for
the storages that it wraps.

Previously the ID was generated as a combination of time and running
counter, without checking for conflicts. This patch changes this such
that the ID is only a running counter. ID conflicts are detected during
an attempt to save, leading to retries with the next counter value.

The "for future use" members of _EBookBackendFilePrivate are removed
because they serve no useful purpose in a structure which is allocated
and used only locally.
---
 addressbook/backends/file/e-book-backend-file.c |   48 +++++++++++++++-------
 1 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/addressbook/backends/file/e-book-backend-file.c b/addressbook/backends/file/e-book-backend-file.c
index dc6a703..d21c527 100644
--- a/addressbook/backends/file/e-book-backend-file.c
+++ b/addressbook/backends/file/e-book-backend-file.c
@@ -77,11 +77,7 @@ struct _EBookBackendFilePrivate {
 	DB       *file_db;
 	DB_ENV   *env;
 	EBookBackendSummary *summary;
-	/* for future use */
-	gpointer reserved1;
-	gpointer reserved2;
-	gpointer reserved3;
-	gpointer reserved4;
+	guint     id;
 };
 
 G_LOCK_DEFINE_STATIC (global_env);
@@ -169,13 +165,13 @@ build_summary (EBookBackendFilePrivate *bfpriv)
 }
 
 static gchar *
-e_book_backend_file_create_unique_id (void)
+e_book_backend_file_create_next_id (EBookBackendFile *bf)
 {
-	/* use a 32 counter and the 32 bit timestamp to make an id.
-	   it's doubtful 2^32 id's will be created in a second, so we
-	   should be okay. */
-	static guint c = 0;
-	return g_strdup_printf (PAS_ID_PREFIX "%08lX%08X", time(NULL), c++);
+	/* use a 32 counter, avoid 0; collision checks must be done by caller */
+	++bf->priv->id;
+	if (!bf->priv->id)
+		bf->priv->id = 1;
+	return g_strdup_printf (PAS_ID_PREFIX "%04X", bf->priv->id);
 }
 
 static void
@@ -210,7 +206,8 @@ do_create(EBookBackendFile  *bf,
 	g_assert (vcard_req);
 	g_assert (contact);
 
-	id = e_book_backend_file_create_unique_id ();
+ retry:
+	id = e_book_backend_file_create_next_id (bf);
 
 	string_to_dbt (id, &id_dbt);
 
@@ -224,9 +221,10 @@ do_create(EBookBackendFile  *bf,
 
 	string_to_dbt (vcard, &vcard_dbt);
 
-	db_error = db->put (db, NULL, &id_dbt, &vcard_dbt, 0);
+	db_error = db->put (db, NULL, &id_dbt, &vcard_dbt, DB_NOOVERWRITE);
 
 	g_free (vcard);
+	g_free (id);
 
 	if (0 == db_error) {
 		db_error = db->sync (db, 0);
@@ -234,12 +232,16 @@ do_create(EBookBackendFile  *bf,
 			g_warning ("db->sync failed with %s", db_strerror (db_error));
 		}
 	} else {
-		g_warning (G_STRLOC ": db->put failed with %s", db_strerror (db_error));
 		g_object_unref (*contact);
 		*contact = NULL;
+		if (DB_KEYEXIST == db_error) {
+			/* did not guess a new ID, try again */
+			goto retry;
+		} else {
+			g_warning (G_STRLOC ": db->put failed with %s", db_strerror (db_error));
+		}
 	}
 
-	g_free (id);
 	db_error_to_gerror (db_error, perror);
 
 	return db_error == 0;
@@ -1108,6 +1110,7 @@ e_book_backend_file_load_source (EBookBackend           *backend,
 	DB_ENV *env;
 	time_t db_mtime;
 	struct stat sb;
+	DB_HASH_STAT *hashstat;
 
 	dirname = e_book_backend_file_extract_path_from_source (source);
 	filename = g_build_filename (dirname, "addressbook.db", NULL);
@@ -1290,6 +1293,21 @@ e_book_backend_file_load_source (EBookBackend           *backend,
 	}
 	db_mtime = sb.st_mtime;
 
+	/*
+	 * Picking the number of records as base for the next ID avoids trying out
+	 * lots of already used IDs in a database with few deleted contacts.
+	 * Optional optimization, if this fails or returns no information, we simply
+	 * start with zero.
+	 *
+	 * Note that DB_FAST_STAT is not used, because without it we won't get
+	 * hash_nkeys. This makes the stat call slower, but should be worth it
+	 * because we need to access the content sooner or later.
+	 */
+	if (!bf->priv->file_db->stat(bf->priv->file_db, NULL, &hashstat, 0)) {
+		bf->priv->id = hashstat->hash_nkeys;
+		g_free(hashstat);
+	}
+
 	g_free (bf->priv->summary_filename);
 	bf->priv->summary_filename = g_strconcat (bf->priv->filename, ".summary", NULL);
 	bf->priv->summary = e_book_backend_summary_new (bf->priv->summary_filename, SUMMARY_FLUSH_TIMEOUT);
-- 
1.7.2.5



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]