Re: [Shotwell] Repairing the event field in the database

There is a serious problem with the script that I posted in this list, which also appears to be the cause of the database problems that I have had. The problem is the row:

photo_df.to_sql("PhotoTable", con, if_exists='replace', index=False)

which will cause the PhotoTable to have the wrong "schema". The problem is that the Id variable will not be a "primary key", which will cause it not to autoincrement when inserting new photos into the database, and instead get a NULL value.

pandas provide an easy way to remedy this by the dtype keyword that was introduced in pandas 0.16.0:

dtype = {
  "filename": "TEXT UNIQUE NOT NULL",
photo_df.to_sql("PhotoTable", con, if_exists='replace', index=False, dtype=photo_dtype)

Imo, it is a bug that shotwell does not validate the schema of the database, when opening it. It would have saved me plenty of time if it had...

I hope to create a github repo in the next couple of days with scripts for directly accessing the shotwell database.


On Wed, Jun 10, 2015 at 10:48 PM, Dov Grobgeld <dov grobgeld gmail com> wrote:
Here is my solution to the event field illegal value. Again, I don't why the database was corrupted, but after the running the script below, I can once again browse my photos by date. I assume that next time I import photos, the database will again be corrupted, but at least now I know to repair it.

Perhaps the script below is usable to someone else. I plan to create a github repo with all my scripts. What is nice about accessing the database directly with scripts is that you can do queries that currently can not be done in shotwell. E.g. "Create a new tag 'joe-cat' that contain all images taged with 'joe' AND tagged with 'cat' that are less than 3 years old'.



# Update the event field in the shotwell database.

import sqlite3, pandas, json, time, datetime, os

def id_to_thumb(id):
  return 'thumb%016x'%id

def thumb_to_id(thumb):
  return int(thumb.replace('thumb',''),16)

con = sqlite3.connect('/home/dov/.local/share/shotwell/data/photo.db')
photo_df = pandas.read_sql("SELECT * from PhotoTable", con)
event_df = pandas.read_sql("SELECT * from EventTable", con)
thumb_path =  '/home/dov/.cache/shotwell/thumbs/thumbs128/'

# Create a fast lookup from an event to its timestamp.
timestamps = photo_df[['id','timestamp']].set_index('id').timestamp

date_to_event = {}
event_to_date = {}
thumbs = []
for idx,row in event_df.iterrows():
  thumb = row.primary_source_id
  event_id =
  if not 'thumb' in thumb:

  thumb_fn = thumb_path + thumb + '.jpg'
  if not os.path.exists(thumb_fn):
    print 'Oops! didn\'t find thumb
​ ​

  photo_id = thumb_to_id(thumb)
  if not id in timestamps:
    print 'Oops! Failed to find ' + str(photo_id)
  timestamp = timestamps[photo_id]
  t = time.localtime(timestamp)
  thumb_datetime =,t.tm_mon,t.tm_mday)
  date_to_event[thumb_datetime] = event_id
  event_to_date[event_id] = thumb_datetime
  thumbs += [thumb_fn]

def fix_event(row):
  global date_to_event
​    # Check if the event_id has not been set​

  if row.event_id==-1:
    t = time.localtime(row.timestamp)
    ph_datetime =,t.tm_mon,t.tm_mday)
    if ph_datetime in date_to_event:
      return date_to_event[ph_datetime]
    return -1
  # Default to the old event id
  return row.event_id

# Fix up the events
photo_df.event_id = photo_df.apply(fix_event,axis=1)

# Commit
photo_df.to_sql("PhotoTable", con, if_exists='replace', index=False)

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]