[Tracker] What to do with libtracker-common?



Hello all,

This is part of the plan to clean up Tracker and split the project into chunks that make more sense, which we've been discussing here:

https://mail.gnome.org/archives/tracker-list/2014-September/msg00004.html


The problem
-----------
One of the major hurdles is libtracker-common. The main reasons for this are:

1. It's linked to and used by many parts of Tracker, specifically (and not including unit tests):
  libtracker-bus
  libtracker-common
  libtracker-control
  libtracker-data
  libtracker-extract
  libtracker-fts
  libtracker-miner
  miners/apps
  miners/fs
  miners/rss
  miners/user-guides
  plugins/evolution
  tracker-control
  tracker-extract
  tracker-preferences
  tracker-store
  tracker-utils
  tracker-writeback
  utils/mtp
  utils/ontology
  utils/tracker-resdump
  utils/tracker-sql

2. It's internal right now, so we can't just use pkg-config to bridge the gap when we split modules from the current Tracker code base.


Work that has started
---------------------
As part of this work, I've started cleaning up the library in the branch below. There I am moving code that is only used in one place to the module that uses it, and removing code that's totally unused:

https://git.gnome.org/browse/tracker/log/?h=libtracker-common-cleanup

However, we're still left with a library that is used all over the code base. Here I want to summarise which modules/functions are used by which areas of Tracker, to figure out what to do with them.


Breakdown of libtracker-common
------------------------------

-> tracker-config-file.h

   Used for all GSettings/configs, so libtracker-data, libtracker-fts,
   tracker-store, and the miners.

-> tracker-date-time.h

   Almost exclusively used by libtracker-data, except for the last 2
   APIs, which convert between dates and strings and are used by a
   lot of modules.
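
To illustrate the kind of date/string conversion meant here, a minimal sketch follows. It assumes ISO 8601 in UTC as the string format; date_to_iso8601() is a made-up name for this mail, not the actual API in tracker-date-time.h.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for gmtime_r() in strict C modes */
#endif
#include <stddef.h>
#include <time.h>

/* Hypothetical helper, not the real libtracker-common API.
 * Writes "YYYY-MM-DDTHH:MM:SSZ" into buf.
 * Returns 0 on success, -1 on failure.
 */
int
date_to_iso8601 (time_t date, char *buf, size_t buf_len)
{
	struct tm utc;

	if (gmtime_r (&date, &utc) == NULL) {
		return -1;
	}

	/* strftime() returns 0 if the result did not fit in buf */
	if (strftime (buf, buf_len, "%Y-%m-%dT%H:%M:%SZ", &utc) == 0) {
		return -1;
	}

	return 0;
}
```

A helper this small is what makes the "link to a big library for one call" trade-off in the solutions below worth thinking about.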

-> tracker-dbus.h

   Used by all dbus based daemons, so all miners and the store.

-> tracker-enums.h
-> tracker-enum-types.h

   Used primarily for log verbosity and sched idle settings because
   multiple daemons use the same logging or scheduling settings.

-> tracker-file-utils.h

   Used to handle operations like getting file size, mtime, etc and
   also for locking files, calculating disk space remaining, etc. One
   important reason for APIs like tracker_file_open() is to make sure
   extractors open with O_NOATIME and to allow posix_fadvise() use
   consistently.
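
For anyone not familiar with why this matters, here is a rough sketch of the pattern, assuming Linux/glibc. open_for_extraction() is a hypothetical name and this is not the actual tracker_file_open() code; the choice of POSIX_FADV_SEQUENTIAL is also just an assumption for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_NOATIME on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not the real tracker_file_open().
 * Returns a read-only file descriptor, or -1 on failure.
 */
int
open_for_extraction (const char *path)
{
	int fd;

#ifdef O_NOATIME
	/* Avoid updating the access time just because we indexed the
	 * file. This only works if we own the file (or are privileged),
	 * so retry without the flag on EPERM.
	 */
	fd = open (path, O_RDONLY | O_NOATIME);
	if (fd == -1 && errno == EPERM) {
		fd = open (path, O_RDONLY);
	}
#else
	fd = open (path, O_RDONLY);
#endif

	if (fd == -1) {
		return -1;
	}

#ifdef POSIX_FADV_SEQUENTIAL
	/* Purely advisory, so the return value is ignored on purpose */
	(void) posix_fadvise (fd, 0, 0, POSIX_FADV_SEQUENTIAL);
#endif

	return fd;
}
```

Having this in one shared place means every extractor gets the same behaviour, which is the argument for keeping it in common code of some sort.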

-> tracker-ioprio.h

   Set the I/O priority to be lower than normal to avoid disk
   clobbering. Used by tracker-extract and the miners.

   Could actually be in libtracker-miner.
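
As a reminder of what lowering the I/O priority involves on Linux, here is a minimal sketch. glibc has no wrapper for ioprio_set(), so it goes through syscall(). The SKETCH_* constants mirror the kernel ABI values, and the function names are invented for this mail, they are not what tracker-ioprio.h exports.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() */
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Values taken from the Linux ioprio ABI */
#define SKETCH_IOPRIO_CLASS_SHIFT 13
#define SKETCH_IOPRIO_WHO_PROCESS 1
#define SKETCH_IOPRIO_CLASS_BE    2
#define SKETCH_IOPRIO_CLASS_IDLE  3

/* Pack a priority class and a level within that class */
int
sketch_ioprio_value (int io_class, int level)
{
	return (io_class << SKETCH_IOPRIO_CLASS_SHIFT) | level;
}

/* Hypothetical helper: try the idle class first, then fall back to
 * the lowest best-effort level. Returns 0 on success, -1 otherwise.
 */
int
lower_io_priority (void)
{
#ifdef SYS_ioprio_set
	/* A "who" of 0 means the calling thread */
	if (syscall (SYS_ioprio_set, SKETCH_IOPRIO_WHO_PROCESS, 0,
	             sketch_ioprio_value (SKETCH_IOPRIO_CLASS_IDLE, 0)) == 0) {
		return 0;
	}

	if (syscall (SYS_ioprio_set, SKETCH_IOPRIO_WHO_PROCESS, 0,
	             sketch_ioprio_value (SKETCH_IOPRIO_CLASS_BE, 7)) == 0) {
		return 0;
	}
#endif

	return -1;
}
```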

-> tracker-keyfile-object.h

   Internal and used exclusively by tracker-config-file.h. So wherever
   that ends up, so should this module.

-> tracker-language.h

   Used for stemming, stop word handling and language codes.
   Fundamental to libtracker-fts and libtracker-data.

-> tracker-locale-gconfdbus.h
-> tracker-locale.h
-> tracker-meego.h

   Used to notify and keep track of locale changes which is needed to
   re-create database collations because sorting can vary by locale
   (among other things).

   The -gconfdbus file is an implementation, which we might even be
   able to remove by now? *Is anyone still using GConf?*

   The -meego file is used to translate and get the locale on Meego.
   *Is anyone still using Meego?* Would like to remove this.

-> tracker-miner-locale.h

   Could be moved to libtracker-miner.

-> tracker-log.h

   Used by all daemons to control logging and verbosity of logging.

-> tracker-ontologies.h

   This contains a bunch of definitions for our ontology, it should
   really be *with* the ontology or part of libtracker-sparql I think.

   I would like to move this somewhere else.

-> tracker-os-dependant.h

   The tracker_spawn*() API here is only used by libtracker-data and 2
   extractors, mplayer and ps (where we use gunzip, which is quite a
   bad way to do this).

   The tracker_memory_setrlimits() is only used by tracker-extract and
   could be moved there directly or put in libtracker-extract.

   The strnlen() addition for OSes that have no implementation of this
   function is used by the MP3 extractor module and libtracker-data.

   All other API is unused and can be removed.
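
To show how little code the strnlen() fallback really is, here is a sketch. The name fallback_strnlen() is hypothetical, chosen to avoid clashing with a libc that does provide strnlen(); it is not copied from tracker-os-dependant.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for strnlen() on platforms that lack it.
 * Returns the length of s, but never looks past max_len bytes.
 */
size_t
fallback_strnlen (const char *s, size_t max_len)
{
	size_t len = 0;

	while (len < max_len && s[len] != '\0') {
		len++;
	}

	return len;
}
```

Something this size could arguably just be copied to the two places that need it.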

-> tracker-sched.h

   Used exclusively by miners (including tracker-extract which is
   technically a miner too), so could be moved to libtracker-miner.
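
For reference, switching a process to idle scheduling on Linux looks roughly like the sketch below. use_idle_scheduling() is a made-up name, and the fallback definition of SCHED_IDLE assumes the Linux value; this is not the code in tracker-sched.c.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for SCHED_IDLE on glibc */
#endif
#include <sched.h>
#include <string.h>

#ifndef SCHED_IDLE
#define SCHED_IDLE 5 /* assumption: value from the Linux headers */
#endif

/* Hypothetical helper: ask the kernel to only run us when nothing
 * else wants the CPU. Returns 0 on success, -1 otherwise.
 */
int
use_idle_scheduling (void)
{
	struct sched_param param;

	memset (&param, 0, sizeof (param));

	/* SCHED_IDLE requires a static priority of 0 */
	param.sched_priority = 0;

	/* A pid of 0 means the calling thread */
	if (sched_setscheduler (0, SCHED_IDLE, &param) != 0) {
		return -1;
	}

	return 0;
}
```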

-> tracker-storage.h

   Used by tracker-miner-fs, tracker-writeback and libtracker-miner.

-> tracker-type-utils.h

   The APIs here are mainly used by libtracker-miner and/or
   tracker-miner-fs. Some are used by libtracker-fts.

-> tracker-utils.h

   tracker_strhex() is used only in debug by libtracker-fts and should
   be moved there really.

   tracker_utf8_truncate() is used only by libtracker-data and should
   be moved there.

   The tracker_is_{empty|blank}_string() API is used by
   tracker-extract, libtracker-extract and nautilus plugin.

   The tracker_seconds*() API is used by libtracker-miner,
   tracker-control and tracker-extract modules and arguably should be
   with the tracker-date-time.[ch] code.


Logical groupings
-----------------
In terms of logical components in libtracker-common, I see the following areas:

A. We have basic or fundamental type functions like string list to list, date conversions and so on.

B. We have indexing specific APIs for stemming, stop words, locale handling, etc. which should be grouped together ideally.

C. We have file system related APIs which are not specific to the tracker-miner-fs because they're needed by tracker-extract, extractors and even libtracker-data in many cases. So more or less a fundamental file system function API - some of which can be stripped back.

D. We have system and performance related APIs (like ioprio, sched_idle, etc). This includes config handling, logging and dbus helper APIs.


Possible solutions
------------------
1. We just make libtracker-common public, perhaps give it a better name (like libtracker-core?) and be done with it. We can then use it anywhere. Advantages: it's quick and easy. Disadvantages: the API has to be stable.

2. We split libtracker-common up and move code to more specific areas, e.g. fundamental type functions (A) go into libtracker-data, etc. Advantages: code is more logically placed. Disadvantages: linking to larger libraries for small API calls might not be so attractive.

3. We create new, smaller libraries for the logical parts A to D. Advantages: smaller, concise areas to group API. Disadvantages: slightly larger maintenance and distribution burden in the beginning.

4. We copy the code to the modules it's needed in. Advantages: fewer libraries to link with. Disadvantages: potentially a larger footprint with the same API copied around.


Thoughts?

--
Regards,
Martyn

Founder & Director @ Lanedo GmbH.
http://www.linkedin.com/in/martynrussell

