Re: [Tracker] PATCH - Refactoring albumart fetching



On Mon, 2009-04-13 at 18:08 +0200, Carlos Garnacho wrote:

I've just read the patch, here are a few comments:

- tracker_albumart_copy_to_local_if(): The !HAVE_HAL case seems
confusing, mounts in /mnt and /media could not belong to removable
devices at all. Also, having different memory management for
removable_roots items depending that condition looks error prone for
modifications in the future.

This isn't being changed by this patch. Perhaps caused by a lot of
changes in the files is the diff including the function again.

- In that same function, you're running strlen() through the 2 strings
about to be compared, something that strncmp wouldn't need to do, so it
looks a bit superfluous to me.

This isn't being changed by this patch. Perhaps caused by a lot of
changes in the files is the diff including the function again.

- local_uri = g_strdup_printf ("%s/.mediaartlocal", local_dir); Please
use g_file_get_child() there. There are several examples of this all
along the code.

Right, fixed

- the "local_if" parts in function names are sort of meaningless, same
for the DBus protocol.

You have any proposals for the names?

- Is there any need to repeat the strcasestr() function prototype? This
should be already provided by string.h

No, if you #define GNU_SOURCE. I have fixed this.

- in on_queue_timeout(), why is there a need to check whether the source
was already destroyed? are we multithreaded there?

Programming practice. I try to always do this as it's the right way to
do a callback if your GSource can be g_source_remove()-ed.

- tracker_extract_emit_scan_for_albumart() and
tracker_extract_emit_copy_to_local_if(): It feels quite weird not having
the object as a parameter here, even if it's a singleton. Same goes for
signal slots in TrackerExtractClass, gobject quite expects the first
parameter to be the object there.

I have fixed the TrackerExtractClass prototype as discussed on IRC.

- I'm not sure I really like having tracker-indexer to be exposed to
this kind of "implementation details". It has a generic conception of
files and files as containers, but it must know about several files in a
"supercontainer" to treat them as an album.

This is implemented by simply having a "30 seconds for collecting" time:
The assumption is made that a request for a heuristic scan for all music
inside an album is made within 30 seconds. If not, it's as far as
Tracker is concerned an all-new heuristics scan request.

This trick means that Tracker's indexer needs no notion of "a generic
conception of files and files as containers, several files in a super
container, etc etc".

In general, I'm pretty hesitant to commit this to 0.6.x, it's a fairly
big change, and could bring stability and behavior changes just for
improving performance a bit.

Problem is that the performance issue is a top-1 bug that we have at
this moment. Mikael told us last week that the heuristic scanner is a
top contributor for the performance problems.

This patch will I believe fix it.


On lun, 2009-04-13 at 14:49 +0200, Philip Van Hoof wrote:
Changes since last patch:

  o. Don't request a heuristic scan if the media file has no album id3
     tag.

On Mon, 2009-04-13 at 14:30 +0200, Philip Van Hoof wrote:
I continued working on this today, so here's a better version.

Changes since last patch:

  o. Requests to copy-to-local are now also handled by the indexer,
     instead of by the extractor
  o. Heuristic scanner finds local media art first, and copies from it
     if found as first heuristic match
  o. Proper check in configure.ac for strcasestr 
  o. There were linking problems in the test-tools for tracker-extract,
     I fixed them in this patch (by providing dummy functions)
  o. Restored all necessary locations for requesting thumbnails of the
     newly created album art
  o. Eliminated the need for TrackerHal in tracker-extract (which will
     make Ivan as happy as a little boy who has just found Easter eggs),
     removed the construction of it from the main function and from the
     tests for tracker-extract.
  o. Cleaned up use of TrackerHal in the code that now runs in the
     indexer, it all uses hal from private->hal now instead. Made it
     cope with not HAVE_HAL too.

I'll notify you tomorrow about it if I change more things today, Martyn.
But feel free to start reviewing this one already.


On Fri, 2009-04-10 at 17:40 +0200, Philip Van Hoof wrote:
Hackers!

Currently Tracker's media file extractor will on top of trying to
extract embedded media art from files, perform a heuristic on the
directory of the media file.

The reason why it does this heuristic is to find album art related to
the media file.

Usually are media files grouped together in a directory, though. For
example like this:

/location/Freaky Freak/The greatest hits/Freaky-Freak_-_The-greatest-hits_-_I-can-sing.mp3
/location/Freaky Freak/The greatest hits/Freaky-Freak_-_The-greatest-hits_-_You-can-sing.mp3
/location/Freaky Freak/The greatest hits/Freaky-Freak_-_The-greatest-hits_-_We-can-sing.mp3
/location/Freaky Freak/The greatest hits/Freaky-Freak_-_The-greatest-hits_-_They-can-sing.mp3

Or whatever filenaming format the w3ird peoples on thiz planetz have
used for their albums, songs, media files and etcetera.

Right now will the extractor foreach file in such a directory repeat the
complete heuristic.

With this patch we'll throw the request for a heuristic back to the
indexer. The indexer has a queue where it also merges the requests
together. In the patch the indexer does this by forming a key using the
dirname, album and artist.

Which means that if a request for a heuristic is made for ten files sang
by the same artist, grouped into the same album, stored in the same
directory, which is kinda common by the way, I'm again making my
sentences too long, making it hard for people to read them, then the
indexer will simply merge those ten together and perform just one
heuristic instead. Saving precious computar cycles.

It's quite a refactor, and this close to a next 0.6.x release it would I
think be good if a few extra couple of eyes would review and test the
code before I push it.

Please review



-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be

Attachment: albumart-refactor4.diff
Description: Text Data



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]