[Banshee-List] GSoC idea, finding duplicates



Hi everyone, first post here on the Banshee list and I'm looking for
some opinions and/or advice.

I've never done any work on a real project before, but now that I'm
taking Software Engineering at school and I've learned more about
developing in large projects, I feel it's time to finally start
contributing to Open Source.  I've been looking at some of the ideas
for GSoC, but I haven't seen something that really attracts my
attention.  However, there's a feature that I've been wanting out of
my media player for a long time.  I don't know if this kind of feature
has been discussed before, so I'd like to hear the Banshee developers'
take on this.

I have a fairly large music collection of about 22,000 songs, and for
many artists I have LPs, EPs, and Greatest Hits albums.  This causes a
lot of duplicate songs in my library, and with so many songs it's very
hard to check manually.  My idea is to implement a scan for duplicate
music files in the library.  Now, this can be done
in many different ways.  The simplest, and fastest, is by checking
things such as artist name, song title, and song length.  Using those,
possible duplicates can be shown to the user and they can delete the
ones they don't want.
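
To make the metadata approach a bit more concrete, here is a rough
sketch in Python.  It is not Banshee code: the Track record, its field
names, and the 2-second length tolerance are all assumptions I made up
for illustration.  It groups tracks by normalized artist and title,
then chains together tracks whose lengths are close.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # Hypothetical record; field names are illustrative, not Banshee's schema.
    path: str
    artist: str
    title: str
    duration_s: float


def normalize(text: str) -> str:
    # Lowercase and collapse whitespace so "The  Beatles" matches "the beatles".
    return " ".join(text.lower().split())


def find_duplicates(tracks, tolerance_s=2.0):
    """Group tracks sharing artist and title whose lengths are within tolerance_s."""
    by_key = defaultdict(list)
    for t in tracks:
        by_key[(normalize(t.artist), normalize(t.title))].append(t)

    groups = []
    for candidates in by_key.values():
        candidates.sort(key=lambda t: t.duration_s)
        current = [candidates[0]]
        for t in candidates[1:]:
            # Compare against the previous track in sorted order (chained match).
            if t.duration_s - current[-1].duration_s <= tolerance_s:
                current.append(t)
            else:
                if len(current) > 1:
                    groups.append(current)
                current = [t]
        if len(current) > 1:
            groups.append(current)
    return groups
```

Each returned group is a list of possible duplicates that could be
shown to the user.  A live version and a studio version with the same
title would usually land in different groups because their lengths
differ, but tag typos would still be missed.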

A more accurate, but slower, way would be using audio fingerprinting.
I've been looking at Echoprint and Chromaprint (with AcoustID), which
could be possible candidates.  Since these are also integrated with
MusicBrainz, it seems like other possibilities could open up for
Banshee/MusicBrainz integration.
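
As a very rough sketch of how fingerprints could be compared once they
exist, the Python below assumes each fingerprint is already available
as a list of 32-bit integers (which is my understanding of
Chromaprint's raw format) and measures the fraction of differing bits.
The function names and the 0.15 threshold are my own placeholders, and
the sketch ignores time offsets between the two files, which a real
matcher would have to handle.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits over the overlapping part of two fingerprints."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        raise ValueError("empty fingerprint")
    # XOR each pair of 32-bit values and count the set bits (Hamming distance).
    diff = sum(bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in zip(fp_a, fp_b))
    return diff / (32 * n)


def likely_same_recording(fp_a, fp_b, threshold=0.15):
    # threshold is an arbitrary placeholder; it would need tuning on real data.
    return bit_error_rate(fp_a, fp_b) <= threshold
```

Unlike the tag-based check, this would not depend on the artist and
title tags being correct, at the cost of having to decode and
fingerprint every file first.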

So, have any of these ideas been discussed for Banshee?  Does it seem
like something that could go well with Banshee?

If it seems like a plausible project and if someone would be
interested in mentoring it, I would love to discuss it further before
starting on a formal proposal.

-- 
Diego Fernandez - 爱国

