[Tracker] Script for media library generation



Hi list,

I'm currently working on a script which I think some of you may find
interesting. The script generates media files with varying degrees of
metadata fields filled in. The background to this script is that I
found it difficult to build larger media databases with media gathered
from the web. There are some repositories with media files released
under permissive licenses, but:

1. Downloading large portions of these repositories abuses their
bandwidth.
2. The files available online are often missing many metadata fields.
While this gives a realistic view of the metadata in actual media
files, it makes it difficult to gather files with "perfect metadata"
from the web.
3. Even if a suitable repository that permits you to use its bandwidth
is found, the actual transfer of the files is likely to take a long
time.
4. Sharing the database you've built with someone else means the files
must be transferred to all parties.

OK, so I think I have pitched the problem now. What I have done is to
combine media encoders (LAME and ImageMagick) and metadata tagging
software (id3v2 and exiftool) with Python's random number generator.
By using the random numbers generated by Python as input to these
tools, random media files can be created, and a run can be reproduced
by reusing the seed for the PRNG.
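
To give a rough idea of the approach, here is a minimal sketch and not
the actual code from the script. The word list, the function names,
the file naming and the fill probability are all made up for
illustration. It only builds id3v2 argument lists and does not run
them, so it works without the tools installed.

```python
import random

# Hypothetical word list; the real script may pick its values differently.
WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]

# id3v2 long options for the fields used in this sketch.
ID3V2_FLAGS = {
    "artist": "--artist",
    "album": "--album",
    "song": "--song",
    "year": "--year",
}


def random_tags(rng, fill_probability=0.7):
    """Return a dict of tags; each field is filled in with the given probability."""
    candidates = {
        "artist": lambda: rng.choice(WORDS).title(),
        "album": lambda: rng.choice(WORDS).title(),
        "song": lambda: " ".join(rng.sample(WORDS, 2)).title(),
        "year": lambda: str(rng.randint(1950, 2012)),
    }
    tags = {}
    for field, make_value in candidates.items():
        # Leaving some fields out gives files with varying metadata coverage.
        if rng.random() < fill_probability:
            tags[field] = make_value()
    return tags


def id3v2_command(tags, path):
    """Build the id3v2 argument list that would tag the file at path."""
    args = ["id3v2"]
    for field, value in tags.items():
        args += [ID3V2_FLAGS[field], value]
    args.append(path)
    return args


def build_commands(seed, count):
    """Build tagging commands for count files; the same seed gives the same list."""
    rng = random.Random(seed)  # private PRNG, so results depend only on the seed
    return [
        id3v2_command(random_tags(rng), "track%04d.mp3" % i)
        for i in range(count)
    ]


if __name__ == "__main__":
    for command in build_commands(seed=42, count=3):
        print(" ".join(command))
```

Since everything is derived from the seed, two people can end up with
the same set of tags by exchanging only the seed and the file count.
Each argument list could then be handed to subprocess.run() once the
MP3 files themselves have been encoded.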

I'm using this to create large numbers of media files to test Tracker
extractor modules on, and it works pretty well. So far I can generate
MP3, PNG, JPG, TIF, and GIF.

If anyone wants to have a look, I've put the script here:
https://github.com/Pelagicore/mlg
I've also put a sample run here:
https://github.com/Pelagicore/mlg/wiki/Sample-run

I should say that the script is far from finished, and probably pretty
buggy. Use with care :)

-- 
Regards,
Jonatan Pålsson

Pelagicore AB
Ekelundsgatan 4, 6th floor, SE-411 18 Gothenburg, Sweden

