Re: [Tracker] musepack & monkey audio files treated as text/plain



Rob Til Freedmen wrote:
trackerd treats musepack & monkey audio files as 'mime text/plain and service type 6'

From the tracker.log:
13 Dec 2006, 16:21:04:379 - Extracting Metadata for *new* file /mnt/sound/Oldies/John.Peel.Last.Show.14.10.04/John Peel Last Show 14.10.04 digital recorded (q8).mpc with mime text/plain and service type 6

13 Dec 2006, 16:21:32:906 - Please wait while data is being flushed to the inverted word index...
13 Dec 2006, 16:21:48:288 - flushing data (36255 words left) to inverted word index - please wait
...
13 Dec 2006, 16:23:46:572 - flushing data (355 words left) to inverted word index - please wait

It needs more than 2 minutes to write something like 'utf-japanese/chinese' to the database!

we normally flush when 6000 words have built up, but in your case, where a binary file is misdiagnosed as text, we can end up with a huge number of words (36255 in your log). We are not optimized for dealing with such large amounts, but then that should never occur in practice.


I haven't added the correct mime type for .mpc and .ape - still looking where to add/modify things :(

Anyway, a binary file type like musepack or monkey audio should not be treated as unicode text -
it can't really be legal utf-8!

we check the first 4kb and only treat an unidentified file as text if that first 4kb is legal utf-8.
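
As a rough illustration of that check (a Python sketch, not trackerd's actual code): the function names, the 4096-byte constant, and the tolerance for a multi-byte character cut off at the 4kb boundary are all assumptions made for the example.

```python
import codecs

SNIFF_BYTES = 4096  # the "first 4kb" mentioned above (assumed to mean 4096 bytes)


def looks_like_utf8_text(data: bytes, limit: int = SNIFF_BYTES) -> bool:
    """Return True if the first `limit` bytes of `data` decode as strict UTF-8."""
    head = data[:limit]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # If the data continues past the limit, a multi-byte character may be
        # cut in half at the boundary; final=False buffers it instead of failing.
        decoder.decode(head, final=len(data) <= limit)
    except UnicodeDecodeError:
        return False
    return True


def file_looks_like_text(path: str, limit: int = SNIFF_BYTES) -> bool:
    """Hypothetical helper: apply the check to the start of a file on disk."""
    with open(path, "rb") as fh:
        # Read one extra byte so the check can tell whether the file continues.
        return looks_like_utf8_text(fh.read(limit + 1), limit)
```

A check like this only says the bytes are valid UTF-8, not that they are meaningful text, so a file with no recognised mime type can still pass it.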


Maybe a 'file' check for the catch-all mime type could prevent this?

yeah maybe if a file is suspicious then 'file' could be used to check as a last resort.

thanks for the suggestion (I will try and squeeze it into next release)
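
A minimal sketch of what such a last-resort check might look like, in Python rather than anything from trackerd itself. The function name is hypothetical; `--brief` and `--mime-type` are options of the common `file(1)` utility, and whether it recognises a given .mpc or .ape file depends on the installed magic database.

```python
import shutil
import subprocess


def mime_from_file_command(path, file_cmd="file"):
    """Ask the external `file` utility for a mime type; return None on any failure."""
    if shutil.which(file_cmd) is None:
        return None  # utility not installed, so no second opinion is available
    try:
        result = subprocess.run(
            [file_cmd, "--brief", "--mime-type", path],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    mime = result.stdout.strip()
    # Anything that does not look like "type/subtype" (e.g. an error message)
    # is treated as "unknown".
    if "/" not in mime or " " in mime:
        return None
    return mime
```

The idea would be to call this only for files that fell through to the text/plain catch-all, so the cost of spawning a process is paid for suspicious files rather than for every file indexed.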


--
Mr Jamie McCracken
http://jamiemcc.livejournal.com/



