Re: [Tracker] musepack & monkey audio files treated as text/plain
- From: Jamie McCracken <jamiemcc blueyonder co uk>
- To: Rob Til Freedmen <rob til freedman googlemail com>
- Cc: tracker-list gnome org
- Subject: Re: [Tracker] musepack & monkey audio files treated as text/plain
- Date: Wed, 13 Dec 2006 21:38:56 +0000
Rob Til Freedmen wrote:
trackerd treats musepack & monkey audio files as 'mime text/plain and
service type 6'
From the tracker.log:
13 Dec 2006, 16:21:04:379 - Extracting Metadata for *new* file
/mnt/sound/Oldies/John.Peel.Last.Show.14.10.04/John Peel Last Show
14.10.04 digital recorded (q8).mpc with mime text/plain and service type 6
13 Dec 2006, 16:21:32:906 - Please wait while data is being flushed to
the inverted word index...
13 Dec 2006, 16:21:48:288 - flushing data (36255 words left) to inverted
word index - please wait
...
13 Dec 2006, 16:23:46:572 - flushing data (355 words left) to inverted
word index - please wait
It takes more than 2 minutes to write what looks like
'utf-japanese/chinese' text to the database!
We normally flush when 6000 words have built up, but in your case, when a
binary file is misdiagnosed as text, we can end up with huge numbers of
words (36255). We are not optimized for dealing with such large amounts,
but then that should never occur in practice.
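
As a rough illustration of the flush threshold described above, here is a
minimal Python sketch. It is not trackerd's actual code: the WordBuffer
class, the add_file_words method and the assumption that the threshold is
only checked after a whole file has been added are all hypothetical. Under
that assumption, a single misdiagnosed file can push one flush far beyond
6000 words.

```python
from typing import Callable, List

FLUSH_THRESHOLD = 6000  # figure quoted in the message


class WordBuffer:
    """Hypothetical in-memory word buffer; not trackerd's real data structure."""

    def __init__(self, flush: Callable[[List[str]], None],
                 threshold: int = FLUSH_THRESHOLD) -> None:
        self._flush = flush
        self._threshold = threshold
        self._pending: List[str] = []

    def add_file_words(self, words: List[str]) -> None:
        # Assumption: the threshold is checked once per file, not once per word,
        # so one huge file lands in a single oversized flush.
        self._pending.extend(words)
        if len(self._pending) >= self._threshold:
            self._flush(self._pending)
            self._pending = []

    @property
    def pending(self) -> int:
        return len(self._pending)


batches: List[int] = []
buf = WordBuffer(flush=lambda words: batches.append(len(words)))
buf.add_file_words(["word"] * 100)     # an ordinary small text file
buf.add_file_words(["junk"] * 36255)   # a binary file misread as text
print(batches)  # [36355]
```
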
I haven't added the correct mime type for .mpc and .ape - still looking
where to add/modify things :(
Anyway, a binary file type like musepack or monkey audio should not be
treated as Unicode text -
it can't really be legal UTF!
We check the first 4 KB and only treat an unidentified file as text if
that first 4 KB is legal UTF-8.
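
A check of this kind can be sketched in Python as follows. This is only an
illustration under stated assumptions, not trackerd's implementation: the
function name head_is_valid_utf8 and the exact 4096-byte constant are
hypothetical. The incremental decoder is used so that a multi-byte
character cut in half at the 4 KB boundary is not reported as an error.

```python
import codecs

SNIFF_BYTES = 4096  # "first 4 KB" from the message; the exact constant is an assumption


def head_is_valid_utf8(data: bytes, limit: int = SNIFF_BYTES) -> bool:
    """Return True if the first `limit` bytes of `data` decode as strict UTF-8."""
    head = data[:limit]
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    try:
        # If the data was truncated, pass final=False so an incomplete
        # trailing sequence at the cut-off point is tolerated.
        decoder.decode(head, final=len(data) <= limit)
    except UnicodeDecodeError:
        return False
    return True
```

Note that such a test only looks at byte validity. Bytes like NUL are valid
UTF-8, so a binary file whose first 4 KB happens to contain only valid
sequences would still pass.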
Maybe a 'file' check for the catch-all MIME type could prevent this?
Yeah, maybe if a file is suspicious then 'file' could be used to check as
a last resort.
Thanks for the suggestion (I will try to squeeze it into the next release).
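
The last-resort idea could look roughly like the Python sketch below. The
helper names mime_from_file_command and resolve_mime are hypothetical, and
treating "text/plain" as the suspicious catch-all guess is an assumption
taken from this thread. The --brief and --mime-type options are standard
file(1) options; what 'file' reports for a given .mpc or .ape file depends
on the installed magic database.

```python
import shutil
import subprocess
from typing import Callable, Optional


def mime_from_file_command(path: str) -> Optional[str]:
    """Ask the file(1) utility for a MIME type; return None if unavailable."""
    if shutil.which("file") is None:
        return None
    result = subprocess.run(
        ["file", "--brief", "--mime-type", "--", path],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def resolve_mime(path: str, guessed: str,
                 detector: Callable[[str], Optional[str]] = mime_from_file_command) -> str:
    """Keep a confident guess; consult the detector only for the catch-all type."""
    if guessed != "text/plain":
        return guessed
    detected = detector(path)
    return detected or guessed
```

Only files that fall through to the catch-all type pay the cost of spawning
an external process, which matches the "last resort" wording above.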
--
Mr Jamie McCracken
http://jamiemcc.livejournal.com/