Re: [Tracker] tracker-indexer does not index all files



Jamie McCracken wrote:
Another potential crasher - unlike trunk, get_file_content does no utf-8
validation, and if the file is bigger than MAX_TEXT it cuts it off, which
is likely to not land on a valid utf-8 character boundary

This is true.

ideally do what trunk does and read file line by line so that we will
never have a partial utf-8 fragment and the resulting text can be
validated and converted from locale to utf-8 if necessary

I don't think reading line by line is a good idea at all.
All we need to do is use g_utf8_validate () on the length we read, find
out where the valid data ends, and make sure we don't stop halfway
through a UTF-8 character.
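
To illustrate the boundary check, here is a minimal, GLib-free sketch. The
helper name utf8_safe_length is hypothetical and is not part of tracker. It
only looks at the tail of the buffer to see whether the cut landed inside a
multi-byte sequence; it does not validate the rest of the text. With GLib,
the "end" out-parameter of g_utf8_validate () should give an equivalent
position when the only problem is a truncated final character.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from tracker): given a buffer that was cut off
 * at `len` bytes, return the largest length <= len that does not end in
 * the middle of a multi-byte UTF-8 sequence.
 * Only the tail is inspected; the buffer as a whole is NOT validated.
 */
size_t utf8_safe_length(const unsigned char *buf, size_t len)
{
    size_t i = len;
    size_t need, have;
    unsigned char lead;

    /* Step back over at most 3 trailing continuation bytes (10xxxxxx). */
    while (i > 0 && len - i < 3 && (buf[i - 1] & 0xC0) == 0x80)
        i--;

    if (i == 0)
        return len; /* no lead byte in range; leave it to full validation */

    /* buf[i - 1] is the candidate lead byte of the last sequence. */
    lead = buf[i - 1];
    if (lead < 0x80)
        need = 1;                      /* ASCII */
    else if ((lead & 0xE0) == 0xC0)
        need = 2;                      /* 110xxxxx */
    else if ((lead & 0xF0) == 0xE0)
        need = 3;                      /* 1110xxxx */
    else if ((lead & 0xF8) == 0xF0)
        need = 4;                      /* 11110xxx */
    else
        return len; /* invalid byte, not a truncation problem */

    /* Bytes actually present for the last sequence, lead byte included. */
    have = len - (i - 1);

    /* Drop the last sequence only if it is incomplete. */
    return have < need ? i - 1 : len;
}
```

For example, cutting the three bytes "a", 0xC3, 0xA9 ("aé") after two bytes
gives a safe length of 1, so the dangling 0xC3 is dropped. The buffer would
still need a full validation pass afterwards.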

This needs to be fixed prior to merge!

This should be a 2 minute job.

-- 
Regards,
Martyn


