Re: [Tracker] nie:plainTextContent, Unicode normalization and Word breaks

On Fri, 2010-04-23 at 09:17 +0100, Martyn Russell wrote:

> Thanks Aleksander.
>
> I think it makes sense to fix this. Just to be clear, does this mean we
> don't need Pango in libtracker-fts/tracker-parser.c to determine word
> breaks for CJK?

That's not broken, so I would not recommend trying to "fix" it.

IMHO, tracker_text_normalize() in the extractor should just do UTF-8
validation. It should not attempt word breaking, as that's CPU-expensive
and is already being done by the parser.
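
To make the "validation only" idea concrete, here is a rough sketch of
what such a step could look like. It is not the actual Tracker code: the
function name valid_utf8_prefix_len() is hypothetical, and in the
extractor itself GLib's g_utf8_validate() would probably be the natural
choice. The point is only that the check is a single linear pass over
the bytes, with no word breaking and no normalization.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of Tracker): returns the length in bytes
 * of the longest valid UTF-8 prefix of text[0..len). The caller can then
 * keep only that prefix. No word breaking is attempted here. */
size_t
valid_utf8_prefix_len (const unsigned char *text, size_t len)
{
  size_t i = 0;

  assert (text != NULL || len == 0);

  while (i < len) {
    unsigned char c = text[i];
    size_t need = 0;        /* continuation bytes expected after c */
    unsigned long cp = 0;   /* code point decoded so far */
    unsigned long min = 0;  /* smallest code point valid for this length */
    size_t k;

    if (c < 0x80) {         /* plain ASCII byte */
      i++;
      continue;
    } else if ((c & 0xE0) == 0xC0) {
      need = 1; cp = c & 0x1F; min = 0x80;
    } else if ((c & 0xF0) == 0xE0) {
      need = 2; cp = c & 0x0F; min = 0x800;
    } else if ((c & 0xF8) == 0xF0) {
      need = 3; cp = c & 0x07; min = 0x10000;
    } else {
      break;                /* stray continuation or invalid lead byte */
    }

    if (len - i <= need)
      break;                /* sequence is cut off at the end of input */

    for (k = 1; k <= need; k++) {
      if ((text[i + k] & 0xC0) != 0x80)
        break;              /* not a continuation byte */
      cp = (cp << 6) | (text[i + k] & 0x3F);
    }
    if (k <= need)
      break;

    /* Reject overlong encodings, surrogates and out-of-range values. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
      break;

    i += need + 1;
  }

  return i;
}
```

With something like this, the extractor would store only the first
valid_utf8_prefix_len() bytes of the extracted text and leave word
breaking (including the CJK case) to the parser.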


