Re: [Tracker] nie:plainTextContent, Unicode normalization and Word breaks

From: Jamie McCracken <jamie mccrack googlemail com>
To: Martyn Russell <martyn lanedo com>
Cc: "Tracker \(devel\)" <tracker-list gnome org>
Subject: Re: [Tracker] nie:plainTextContent, Unicode normalization and Word breaks
Date: Fri, 23 Apr 2010 09:32:35 -0400

On Fri, 2010-04-23 at 09:17 +0100, Martyn Russell wrote:

> Thanks Aleksander.
>
> I think it makes sense to fix this. Just to be clear, does this mean we
> don't need Pango in libtracker-fts/tracker-parser.c to determine word
> breaks for CJK?


That's not broken, so I would not recommend trying to "fix" it.

IMHO, tracker_text_normalize() in the extractor should just do UTF-8
validation. It should not attempt word breaking, as that's CPU-expensive
and is already being done by the parser.
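
To illustrate what "just do UTF-8 validation" could look like, here is a
minimal sketch in plain C. It is not Tracker's actual code:
utf8_valid_prefix_len() is a hypothetical helper name, and the policy of
keeping only the longest valid prefix is an assumption. It does no word
breaking at all, so that work stays in the parser. GLib-based code can
usually get a similar result from g_utf8_validate(), which reports the end
of the valid range through its "end" out-parameter.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Tracker: returns the length in bytes
 * of the longest valid UTF-8 prefix of buf[0..len). The caller can then
 * truncate the extracted text at that offset. */
size_t utf8_valid_prefix_len(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t n;           /* continuation bytes expected after c */
        size_t k;
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal for this length */

        if (c < 0x80) {     /* plain ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            break;          /* stray continuation or invalid lead byte */
        }

        if (len - i <= n)
            break;          /* sequence is cut off at end of buffer */

        for (k = 1; k <= n; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                break;      /* not a continuation byte */
            cp = (cp << 6) | (unsigned long)(buf[i + k] & 0x3F);
        }
        if (k <= n)
            break;

        /* Reject overlong forms, surrogates, and values past U+10FFFF. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            break;

        i += n + 1;
    }

    return i;
}
```

This is a single linear pass over the bytes, which is the point of the
suggestion above: validation is cheap, while word breaking is left to the
one place that already does it.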

jamie

Follow-Ups:
- Re: [Tracker] nie:plainTextContent, Unicode normalization and Word breaks
  - From: Aleksander Morgado

References:
- [Tracker] nie:plainTextContent, Unicode normalization and Word breaks
  - From: Aleksander Morgado
- Re: [Tracker] nie:plainTextContent, Unicode normalization and Word breaks
  - From: Martyn Russell
