Re: Hyphenation status

From: Damon Chaplin <damon kendo fsnet co uk>
To: Owen Taylor <otaylor redhat com>
Cc: gtk-i18n-list gnome org
Subject: Re: Hyphenation status
Date: 26 Nov 2002 23:37:59 +0000

On Mon, 2002-11-25 at 14:59, Owen Taylor wrote:

> If you have ideas about how to write a fast normalization function,
> they should be applied to g_utf8_normalize() 

I've rewritten g_utf8_normalize_wc() so you pass in a buffer and a size,
which avoids the need to do the decomposition step twice. Also, by using
this function directly I can avoid the conversion back to UTF-8 in
g_utf8_normalize(). (I need gunichar values anyway.)
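
To illustrate the buffer-passing idea, here is a minimal sketch. It is not the actual GLib code: the function name sketch_decompose_wc(), the three-entry table and the plain uint32_t type (standing in for gunichar) are all assumptions made for the example, and it only decomposes, without the canonical reordering a full normalizer needs. The point is that the caller supplies the output buffer and its size, so the text is decomposed in a single pass instead of once to measure and once to fill.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature decomposition table. A real table covers all
 * of Unicode; these three entries are only for illustration. */
typedef struct {
    uint32_t ch;        /* precomposed code point */
    uint32_t decomp[2]; /* its canonical decomposition */
} DecompEntry;

static const DecompEntry decomp_table[] = {
    { 0x00C5, { 0x0041, 0x030A } }, /* A with ring above  -> A + U+030A */
    { 0x00E8, { 0x0065, 0x0300 } }, /* e with grave       -> e + U+0300 */
    { 0x00E9, { 0x0065, 0x0301 } }, /* e with acute       -> e + U+0301 */
};

/* Decompose in[0..in_len) into the caller-supplied buffer out, which
 * has room for out_size code points. Returns the number of code points
 * the complete result needs. If that is larger than out_size, the
 * output was truncated and the caller can retry with a bigger buffer. */
static size_t
sketch_decompose_wc(const uint32_t *in, size_t in_len,
                    uint32_t *out, size_t out_size)
{
    size_t n = 0;

    assert(in != NULL || in_len == 0);
    assert(out != NULL || out_size == 0);

    for (size_t i = 0; i < in_len; i++) {
        const DecompEntry *entry = NULL;

        /* ASCII never decomposes, so skip the table lookup entirely. */
        if (in[i] >= 0x80) {
            size_t count = sizeof decomp_table / sizeof decomp_table[0];
            for (size_t j = 0; j < count; j++) {
                if (decomp_table[j].ch == in[i]) {
                    entry = &decomp_table[j];
                    break;
                }
            }
        }

        if (entry != NULL) {
            for (size_t k = 0; k < 2; k++) {
                if (n < out_size)
                    out[n] = entry->decomp[k];
                n++;
            }
        } else {
            if (n < out_size)
                out[n] = in[i];
            n++;
        }
    }
    return n;
}
```

A caller would typically pass a fixed-size buffer on the stack that is large enough for most words, and only fall back to a larger allocation when the return value exceeds the buffer size.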

Together, these two changes double the speed of normalization, so I'm
reasonably happy with the performance. Hyphenation runs at about 340,000
words a second, compared with 650,000 without normalization. (But this is
when normalizing ASCII, which is a no-op. It will be slower for other
languages.)

I've also added code to do the reverse mappings, so I think it handles
normalization now. (I need to test this though.)

Damon

References:
- Hyphenation status
  - From: Damon Chaplin
- Re: Hyphenation status
  - From: Owen Taylor
