Re: unicode sorting algorithm?
- From: Owen Taylor <otaylor redhat com>
- To: Petr Tomasek <tomasek etf cuni cz>
- Cc: gtk-i18n-list gnome org
- Subject: Re: unicode sorting algorithm?
- Date: 21 Jun 2001 15:50:00 -0400
Petr Tomasek <tomasek etf cuni cz> writes:
> Hello!
>
> I'm just curious whether there exists a fully international Unicode
> sorting algorithm. I need to make a database of my bibliography, which
> includes books in Czech, English, German, Modern Hebrew and Arabic. Up to
> now I don't have them stored on a computer due to the lack of
> international support.
>
> Is there any standard way to do sorting on multilingual Unicode text?
See http://www.unicode.org/unicode/reports/tr10/
The algorithm there, when applied using the default properties without
tailoring, provides _a_ sorting order for all Unicode strings. Since
different languages have different, and incompatible, conventions for
ordering, such an order is, at best, a compromise.
Also, for East-Asian text, conventional sorting is often by
pronunciation, which this algorithm makes no attempt to do. (For
Japanese, determining pronunciation from the written form of a word is
not always easy even for a native-speaking human.)
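To make the "incompatible conventions" point concrete, here is a toy
sketch in Python. It is NOT the TR10 algorithm; it only contrasts plain
code point order with a made-up, Czech-like rule in which the digraph
"ch" counts as one letter sorting after "h". The function names and the
word list are hypothetical, and accented letters are ignored entirely.

```python
# Toy illustration only: not the TR10 algorithm, just a demonstration
# that two reasonable orderings can disagree on the same word list.

def codepoint_key(word):
    """Sort key that compares strings by raw code point values."""
    return [ord(ch) for ch in word]

def czech_like_key(word):
    """Toy key: treat the digraph 'ch' as one letter sorting after 'h'."""
    key = []
    lower = word.lower()
    i = 0
    while i < len(lower):
        if lower.startswith("ch", i):
            # Place 'ch' just after 'h' and before 'i'.
            key.append(ord("h") + 0.5)
            i += 2
        else:
            key.append(ord(lower[i]))
            i += 1
    return key

words = ["chata", "cihla", "hrad", "cesta"]
by_codepoint = sorted(words, key=codepoint_key)
by_czech = sorted(words, key=czech_like_key)
```

The two results put "chata" in different places, so no single untailored
order can be right for both conventions at once.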
> More specifically: how will sorting be solved in the gtk-2/gnome-2 platform?
> Or should each developer write his own sorting routines?
We'll probably have some simple hack that will work well if:
a) your platform has good Unicode sorting support
b) you are running in a UTF-8 locale
and more or less well otherwise. See:
http://bugzilla.gnome.org/show_bug.cgi?id=55836
http://bugzilla.gnome.org/show_bug.cgi?id=55852
It's definitely going to be an area for future improvement past GLib-2.0.
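For a rough idea of what "rely on the platform's collation" can look
like, here is a minimal sketch in Python rather than GLib. It is not the
GLib code; the helper name locale_sorted is hypothetical, and it simply
hands the comparison to the C library via locale.strxfrm, falling back to
code point order when the requested locale is not installed.

```python
import locale

def locale_sorted(words, locale_name=""):
    """Sort words using the platform's collation for locale_name.

    An empty locale_name means "whatever the environment specifies".
    Falls back to plain code point order if the locale is unavailable.
    """
    # Remember the current collation locale so it can be restored.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        # Locale not installed: the "more or less well" case.
        return sorted(words)
    try:
        # strxfrm turns each string into a key that compares the way
        # the platform's strcoll would compare the originals.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)
```

A call such as locale_sorted(titles, "cs_CZ.UTF-8") would use Czech
rules, assuming that locale exists on the machine; the quality of the
result depends entirely on the platform, which is exactly the limitation
described above.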
Regards,
Owen