Re: sorting strings with non-ASCII characters



We simply use what we have from glib; we don't do our own collating.

The situation is much more complicated than you might realize (though
if you have read Knuth you have seen it all).

For example, \oe -> oe would be wrong in Danish.  You'd want something more
like \oe -> \o [the 28th letter of the alphabet].  Similarly, \ae is not
related to ae and properly sorts as the 27th letter.  It gets worse: \aa
(a with ring on top) can be written as aa, but even when it is, it sorts as
the 29th letter of the alphabet, i.e., last.  (Since Danish forms compound
nouns by juxtaposition, one could imagine two different words spelled
identically and sorting differently.  Luckily I can't think of any.)
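
To make the Danish ordering concrete, here is a minimal sketch in Python.
It is only an illustration, not what glib does.  It assumes \ae, \o and \aa
stand for æ, ø and å; danish_key and naive_key are hypothetical helper
names; and the blanket "aa" -> å substitution is a simplifying assumption
that would misfire on a compound where two a's happen to meet.

```python
# Danish alphabet: the 26 basic letters, then æ (27th), ø (28th), å (29th).
DANISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(DANISH_ALPHABET, start=1)}


def danish_key(word):
    """Sort key following the Danish alphabet (hypothetical helper)."""
    # Assumption: every "aa" digraph is an alternative spelling of å.
    s = word.lower().replace("aa", "å")
    # Characters outside the alphabet get rank 0 and therefore sort first.
    return [RANK.get(ch, 0) for ch in s]


def naive_key(word):
    """Sort key that folds the extra letters to ASCII look-alikes."""
    return word.lower().replace("æ", "ae").replace("ø", "o").replace("å", "a")


words = ["ære", "zebra", "aarhus", "øl", "abe", "ål"]
danish_sorted = sorted(words, key=danish_key)
naive_sorted = sorted(words, key=naive_key)

print(danish_sorted)
print(naive_sorted)
```

With this word list the Danish key gives abe, zebra, ære, øl, ål, aarhus,
while the folded key gives aarhus, abe, ære, ål, øl, zebra.  The same six
words come out in two quite different orders depending on whose rules you
pick.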

Note that \ae is also used for Latin, and when it is, it works like ae.

What this boils down to is: if you try to fix it for one person, you are
likely to utterly break it for someone else.  I doubt that we really want to
get into this mess ourselves, so you probably need to go to the glib people.

Morten


