Re: UTF-8: Case mapping
- From: Raymond Wan <rwan cs mu oz au>
- To: Pablo Saratxaga <pablo mandrakesoft com>
- Cc: gtk-i18n-list gnome org
- Subject: Re: UTF-8: Case mapping
- Date: Thu, 28 Jun 2001 22:21:34 +1000 (EST)
Hi,
On Thu, 28 Jun 2001, Pablo Saratxaga wrote:
> > I agree with most of what you said, but I can see a practical reason why
> > your last point is a poor solution. Most people only know one or two
> > languages. We lack the skills needed to build any generally meaningful
> I wonder... the GNU libc has very complex and comprehensive per language
> (per locale even) sorting rules; and, from the source files at least, it seems
> it is possible to find the base letter of a given char in case it has
> accents, or know if it is an upper or lower case, or which script it is
> from. So, I don't understand very much the reason of this thread; what is
> exactly the problem?
I don't know the GNU libc very well, so I'm just wondering: would
libc be able to handle characters encoded with a variable number of bytes?
For example, I presume Japanese and Chinese C libraries would sort assuming
the input was two-byte strings (S-JIS for Japanese and Big5 for Chinese).
[I honestly don't know what would happen if you threw ASCII characters
into a Japanese / Chinese text file, though I know there is a 2-byte and a
1-byte version of the letter "a".]
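To make the 1-byte versus 2-byte point concrete, here is a minimal sketch in Python (not libc) that counts how many bytes each character takes in Shift-JIS and in UTF-8. The helper name byte_lengths and the sample string are made up for illustration; only the standard codecs "shift_jis" and "utf-8" are assumed.

```python
# Sketch: how many bytes each character occupies in a given encoding.
# byte_lengths is a hypothetical helper, not part of any library.

HALF_WIDTH_A = "a"        # ASCII letter a
FULL_WIDTH_A = "\uff41"   # full-width letter a (U+FF41)
HIRAGANA_A = "\u3042"     # hiragana a (U+3042)


def byte_lengths(text, encoding):
    """Return the encoded length in bytes of each character in text."""
    return [len(ch.encode(encoding)) for ch in text]


# A string mixing an ASCII character with two Japanese-set characters.
mixed = HALF_WIDTH_A + FULL_WIDTH_A + HIRAGANA_A

sjis_lengths = byte_lengths(mixed, "shift_jis")
utf8_lengths = byte_lengths(mixed, "utf-8")

print("Shift-JIS:", sjis_lengths)
print("UTF-8:    ", utf8_lengths)
```

So a Shift-JIS file can already mix 1-byte and 2-byte characters, and UTF-8 mixes lengths in the same way, which means any sorting routine has to decode characters first rather than assume a fixed width.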
I partly agree with your message, and for the most part I also
don't understand the problem at hand very well. As UTF-8 is not yet
widely used and will be used more often in the next few years, it's hard to
predict what a typical user's needs will be.
I think having some basic sorting for 2.0 (i.e., primitives for
users to build on) and waiting until UTF-8 catches on to see what is
popular sounds like one valid idea to me...
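As a rough idea of what such a primitive could look like, here is a sketch in Python rather than C. It is not the GTK+ 2.0 API; the function name sort_strings and the word list are assumptions made up for this example. It sorts by plain code point order and lets the caller pass a key function for anything smarter.

```python
# Sketch: a basic sorting primitive that callers can build on.
# sort_strings is a hypothetical name, not an existing library function.

def sort_strings(strings, key=None):
    """Sort by Unicode code point, or by a caller-supplied key function."""
    return sorted(strings, key=key)


words = ["zebra", "apple", "Banana", "\u00e9clair"]  # last word starts with e-acute

# Plain code point order: uppercase sorts before lowercase, and the
# accented letter sorts after "z".
by_code_point = sort_strings(words)

# A caller layering simple case-insensitive ordering on top.
case_insensitive = sort_strings(words, key=str.casefold)

# Comparing the UTF-8 bytes gives the same order as comparing code points.
by_utf8_bytes = sort_strings(words, key=lambda s: s.encode("utf-8"))

print(by_code_point)
print(case_insensitive)
```

The accented word landing after "zebra" in both results is the kind of case that per-language rules would have to handle later on.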
Ray