Re[2]: Unicode and C++



hashao wrote:

> The GB18030-2000 has one, two, four byte characters. Included more
> than 27,000 chars and extendable to 1,500,000 chars. It is weird a
> non-unicode standard released at this moment of time. Maybe there are
> immediate need for more characters on the current non-unicode systems
> around the country.
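
As an aside, the one/two/four-byte layout mentioned above can be sketched
in C++. This is only a rough illustration, assuming the byte ranges as I
understand them from the GB18030 layout (lead byte 0x81-0xFE; a second
byte of 0x30-0x39 signals a four-byte sequence). The function name
gb18030_seq_len is hypothetical and not from any library.

```cpp
#include <cstddef>

// Hypothetical helper: returns the length in bytes (1, 2, or 4) of the
// GB18030 sequence starting at p, or 0 if the bytes do not match one of
// the expected patterns or the buffer (n bytes) is too short.
// Assumed byte ranges:
//   1 byte : 0x00-0x7F
//   2 bytes: 0x81-0xFE, then 0x40-0x7E or 0x80-0xFE
//   4 bytes: 0x81-0xFE, 0x30-0x39, 0x81-0xFE, 0x30-0x39
std::size_t gb18030_seq_len(const unsigned char* p, std::size_t n) {
    if (n == 0) return 0;
    const unsigned char b1 = p[0];
    if (b1 <= 0x7F) return 1;               // single-byte (ASCII) range
    if (b1 < 0x81 || b1 > 0xFE) return 0;   // 0x80 and 0xFF are not lead bytes here
    if (n < 2) return 0;                    // truncated multi-byte sequence
    const unsigned char b2 = p[1];
    if ((b2 >= 0x40 && b2 <= 0x7E) || (b2 >= 0x80 && b2 <= 0xFE)) {
        return 2;                           // two-byte sequence
    }
    if (b2 >= 0x30 && b2 <= 0x39) {         // digit as second byte: four-byte form
        if (n < 4) return 0;
        if (p[2] >= 0x81 && p[2] <= 0xFE && p[3] >= 0x30 && p[3] <= 0x39) {
            return 4;
        }
    }
    return 0;                               // anything else is treated as invalid
}
```

Under these assumed ranges the four-byte form has 126 * 10 * 126 * 10 =
1,587,600 positions, which is in line with the "extendable to 1,500,000
chars" figure quoted above.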

Unicode 3 also has 27,000 Hanzi. Maybe that is no coincidence. It
wouldn't make too much sense for a Chinese standard to blindly copy the
exact Unicode 3 character set. I might at least expect them to clean up
some of the screwups in the Unicode Hanzi, like mapping some China and
Taiwan characters to different code points when they are actually the
same (i.e. the PRC never simplified those characters). Without this you
cannot reliably compare strings - a significant problem with Unicode.

Steve






