Re: Unicode and C++



Nathan Myers wrote:

> Manipulating UTF-8 in memory is pathetic.  UTF-8 is compact and
> convenient as a network and file format representation, but it sucks
> rocks for string manipulations, or in general for in-memory operations.
> Things that are naturally O(1) become O(n) for no reason better than
> sheer obstinacy and stubbornness.
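
To make the O(1) versus O(n) point concrete, here is a minimal C++ sketch of
finding the n-th character in each representation. The function names
utf8_offset and utf32_at are made up for illustration, and the UTF-8 version
assumes the input is already well-formed (it does no validation).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: byte offset of the n-th code point (0-based) in a
// UTF-8 string, or std::string::npos if there are fewer than n+1 code points.
// Assumes well-formed UTF-8. Cost is O(n): every byte before the target
// has to be inspected.
std::size_t utf8_offset(const std::string& s, std::size_t n) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Continuation bytes have the form 10xxxxxx; any other byte
        // starts a new code point.
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
            if (seen == n) return i;
            ++seen;
        }
    }
    return std::string::npos;
}

// Hypothetical helper: with one fixed-width unit per code point the same
// lookup is a plain array index, O(1).
char32_t utf32_at(const std::u32string& s, std::size_t n) {
    assert(n < s.size());
    return s[n];
}
```

The UTF-8 version cannot jump straight to position n because the width of
each earlier character is only known after looking at it.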

The moment someone suggests UTF-8 is good you can tell their mother tongue is
English. That said, I think you are banging your head against a brick wall if
you think you can do anything about the growing tide of UTF-8. It sucks, but
live with it. To do otherwise will just lead to further fragmentation.

In fact, I do see a brighter side to the growth of UTF-8. It may cause bloat in
any language other than English, but Unicode is currently heading in a more
bizarre direction - prefix codes to extend beyond 64K characters without going
to 32-bit characters. That gives you the maximum messiness in every direction.
If UTF-8 becomes more widespread in a timely manner it might stop this
insanity. Unicode was a fine opportunity to produce a clean, legacy-free
character set, but it has been turned into a worse hotch-potch than anything
which preceded it.
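
For illustration, here is a minimal C++ sketch of one such prefix-code scheme.
It assumes the scheme meant is the one standardised as UTF-16 surrogate pairs;
the message above does not name it, so treat that as an assumption. The
function name to_utf16 is made up for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: encode one code point as 16-bit units, assuming the
// UTF-16 surrogate-pair scheme. Code points below 0x10000 take one unit;
// anything above takes two, so the encoding is no longer fixed-width.
std::vector<char16_t> to_utf16(char32_t cp) {
    // Valid input: at most 0x10FFFF and not itself in the surrogate range.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        return { static_cast<char16_t>(cp) };
    }
    cp -= 0x10000;  // leaves a 20-bit value
    return {
        static_cast<char16_t>(0xD800 + (cp >> 10)),    // high (prefix) unit: top 10 bits
        static_cast<char16_t>(0xDC00 + (cp & 0x3FF))   // low unit: bottom 10 bits
    };
}
```

Once some characters need two units, indexing by character has the same
linear-scan problem as UTF-8, while most text still pays for 16 bits per
character.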

Let's just get through the pain of converting the whole world to UTF-8, and at
least get down to just one crappy coding scheme.

Steve





