Re: UTF-8 Strings



Jeff Franks <jcf tpg com au> writes:
> I need some advice. I have been reading several message threads from the
> middle of last year between Havoc Pennington and Nathan Myers about a
> possible utf8-string class and was wondering what the end result of that
> discussion was? 

What I took from it (after fooling with various implementation
attempts) is "std::string sucks" and "there is no good way to do
this," among other lessons. ;-)

IIRC:

 - if you implement the (huge and bloated) std::string interface, 
   all uses of it will be insanely inefficient, unless you do the 
   "monster string which caches both UTF-8 and fixed-width 32-bit
   encoding" approach and then you'll be insanely bloated.
 
 - if you do your own interface, it will be nonstandard. More so
   because I'm not sure it's even possible to do the usual STL 
   container semantics with UTF-8 (you invalidate iterators at 
   odd times, and if the container appears as container<gunichar>
   then it's painful or impossible to implement operator& in 
   the way it works for string or vector)
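
To make the second point concrete, here is a minimal sketch, assuming
well-formed UTF-8 input and using char32_t as a stand-in for gunichar.
The helper names (decode_next, encode, length_in_code_points,
replace_code_point) are hypothetical, not from any library. It shows
that a code point can only be handed back by value, since no 32-bit
object lives inside the byte buffer, and that overwriting one code
point can change the byte length and so shift every later byte offset.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode the code point that starts at byte offset
// `pos` and advance `pos` past it. Assumes well-formed UTF-8; no validation.
// The result is returned by value: there is no char32_t stored in `s`
// that a reference or pointer could refer to.
inline char32_t decode_next(const std::string& s, std::size_t& pos) {
    assert(pos < s.size());
    const unsigned char lead = static_cast<unsigned char>(s[pos]);
    // Sequence length is determined by the high bits of the lead byte.
    const int len = lead < 0x80 ? 1
                  : (lead >> 5) == 0x06 ? 2
                  : (lead >> 4) == 0x0E ? 3
                  : 4;
    // Keep only the payload bits of the lead byte.
    char32_t cp = (len == 1) ? lead : (lead & (0xFF >> (len + 1)));
    for (int i = 1; i < len; ++i) {
        cp = (cp << 6) | (static_cast<unsigned char>(s[pos + i]) & 0x3F);
    }
    pos += static_cast<std::size_t>(len);
    return cp;
}

// Hypothetical helper: encode one code point as 1 to 4 UTF-8 bytes.
inline std::string encode(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Counting code points is a linear scan, unlike std::string::size().
inline std::size_t length_in_code_points(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t pos = 0; pos < s.size(); ++n) {
        decode_next(s, pos);
    }
    return n;
}

// Replace the code point at logical position `index` (which must exist).
// The byte length of `s` may change, so any saved byte offsets or
// iterators past this position no longer point where they did.
inline void replace_code_point(std::string& s, std::size_t index, char32_t cp) {
    std::size_t begin = 0;
    for (std::size_t i = 0; i < index; ++i) {
        decode_next(s, begin);
    }
    std::size_t end = begin;
    decode_next(s, end);
    s.replace(begin, end - begin, encode(cp));
}
```

For example, replacing the 'b' in "abc" with U+20AC grows the buffer
from 3 bytes to 5 while the logical length stays 3, so an assignment
through something that looks like container<gunichar> can move the
elements after it, which never happens with string or vector.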

My memory is pretty fuzzy though, and I never came up with anything
that made me happy.

Havoc





