Re: Gtk::Text widget

-> UTF-16 is only a multi-byte encoding in the sense that it takes more
-> than one byte to represent a character, and it takes 2 or 3
-> (surrogates) bytes for each character, never 4.

	Man, this stuff is confusing.  I don't have the Unicode book, so I
was going off this (from "CJKV Information Processing", by Ken Lunde,
page 195):

"UTF-16 encoding is therefore a variable-length encoding that employs a
mixed 16- and 32-bit code space.  This effectively means that software
that processes UTF-16 encoding internally must deal with issues similar to
those in legacy systems that use variable-length encodings."

	To a novice like me, that sounds like UTF-16 uses either 2 or 4
bytes per character, which is why I assumed your gapped buffer was really
using UCS-2.  My apologies.
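
	For reference, here is a minimal sketch of the surrogate arithmetic
behind the Lunde quote, assuming the standard UTF-16 rule: code points
below U+10000 take one 16-bit unit (2 bytes), and everything above takes
a surrogate pair (4 bytes).  The function names are made up for
illustration and are not from any widget or library discussed here.

```python
def utf16_units(ch):
    """Return the list of 16-bit UTF-16 code units for one character.

    Illustrative helper (hypothetical name). It does not reject lone
    surrogate code points (U+D800..U+DFFF) passed in as input.
    """
    cp = ord(ch)
    if cp < 0x10000:
        # Basic Multilingual Plane: a single 16-bit unit.
        return [cp]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


def utf16_byte_length(ch):
    """Number of bytes one character occupies in UTF-16 (no BOM)."""
    return 2 * len(utf16_units(ch))


if __name__ == "__main__":
    for ch in ("A", "\u4e2d", "\U00010400"):
        units = " ".join("%04X" % u for u in utf16_units(ch))
        print("U+%04X -> %s (%d bytes)" % (ord(ch), units, utf16_byte_length(ch)))
```

	So a buffer that indexes by fixed 2-byte units is effectively
treating the text as UCS-2 unless it also handles surrogate pairs.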

-> Overlapping tags is going to be fun :-)  I have them in my widget, but it is a
-> really poor implementation.  I haven't had time to revisit them.

	I'd love to know what your first iteration looked like, if you
care to email me privately about what you did.  (Also, is this widget of
yours open source?  If so, where is it?)

-> Out of curiosity, which regex package are you intending to use?

	Haven't looked at that yet.  I was thinking of using the Glib-ized
regex that somebody posted to this list a few weeks ago.  I will take a
look at your URL, though.

