Re: Gtk::Text widget



Mark Leisher <mleisher@crl.nmsu.edu> writes: 
> Is there really somebody using UTF-8 as the internal encoding in a text
> widget?  I found it to be more hassle than it was worth and didn't provide
> enough memory or speed savings in the general case to keep it.
>

Sure, Tk and now GTK. It isn't for memory or speed; GTK uses UTF-8
throughout for backward-compatibility reasons, and storing text the
same way internally avoids converting back and forth.
 
Of course UTF-8 ruins the benefits of a gapped buffer, since characters
vary in width and you can no longer index by character in constant
time, so you don't want to use it there. This was one reason to avoid
the gapped buffer design.

> I would like to point out that the gap buffer I'm using doesn't noticeably
> suffer performance-wise until about 4M of text on our Solaris boxes.  IMHO,
> this is good enough for a general text editor :-)

It depends on what features you're using and what operations you're
performing when handling the 4M of text.

The tree-based widget is basically like a word processor buffer;
think of "show codes" mode in WordPerfect. The tags and marks can be
used to represent almost anything, as with word processor or XML tags,
and you can apply multiple tags to one region of text.

If you use the one-16-bit-number-per-char strategy to tag text, you
are fundamentally more limited than this; you can't apply more than
one style to a single character. That offloads tons of bookkeeping
onto the application programmer as soon as you do more than syntax
highlighting.
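
A rough Python sketch of the difference (illustrative only; the style
ids, tag names, and the tags_at helper are invented here, and a plain
dict of ranges stands in for the real tree structure):

```python
from array import array

text = "hello world"

# Strategy 1: one unsigned 16-bit style id per character.
BOLD, ITALIC = 1, 2  # hypothetical style ids
styles = array("H", [0] * len(text))
for i in range(0, 5):
    styles[i] = BOLD
for i in range(3, 8):
    styles[i] = ITALIC  # overwrites BOLD on characters 3 and 4

# Strategy 2: named tags, each covering half-open [start, end) ranges.
# Ranges belonging to different tags are free to overlap.
tags = {"bold": [(0, 5)], "italic": [(3, 8)]}


def tags_at(tags, index):
    """Return the set of tag names that cover the character at index."""
    return {
        name
        for name, ranges in tags.items()
        for start, end in ranges
        if start <= index < end
    }
```

Character 4 ends up with only the italic id in the first scheme, while
tags_at(tags, 4) reports both bold and italic in the second.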

I guess you could have the number represent a list of properties, but
then you're starting to talk about more complexity and memory usage.
You could also start needing more than a 16-bit number fairly quickly.
(Properties stored in the Tk buffer include color, size, font,
overstrike, underline, background, custom key and button event
handlers, word wrap, editability, margin, space above and below lines,
visibility of the text, language the text is in, justification,
super/subscript, and any random metadata the user associates with a
given tag.)

With the number-per-char strategy, it's O(n) in the length of the
buffer to search for a property, and O(n) in the length of the region
being affected to add or remove a property.
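
Those costs fall out of the representation. A small Python sketch,
assuming a plain per-character list of style ids (the function names
are hypothetical):

```python
def find_style(styles, style_id, start=0):
    """Return the first index at or after start carrying style_id, or -1.

    Nothing records where a style begins, so in the worst case this
    walks the whole buffer.
    """
    for i in range(start, len(styles)):
        if styles[i] == style_id:
            return i
    return -1


def set_style(styles, start, end, style_id):
    """Apply style_id to [start, end), touching every character in it."""
    for i in range(start, end):
        styles[i] = style_id


styles = [0] * 10
set_style(styles, 6, 9, 3)
first_hit = find_style(styles, 3)
```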

The gapped buffer also presents problems if you want to allow
embedding of images, widgets, marks, etc. in the buffer; most of these
have to be stored separately, instead of in the buffer itself, and
this creates a series of data structures you have to keep in sync,
which means a maintenance headache, a speed hit, and extra memory
usage.
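
As a sketch of that sync problem, here is a toy gap buffer in Python
with marks kept in a side table. It is not modeled on Emacs, kwrite, or
GtkText; the class and its fields are assumptions made for this
example, and positions are assumed to lie within the text.

```python
class GapBuffer:
    def __init__(self, text="", gap=16):
        self.buf = list(text) + [None] * gap
        self.gap_start = len(text)
        self.gap_end = len(self.buf)
        # Marks live outside the buffer: name -> character index.
        self.marks = {}

    def _move_gap(self, pos):
        """Slide the gap so that it starts at character index pos."""
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, pos, s):
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(s):
            # Grow the gap in place when the insertion will not fit.
            grow = len(s) + 16
            self.buf[self.gap_end:self.gap_end] = [None] * grow
            self.gap_end += grow
        for ch in s:
            self.buf[self.gap_start] = ch
            self.gap_start += 1
        # The side table has to be fixed up on every edit. Marks at or
        # after the insertion point shift right; forgetting this step
        # silently leaves them pointing at the wrong character.
        for name, idx in list(self.marks.items()):
            if idx >= pos:
                self.marks[name] = idx + len(s)

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


gb = GapBuffer("hello world")
gb.marks["w"] = 6  # points at the 'w'
gb.insert(0, ">> ")
```

Every additional kind of embedded object (images, widgets, tags) would
need its own table and its own fix-up loop like the one in insert().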

All the gapped buffer implementations I've looked at (Emacs, kwrite,
GtkText) lack these sorts of features. It would be interesting to see
the XEmacs gapped buffer though (I assume it has one), since there are
more features there. I'm not sure if they fixed the scrollbar
though. ;-)

Anyway, I expect the ease of use and expressive power of the
tree-based widget to be the most interesting aspect of it.

Havoc



