Re: Gtk::Text widget



Derek Simkowiak <dereks@kd-dev.com> writes: 
> -> I think "memory hog" is an overstatement; last time I measured it the
> -> overhead without tags was less than 3 bytes per character, including
              ^^^^^^^^^^^^
> -> all memory used by the GTK process
> 
> 	Whoa!  3 bytes per character?  Whatever happened to the 40-byte
> overhead of every tag toggle (and every Line parent node)?
> 

Note "without tags". But yes, the node and line overhead only comes to
something like 3 bytes per char. This is with 80 chars on a line,
mostly ASCII, and including the GTK runtime.

> 	Also, what about search (esp. regex) operations?  I think regex
> searching is a requirement of a good text widget; this is easy to do on a
> gapped text buffer, but requires splicing together the entire character
> buffer when your text is in a tree (does it not?).  If I remember
> correctly, the TkText has a simple search that it does by looking at the
> characters of the next node/line but no regex search.
> 

TkText has regex search; my widget doesn't, because we don't have a
regex lib in GTK.

I have a plain text search; the way you do that is to splice
together each line as you go through the buffer. Search time is still
O(n); the splicing only adds a constant factor, which is unlikely to
be user-visible. It's still "fast enough". Regex search would work
the same way.
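
To make the line-by-line approach concrete, here is a minimal sketch in
Python. It is not the widget's actual code: the buffer is modelled as a
plain list of lines, each line being a list of text segments, and the
name find_in_buffer is made up for illustration. Each line is spliced
into one string and searched, so every character is visited a bounded
number of times and the whole search stays O(n).

```python
from typing import List, Optional, Tuple

# Hypothetical model of the buffer: a list of lines, where each line is
# a list of text segments (for example, pieces split at tag boundaries).
Line = List[str]


def find_in_buffer(lines: List[Line], needle: str) -> Optional[Tuple[int, int]]:
    """Return (line_index, char_offset) of the first match, or None."""
    for line_no, segments in enumerate(lines):
        # Splice this line's segments into one contiguous string.
        text = "".join(segments)
        offset = text.find(needle)
        if offset != -1:
            return line_no, offset
    return None


buffer = [
    ["Hello, ", "world"],
    ["a gapped ", "buffer ", "or a tree"],
]

print(find_in_buffer(buffer, "lo, wor"))    # match crosses a segment boundary
print(find_in_buffer(buffer, "buffer or"))  # match on the second line
print(find_in_buffer(buffer, "regex"))      # no match
```

This sketch only finds matches that lie within a single line; a match
spanning a line break would need neighbouring lines spliced together as
well. A regex search could follow the same pattern by running the
pattern over each spliced line instead of calling str.find.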

Consider that grep is sucking files off disk and into memory buffers;
it's extremely fast for users. There's no way the widget will be
slower than grep, since it isn't touching the disk. So it's hard to
imagine the widget being too slow for a search feature here.

> -> unrelated to the widget). Using lots and lots of tags could double
> -> your memory usage, but probably not much worse than that.
> 
> 	Can you post your numbers from MemProf?  Both with and without
> tags, if possible...  I'm really very curious about this.
> 

I haven't done it recently; the old numbers are in tktext-port/README
in CVS.

> 	I don't currently have a dev environment where I can muck up my
> Gtk+ libraries/headers with the CVS version, so I haven't seen it in a
> while.
> 

The CVS version parallel-installs; it won't break your current install.
 
> 	How does your new-and-improved version handle Unicode?  I've been
> planning on letting the user select between an 8-bit (ASCII), 16-bit
> (UCS-2), and 32-bit (UCS-4) gapped text buffer for internal storage.  At 3
> bytes per character, you must only be using one byte for encoding?
> 
> 	(Or are you using UTF-8 internally?)
> 

UTF-8; all of GTK uses UTF-8. Why would the user want to select this?
If they choose a sub-32-bit representation, their application will be
broken for any text that doesn't fit in it. The only reason to do that
is if you know you have only ASCII, and for those characters UTF-8 is
an 8-bit representation anyway, so you don't save any memory by using
ASCII instead of UTF-8.
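
The size argument is easy to check with a short Python sketch. The
helper name encoded_sizes is made up for illustration, and Python's
UTF-16 codec stands in for UCS-2 here (they agree for characters in the
Basic Multilingual Plane, but UTF-16 uses surrogate pairs beyond it,
which true UCS-2 cannot represent at all).

```python
from typing import Tuple


def encoded_sizes(text: str) -> Tuple[int, int, int]:
    """Return the byte length of text in (UTF-8, UTF-16, UTF-32)."""
    return (
        len(text.encode("utf-8")),
        len(text.encode("utf-16-le")),  # stand-in for 16-bit UCS-2 storage
        len(text.encode("utf-32-le")),  # same width as 32-bit UCS-4 storage
    )


ascii_text = "Hello, world"   # 12 ASCII characters
g_clef = "\U0001D11E"         # one character outside the 16-bit range

print(encoded_sizes(ascii_text))
print(encoded_sizes(g_clef))
```

For the ASCII string, UTF-8 takes one byte per character, so an 8-bit
ASCII buffer would save nothing over it. The second string is a single
character that needs two 16-bit units in UTF-16, which is the kind of
text a fixed 16-bit buffer cannot hold in one slot.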

(This is of course another problem with the gapped buffer: you have
to use a fixed-width encoding.)

Havoc



