Re: Glib Unicode regex (was: Gtk::Text widget)



Mark Leisher <mleisher@crl.nmsu.edu> writes:
> 
>     Derek> * Will the GTK+ team insist on a regex lib that uses UTF-8
>     Derek> internally?
> 
> It kind of has to.  Otherwise you will run into either too much memory
> allocation/deallocation or re-entrancy problems with static buffers.
>

We were looking at the Tcl regex lib (I think Owen maybe said it was
based on Mark's lib?) and wondering about this issue. We were
worried about having to convert the regex and the search string
to/from UTF-8, since that seems likely to make things slow. But maybe
a slow facility is better than none, and we could fix it later. No
decision reached.
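
To make the conversion worry concrete, here is a minimal sketch in plain C
of what a wrapper around a UTF-8-only engine might have to do on each call.
It assumes the caller holds UCS-4 strings, which is only an assumption for
illustration. The names utf8_encode, ucs4_to_utf8_alloc and search_ucs4 are
hypothetical, not GLib or Tcl API, and strstr() merely stands in for the
real regex match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encode one code point as UTF-8 into out (room for 4 bytes needed).
 * Returns the number of bytes written, or 0 for an invalid code point. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogates are not valid scalars */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Convert len UCS-4 code points to a freshly allocated, NUL-terminated
 * UTF-8 string. The caller frees it. Returns NULL on invalid input or
 * allocation failure. A code point of 0 would embed a NUL, so this
 * sketch assumes the input contains none. */
char *ucs4_to_utf8_alloc(const uint32_t *src, size_t len)
{
    unsigned char *buf = (unsigned char *)malloc(len * 4 + 1);
    size_t i, n, used = 0;

    if (buf == NULL)
        return NULL;
    for (i = 0; i < len; i++) {
        n = utf8_encode(src[i], buf + used);
        if (n == 0) {
            free(buf);
            return NULL;
        }
        used += n;
    }
    buf[used] = '\0';
    return (char *)buf;
}

/* Hypothetical wrapper: convert both pattern and text, then "match".
 * strstr() is a stand-in for the regex engine. Returns the byte offset
 * of the first hit in the UTF-8 text, or -1 if none or on error. */
long search_ucs4(const uint32_t *pat, size_t plen,
                 const uint32_t *text, size_t tlen)
{
    char *p = ucs4_to_utf8_alloc(pat, plen);
    char *t = ucs4_to_utf8_alloc(text, tlen);
    long result = -1;

    if (p != NULL && t != NULL) {
        const char *hit = strstr(t, p);
        if (hit != NULL)
            result = (long)(hit - t);
    }
    free(p);
    free(t);
    return result;
}
```

Each call to search_ucs4 pays for two allocations and two frees before any
matching happens, and the offset it returns is in bytes, so it would still
need mapping back to a character index. Swapping the malloc for a static
buffer would avoid the allocations but make the wrapper non-re-entrant,
which is the trade-off Mark describes above.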
 
>     Derek> * Are there any other Unicode-supporting regex libs we can look at?
> 
> Although I haven't checked the copyright, IBM's ICU library has one.
> The latest version of Perl has the best and most complete
> implementation I've seen yet, but it would be tough to untangle it
> from the surrounding code.

The Perl one looks viciously difficult to extract from Perl. I haven't
looked at the ICU engine; I'll ask Owen about it, since I know he's
looked at ICU in general.

Havoc




