Re: possible deadlock on invalid UTF-8 data
- From: Daniel Elstner <daniel elstner gmx net>
- To: Havoc Pennington <hp redhat com>
- Cc: gtk-devel-list <gtk-devel-list gnome org>
- Subject: Re: possible deadlock on invalid UTF-8 data
- Date: 27 Nov 2001 22:27:03 +0100
On Tue, 2001-11-27 at 21:54, Havoc Pennington wrote:
>
> Daniel Elstner <daniel elstner gmx net> writes:
> > Yes, but as long as the pointer is not dereferenced it should work.
> > (Although ANSI C only guarantees that moving the pointer to a position
> > immediately after the last element will work, I consider failures when
> > moving it six bytes after the end very rare.)
>
> How many next_char loops don't dereference the char?
Well, the special case I'm talking about is the gtkmm wrapper for UTF-8
encoded strings. For example, it uses g_utf8_pointer_to_offset() to
calculate the length of a string. (It doesn't use g_utf8_strlen()
because the size in bytes is already known, and to make it possible to
work with strings which contain the '\0' character.)
Glib::ustring of course has a validate() method. But that should be
called by the app programmer.
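To make the length calculation concrete, here is a rough sketch in plain C
of counting characters in a buffer whose size in bytes is already known.
It is not the GLib or gtkmm code: utf8_seq_len() and utf8_count_chars() are
hypothetical names, and the 1..6 byte lengths are derived from the lead byte
only, without any validation. It walks by index rather than by pointer, so
nothing is ever computed past the end of the buffer, and an embedded '\0'
counts as an ordinary character.

```c
#include <stddef.h>

/* Hypothetical helper: sequence length implied by a lead byte, using the
 * classic 1..6 byte scheme. Stray continuation bytes and 0xFE/0xFF are
 * given length 1 here (an assumption) so the caller always advances. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;   /* ASCII, including '\0' */
    if (lead < 0xC0) return 1;   /* stray continuation byte */
    if (lead < 0xE0) return 2;
    if (lead < 0xF0) return 3;
    if (lead < 0xF8) return 4;
    if (lead < 0xFC) return 5;
    if (lead < 0xFE) return 6;
    return 1;                    /* 0xFE and 0xFF */
}

/* Count the characters in buf[0 .. n_bytes). No validation is done; a
 * sequence that is cut off by the end of the buffer counts as one
 * character and the index is clamped to n_bytes. */
size_t utf8_count_chars(const char *buf, size_t n_bytes)
{
    size_t i = 0;
    size_t count = 0;

    while (i < n_bytes) {
        size_t len  = utf8_seq_len((unsigned char) buf[i]);
        size_t left = n_bytes - i;

        i += (len < left) ? len : left;  /* never step past the end */
        ++count;
    }
    return count;
}
```

With this sketch, utf8_count_chars("a\0b", 3) gives 3, and a buffer ending
in a lone 0xFC lead byte is still counted without reading past its end.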
> > I absolutely agree with the policy. But if we can easily avoid an
> > endless loop even in case the programmer makes an error, shouldn't we
> > try to do so?
>
> On the other hand, the advantage of the endless loop (vs. reading
> invalid memory) is that the bug is immediately evident, and pretty
> easy to track down.
True. Though, in most cases the ustring will be passed to a GTK+
function, which will in turn validate the string and print a warning.
However, the problem isn't really important to me. It's just one more
thing we have to drum into the heads of all gtkmm users: if app users
start reporting livelocks, it'll probably have something to do with
invalid UTF-8 strings. :-)
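For illustration only, the kind of guard I had in mind could look roughly
like the sketch below. It assumes a skip lookup that yields 0 for the bytes
0xFE and 0xFF; that is an assumption made for the example, not a statement
about the actual GLib table. skip_len() and utf8_walk_stalls() are
hypothetical names. Instead of spinning forever on a zero step, the walk
reports the offset where a naive next_char loop would stop making progress.

```c
#include <stddef.h>

/* Hypothetical skip lookup. Returning 0 for 0xFE/0xFF is an assumption
 * made for this example; a loop doing `i += skip_len(...)` would then
 * never advance past such a byte. */
static size_t skip_len(unsigned char lead)
{
    if (lead < 0xC0) return 1;
    if (lead < 0xE0) return 2;
    if (lead < 0xF0) return 3;
    if (lead < 0xF8) return 4;
    if (lead < 0xFC) return 5;
    if (lead < 0xFE) return 6;
    return 0;
}

/* Walk buf[0 .. n_bytes). Returns 1 and stores the offending offset in
 * *stall_at (if non-NULL) when a step of 0 is found, i.e. where the
 * unguarded loop would hang. Returns 0 if the walk reaches the end. */
int utf8_walk_stalls(const char *buf, size_t n_bytes, size_t *stall_at)
{
    size_t i = 0;

    while (i < n_bytes) {
        size_t step = skip_len((unsigned char) buf[i]);

        if (step == 0) {
            if (stall_at != NULL)
                *stall_at = i;
            return 1;
        }
        i += step;  /* may overshoot n_bytes, but it is only an index */
    }
    return 0;
}
```

A caller could print a warning and bail out when this returns 1, which
keeps the bug visible without hanging the application.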
Cheers,
--Daniel