Re: possible deadlock on invalid UTF-8 data
- From: Havoc Pennington <hp redhat com>
- To: Daniel Elstner <daniel elstner gmx net>
- Cc: gtk-devel-list <gtk-devel-list gnome org>
- Subject: Re: possible deadlock on invalid UTF-8 data
- Date: 27 Nov 2001 15:14:12 -0500
Daniel Elstner <daniel elstner gmx net> writes:
> the utf8_skip_data array (glib/gutf8.c:104) contains 0 at index 0xfe
> and 0xff. This could easily cause endless loops when iterating over a
> UTF-8 string by using g_utf8_next_char().
> I know that 0xfe and 0xff are forbidden in UTF-8 strings, but those
> shouldn't cause a deadlock IMHO. Sometimes it's just not appropriate to
> validate every string before passing it to the g_utf8_* functions.
If you use g_utf8_next_char() on invalid UTF-8, it can easily skip past
the end of the string into invalid memory - so you have to validate
first to be safe, even with the change you've suggested.
The policy for GLib and GTK is that _all_ UTF-8 must be validated, and
that none of the functions are safe against invalid UTF-8, with a few
specific exceptions (the GMarkup parser is safe, and g_utf8_validate()
itself is obviously safe).
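
To make the "skip onto invalid memory" point concrete, here is a minimal, self-contained sketch. It does not use GLib: skip_len() and walk() are hypothetical stand-ins for a lead-byte skip table and a g_utf8_next_char()-style loop, and the sketch assumes 0xfe and 0xff have been given a nonzero skip, as suggested. A string that ends in a truncated multi-byte sequence still makes the walk jump over the NUL terminator.

```c
#include <stddef.h>

/* Hypothetical stand-in for a lead-byte skip table (not GLib's code).
 * 0xfe and 0xff return 1 here, i.e. the proposed change is assumed. */
static size_t skip_len(unsigned char c)
{
    if (c < 0xC0) return 1;   /* ASCII or stray continuation byte */
    if (c < 0xE0) return 2;
    if (c < 0xF0) return 3;
    if (c < 0xF8) return 4;
    if (c < 0xFC) return 5;
    if (c < 0xFE) return 6;
    return 1;                 /* 0xfe, 0xff */
}

/* Walk buf like a NUL-terminated string, advancing by lead-byte skips only.
 * cap is the real buffer size and keeps this demo from reading out of
 * bounds; a plain next-char loop has no such guard.
 * Returns the offset at which the walk stopped (>= cap if no NUL was met). */
static size_t walk(const unsigned char *buf, size_t cap)
{
    size_t i = 0;

    while (i < cap && buf[i] != 0)
        i += skip_len(buf[i]);

    return i;
}
```

For the bytes { 'a', 0xE2, 0, 'X', 'Y', 0 } the string is two bytes long, but the lead byte 0xE2 claims a three-byte sequence, so the walk lands on offset 4 and only stops at the second NUL at offset 5. Nothing in the skip table can prevent that, which is why the string has to pass g_utf8_validate() before it is iterated.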