Re: Faster UTF-8 decoding in GLib
- From: Daniel Elstner <daniel kitta googlemail com>
- To: Behdad Esfahbod <behdad behdad org>
- Cc: gtk-devel-list gnome org
- Subject: Re: Faster UTF-8 decoding in GLib
- Date: Sat, 27 Mar 2010 22:21:30 +0100
On Saturday, 27.03.2010, at 16:51 -0400, Behdad Esfahbod wrote:
> On 03/27/2010 04:27 PM, Daniel Elstner wrote:
> > It is not meant to check for errors.
> Good point.
> > I think it is totally arbitrary to handle some potential errors but not
> > others. And I think the current implementation does not do that check
> > either -- it will behave differently, but it is still undefined.
> The current implementation definitely does the check:
OK, looks like I misremembered. My bad. However, it is not documented:
* @p: a pointer to Unicode character encoded as UTF-8
* Converts a sequence of bytes encoded as UTF-8 to a Unicode character.
* If @p does not point to a valid UTF-8 encoded character, results are
* undefined. If you are not sure that the bytes are complete
* valid Unicode characters, you should use g_utf8_get_char_validated()
* Return value: the resulting character
> Anyway. Nice construct :). For future reference, it must be used with 32bit
> ints only. Otherwise it can go wrong.
Well, I assume that ints are at least 32 bits wide on any platform
supported by GLib. But if you meant to say that it would break with
larger ints, I don't see why. As long as the type is unsigned, it
should be fine.