Re: Faster UTF-8 decoding in GLib

From: Behdad Esfahbod <behdad behdad org>
To: Daniel Elstner <daniel kitta googlemail com>
Cc: gtk-devel-list gnome org
Subject: Re: Faster UTF-8 decoding in GLib
Date: Sat, 27 Mar 2010 18:04:29 -0400

On 03/27/2010 05:49 PM, Daniel Elstner wrote:
> Hi,
> 
> On Saturday, 27.03.2010 at 17:40 -0400, Behdad Esfahbod wrote:
>> On 03/27/2010 05:21 PM, Daniel Elstner wrote:
>>> Well, I assume that ints are at least 32 bit wide on any platform
>>> supported by GLib.  But if you meant to say that it would break with
>>> larger ints, I don't see why.  As long as the type is unsigned, it
>>> should be fine.
>>
>> If the utf8 byte has more than 6 leading 1 bits, [...]
> 
> That's an oxymoron.
> 
>> [...] and with a 64bit int, the
>> construct tries to consume 7 or 8 bytes.  Right?
> 
> Undefined behavior.

Sure, I wasn't referring to valid data.  In valid UTF-8, there are no 5-byte or
6-byte sequences either.
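
To make the byte patterns concrete, here is a rough sketch in C.  It is not
the GLib code or the patch under discussion; the names leading_ones,
utf8_seq_len and check_examples are made up for this example.  It assumes the
RFC 3629 definition of UTF-8, where a lead byte announces at most 4 bytes and
the octets 0xC0, 0xC1 and 0xF5..0xFF never appear.

```c
#include <assert.h>
#include <stdint.h>

/* Count the leading 1 bits of a byte; the result is in the range 0..8. */
static unsigned leading_ones(uint8_t byte)
{
    unsigned n = 0;

    while (n < 8 && (byte & (0x80u >> n)))
        n++;
    return n;
}

/*
 * Sequence length announced by a lead byte, or 0 if the byte cannot start
 * a sequence under RFC 3629.  This looks at the lead byte only; it does not
 * validate the continuation bytes that follow.
 */
static unsigned utf8_seq_len(uint8_t lead)
{
    unsigned n = leading_ones(lead);

    if (n == 0)
        return 1;                 /* 0xxxxxxx: ASCII */
    if (n == 1)
        return 0;                 /* 10xxxxxx: continuation byte */
    if (lead == 0xC0 || lead == 0xC1 || lead > 0xF4)
        return 0;                 /* octets that never appear in UTF-8 */
    return n;                     /* 2, 3 or 4 bytes */
}

/* A few spot checks for the cases mentioned in this thread. */
static void check_examples(void)
{
    assert(leading_ones(0xFC) == 6);  /* old-style 6-byte lead */
    assert(leading_ones(0xFE) == 7);  /* more than 6 leading 1 bits */
    assert(leading_ones(0xFF) == 8);
    assert(utf8_seq_len(0xF8) == 0);  /* 5-byte lead: rejected */
    assert(utf8_seq_len(0xFE) == 0);
    assert(utf8_seq_len(0xF0) == 4);
}
```

A decoder that derives the sequence length purely from the count of leading
1 bits would read 7 or 8 from 0xFE and 0xFF, which is the invalid-input case
I was asking about; rejecting those lead bytes up front avoids it.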

b

> --Daniel

