Re: Faster UTF-8 decoding in GLib



On 03/27/2010 05:49 PM, Daniel Elstner wrote:
> Hi,
> 
> On Saturday, 27.03.2010, at 17:40 -0400, Behdad Esfahbod wrote:
>> On 03/27/2010 05:21 PM, Daniel Elstner wrote:
>>> Well, I assume that ints are at least 32 bit wide on any platform
>>> supported by GLib.  But if you meant to say that it would break with
>>> larger ints, I don't see why.  As long as the type is unsigned, it
>>> should be fine.
>>
>> If the utf8 byte has more than 6 leading 1 bits, [...]
> 
> That's an oxymoron.
> 
>> [...] and with a 64bit int, the
>> construct tries to consume 7 or 8 bytes.  Right?
> 
> Undefined behavior.

Sure, I wasn't referring to valid data.  In valid UTF-8, there are no 5-byte or
6-byte sequences either.
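
To make the case concrete, here is a minimal sketch.  The helper names are
hypothetical (not GLib API), and this is not the decoder from the patch; it
only assumes a decoder that takes the sequence length from the number of
leading 1 bits in the lead byte.  Under that assumption, 0xFE and 0xFF ask
for 7 and 8 bytes, and 0xF8 and 0xFC ask for 5 and 6, none of which occur in
valid UTF-8 as restricted by RFC 3629:

```c
/* Hypothetical helper, not part of GLib: count the leading 1 bits
 * of a single byte (result is in the range 0..8). */
static unsigned int
leading_ones (unsigned char byte)
{
  unsigned int n = 0;

  while (n < 8 && (byte & (0x80u >> n)) != 0)
    n++;

  return n;
}

/* Hypothetical helper: the sequence length a naive decoder would infer
 * from the lead byte alone.  Returns 1 for ASCII, 0 for a continuation
 * byte (10xxxxxx, not usable as a lead byte), and otherwise the number
 * of leading 1 bits, with no check that the result is a legal length. */
static unsigned int
naive_sequence_length (unsigned char lead)
{
  unsigned int n = leading_ones (lead);

  if (n == 0)
    return 1;
  if (n == 1)
    return 0;

  return n;
}

/* Lead bytes with more than 4 leading 1 bits never appear in valid
 * UTF-8.  This test is necessary but not sufficient: for example,
 * 0xF5..0xF7 pass it and are still invalid. */
static int
lead_length_is_plausible (unsigned char lead)
{
  unsigned int len = naive_sequence_length (lead);

  return len >= 1 && len <= 4;
}
```

So the question is only what the fast path does with such bytes in invalid
input, not whether they can occur in valid data.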

b

> --Daniel

