Re: g_utf8_validate() and NUL characters

From: "Havoc Pennington" <hp pobox com>
To: "Brian J. Tarricone" <bjt23 cornell edu>
Cc: gtk-devel-list gnome org
Subject: Re: g_utf8_validate() and NUL characters
Date: Wed, 8 Oct 2008 07:20:42 -0400

Hi,

On Tue, Oct 7, 2008 at 5:50 PM, Brian J. Tarricone <bjt23 cornell edu> wrote:
> I think what he really meant (or if not, here's my take on it) was that NUL
> bytes aren't *printable* text... like you'd say of low-value ASCII data.
> Sure, it's technically "text," but most of it isn't something you can
> represent visually in a useful manner.
>

Exactly. I don't see why you would ever want a nul byte in a
situation where text is expected.

Another way to put it: I don't think nul bytes are a user-explainable
concept. If anybody who isn't a programmer sees (how? what's the
glyph?) a nul byte in a _text_ file, that's just bizarre. In fact, why
would anybody want that? In a binary file, sure. But binary files
aren't utf8 _at all_.

As a side issue, I think in most cases programs likely break if they
load a non-nul-terminated string, so it's convenient if
g_utf8_validate() is catching that.
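
To illustrate the side issue above, here is a minimal sketch in plain C
(no GLib required) of why an embedded nul byte tends to break code that
expects a C string: everything after the first nul is invisible to
strlen() and friends. The helpers has_embedded_nul(), visible_length()
and embedded_nul_demo() are hypothetical names invented for this
example; they are not part of GLib and are not how g_utf8_validate()
itself is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a GLib function): returns 1 if any of the
 * first len bytes of buf is a nul byte, 0 otherwise. */
static int has_embedded_nul(const char *buf, size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}

/* Hypothetical helper: the number of bytes that nul-terminated string
 * functions would see, i.e. the bytes before the first nul, or len if
 * the buffer contains no nul at all. */
static size_t visible_length(const char *buf, size_t len)
{
    const char *nul = memchr(buf, '\0', len);
    return nul ? (size_t)(nul - buf) : len;
}

/* Usage sketch: the buffer holds 7 bytes of data, but C string
 * functions only ever see the first 3. */
static void embedded_nul_demo(void)
{
    const char data[] = "abc\0def";  /* sizeof(data) is 8 with the terminator */
    size_t len = sizeof(data) - 1;   /* 7 payload bytes */

    assert(has_embedded_nul(data, len));
    assert(visible_length(data, len) == 3);
    assert(strlen(data) == 3);       /* "def" is silently lost */
}
```

A caller that reads a file into a buffer with a known length could run a
check like has_embedded_nul() before handing the data to code that
assumes nul termination. When g_utf8_validate() is given an explicit
non-negative max_len, it rejects such a buffer, which is the convenient
behaviour described above.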

Havoc

Follow-Ups:
- Re: g_utf8_validate() and NUL characters
  - From: Nikolai Weibull
- Re: g_utf8_validate() and NUL characters
  - From: Behdad Esfahbod

References:
- g_utf8_validate() and NUL characters
  - From: coda
- Re: g_utf8_validate() and NUL characters
  - From: Havoc Pennington
- Re: g_utf8_validate() and NUL characters
  - From: Behdad Esfahbod
- Re: g_utf8_validate() and NUL characters
  - From: Brian J. Tarricone
