Re: g_utf8_validate() and NUL characters


On Tue, Oct 7, 2008 at 5:50 PM, Brian J. Tarricone <bjt23 cornell edu> wrote:
> I think what he really meant (or if not, here's my take on it) was that NUL
> bytes aren't *printable* text... like you'd say of low-value ASCII data.
>  Sure, it's technically "text," but most of it isn't something you can
> represent visually in a useful manner.

Exactly. I don't see why you would ever want a nul byte in a
situation where text is expected.

To put it another way, I don't think nul bytes are a user-explainable
concept. If somebody who isn't a programmer sees a nul byte in a
_text_ file (how? what's the glyph?), that's just bizarre. Why would
anybody want that? In a binary file, sure. But binary files aren't
utf8 _at all_.

As a side issue, I think in most cases programs are likely to break
if they load a non-nul-terminated string, so it's convenient that
g_utf8_validate() catches that.
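
To make the breakage concrete, here is a minimal sketch in plain C. It
is not GLib code: has_embedded_nul() and visible_length() are
hypothetical helpers made up for illustration. It assumes a
length-delimited buffer that may contain nul bytes, and shows how much
of it a nul-terminated API would actually see.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of GLib): returns 1 if any of the
 * first len bytes of buf is a nul byte, 0 otherwise. */
int has_embedded_nul(const char *buf, size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}

/* Hypothetical helper: the number of bytes that code treating buf as
 * a nul-terminated C string would process. It stops at the first nul
 * byte, or at len if the buffer contains none. */
size_t visible_length(const char *buf, size_t len)
{
    const char *nul = memchr(buf, '\0', len);
    return nul ? (size_t)(nul - buf) : len;
}
```

For the 5-byte buffer "ab\0cd", has_embedded_nul() reports a nul byte
and visible_length() gives 2, so anything after the nul is silently
dropped by code that relies on nul termination. As far as I know,
g_utf8_validate() called with an explicit positive max_len rejects
such a buffer, which is the convenient behaviour I mean.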

