Re: char representation



Miroslaw Dobrzanski-Neumann <mne mosaic-ag com> writes:

> Hi all,
> 
> looking through some sources I saw code fragments that rely on an ASCII-like
> internal character representation:
> 
> here is just one I picked out
> gtype.c:537:  name_valid = (p[0] >= 'A' && p[0] <= 'Z') || (p[0] >= 'a' && p[0] <= 'z') || p[0] == '_';
> 
> the same in the pango package.
> 
> On systems such as an IBM S/390 host running OS/390 (MVS), the internal character
> representation is EBCDIC, for which p[0] >= 'A' && p[0] <= 'Z' is not a correct
> test for an uppercase letter.
> 
> Generally there is no guarantee that any character range (digits, letters,
> ...) has consecutive values. There is also no guarantee that 'A' < 'Z'.
> 
> The correct way to deal with this is "isupper (c)" or something like it.
> I guess g_ascii_isupper() would do this job perfectly, because it is based on
> character attributes and not on their internal representation.

There are only two forms of 8-bit character representations we support
now:

 ASCII with uninterpreted chars > 127. [E.g. g_ascii_*]
 UTF-8                                 [g_utf8_*]

It's an important property that the valid strings for the second 
are a subset of the first.

The g_ascii_* functions are meant to have a standard meaning
for each given byte and not to magically change into 
g_ebcdic_...; EBCDIC is not UTF-8 compatible.
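
To make "a standard meaning for each given byte" concrete, here is a
minimal sketch of a check tied to ASCII byte values rather than to the
compiler's character set. The function names are hypothetical and are
not GLib API, and this is not a claim about how g_ascii_isupper() is
implemented; it only illustrates the idea. The code points are written
as numbers so the result would be the same even if the code were
compiled on an EBCDIC host.

```c
#include <stdbool.h>

/* Hypothetical helpers (not GLib API): classify a byte by its ASCII
 * code point. Numeric constants are used instead of 'A' and 'Z' so the
 * result does not depend on the compiler's execution character set. */
static bool ascii_byte_is_upper(unsigned char c)
{
    return c >= 0x41 && c <= 0x5A;   /* ASCII 'A'..'Z' */
}

static bool ascii_byte_is_lower(unsigned char c)
{
    return c >= 0x61 && c <= 0x7A;   /* ASCII 'a'..'z' */
}

/* Same first-character rule as the quoted gtype.c line:
 * an ASCII letter or an underscore (0x5F). */
static bool ascii_name_start_is_valid(unsigned char c)
{
    return ascii_byte_is_upper(c) || ascii_byte_is_lower(c) || c == 0x5F;
}
```

With this definition a byte such as 0xC1 (the letter A in EBCDIC) is
simply an uninterpreted char > 127, which matches the first of the two
representations listed above.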

I'm afraid that EBCDIC systems will just be shut out from using
GTK+. I really don't think that is going to be a major limitation
in GTK+'s success.

Regards
                                        Owen


