Re: Unicode?



2007/5/29, Ken Harris <kengruven gmail com>:
Hi Joe,

> So I am definitely not an expert in these matters.  But my
> understanding is that Mono internally uses UTF-16 as its Unicode
> representation.

Well, yeah, kind of.  I'm no expert with C#, but it seems to mean
"here's a 16-bit type, have fun".  I'm hesitant to call that
"internal".  :-)  I think it's only slightly more true than saying "C
uses UTF-8 internally" (here's an 8-bit type, have fun).

(This is how Microsoft defines it, and I can only assume Mono does
something similar.)

In C#, System.String is documented as "Represents text as a series of
Unicode characters", and internally it stores that text as 16-bit
characters, which effectively means UTF-16.

Also, System.Char, which is what a System.String is built from (or
rather, what you get when you index a string, e.g. "char
myChar = myString[1];"), is a 16-bit struct that "represents a Unicode
character".

(Text within quotes is taken directly from MSDN.)
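
To make the "16-bit characters" point concrete, here is a minimal sketch (my own illustration, not taken from MSDN; the class name Utf16Demo is made up). It assumes a .NET 2.0-level runtime and shows that a character outside the Basic Multilingual Plane occupies two System.Char values in a string:

```csharp
using System;

static class Utf16Demo
{
    // U+1D11E MUSICAL SYMBOL G CLEF lies outside the Basic Multilingual Plane,
    // so UTF-16 needs two 16-bit code units (a surrogate pair) to encode it.
    public const string Clef = "\U0001D11E";

    public static void Main()
    {
        // Length counts 16-bit chars, not Unicode code points.
        Console.WriteLine(Clef.Length);                    // 2
        // Indexing returns one half of the pair, not the whole character.
        Console.WriteLine(char.IsHighSurrogate(Clef[0]));  // True
        Console.WriteLine(char.IsLowSurrogate(Clef[1]));   // True
        Console.WriteLine("{0:X4} {1:X4}", (int)Clef[0], (int)Clef[1]); // D834 DD1E
    }
}
```

So "myString[1]" on such a string would hand back a lone low surrogate, which is not a complete character on its own.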
[...]
> As far as Beagle is concerned, by itself it doesn't deal with
> character encodings at all.  As far as underlying libs: GTK requires
> UTF-8; underneath it GLib deals with different Unicode versions.

Since C# doesn't really provide a "unicode character" type (only a
16-bit type for stuffing with UTF-16), a program that wants to fully
support Unicode might need to deal a little bit with one encoding
(UTF-16) itself.  But I'm new to Mono, and I'm not sure my previous
sentence is true.  :-)

I don't know what you mean by "unicode character", but it seems
System.Char fits your description pretty well.
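
If "unicode character" means a full code point rather than a 16-bit unit, then a program does have to handle the UTF-16 pairs itself. A rough sketch of what that might look like follows; CodePoints is a hypothetical helper of my own, not something the framework provides, while char.IsSurrogatePair and char.ConvertToUtf32 are the framework calls it relies on:

```csharp
using System;
using System.Collections.Generic;

static class CodePointDemo
{
    // Hypothetical helper (not part of the framework): walks a string and
    // yields one Unicode code point per step, combining surrogate pairs.
    public static IEnumerable<int> CodePoints(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            if (char.IsSurrogatePair(s, i))
            {
                yield return char.ConvertToUtf32(s[i], s[i + 1]);
                i++; // skip the low surrogate that was just consumed
            }
            else
            {
                // A BMP character, or an unpaired surrogate passed through as-is.
                yield return s[i];
            }
        }
    }

    public static void Main()
    {
        // Three code points stored in four 16-bit chars.
        foreach (int cp in CodePoints("a\U0001D11Eb"))
            Console.WriteLine("U+{0:X4}", cp);
    }
}
```

How to treat unpaired surrogates is a design choice; this sketch just passes them through, and a stricter program might reject them instead.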

There are also conversion and other utility functions for string
encodings in the System.Text namespace (the Encoding classes), plus
text-element helpers such as StringInfo in System.Globalization. These
are not specific to C#, but they are relevant since they're part of the
.NET framework.
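
As a hedged illustration of those utilities, assuming the functions meant here include System.Text.Encoding and System.Globalization.StringInfo (the class name EncodingDemo and the sample string are made up):

```csharp
using System;
using System.Globalization;
using System.Text;

static class EncodingDemo
{
    // "naive" with a precomposed i-diaeresis (U+00EF), a space, and U+1D11E.
    public const string Sample = "na\u00EFve \U0001D11E";

    public static void Main()
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(Sample);      // convert to UTF-8
        byte[] utf16 = Encoding.Unicode.GetBytes(Sample);  // UTF-16, little-endian

        Console.WriteLine(Sample.Length);   // 8 (16-bit chars)
        Console.WriteLine(utf8.Length);     // 11 (bytes)
        Console.WriteLine(utf16.Length);    // 16 (bytes)

        // StringInfo counts text elements, so the surrogate pair counts once.
        Console.WriteLine(new StringInfo(Sample).LengthInTextElements); // 7

        // Decoding the UTF-8 bytes gives back the original string.
        Console.WriteLine(Encoding.UTF8.GetString(utf8) == Sample);     // True
    }
}
```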

-Isak


