Re: UTF-8 on stdout?
- From: Andrew Ferrier <andrew junk new-destiny co uk>
- To: dia-list gnome org
- Subject: Re: UTF-8 on stdout?
- Date: Sun, 7 Jul 2002 10:43:15 +0100 (BST)
On 2002-07-06 at 21:42 -0400, James K. Lowden wrote:
PMJI. You might want to have a look at http://czyborra.com/
Mr. Czyborra has a pretty good overview of what's what
regarding encoding and character sets, and does a good job of
distinguishing between fonts, glyphs, and characters. You
may in particular want to look at:
http://czyborra.com/unicode/terminals.html
This certainly seems a pretty good site. I'll have to take a
long look at it sometime but it's answered a few questions
already... thanks for the reference!
What you bumped into was, as Lars said, a problem with xterm.
If you push UTF-8 to stdout, it falls to the application
whose job it is to convert encoded values into glyphs that
your brain can interpret as characters (I'm skipping a few
steps). The standard xterm is *not* going to expect UTF-8;
it will instead interpret the bytestream as ASCII or Latin-1
or whatever your locale settings indicate.
Yep. However, it does appear from czyborra that there is an
escape sequence that makes UTF-8-hacked 4.0 xterms switch
into UTF-8 mode. I'll investigate this and give it a try. I'm
not sure it's the kind of thing that Dia should be
outputting, however... it's probably more of a
user/system-wide setting.
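
To make the xterm problem above concrete, here is a rough
Python sketch of what happens when UTF-8 bytes are read as
Latin-1. The name is made up for illustration, the helper is
hypothetical, and the two escape sequences are the ISO 2022
"switch to UTF-8" and "switch back" sequences as I understand
them from the page; I haven't tested them against an xterm.

```python
# Sketch only: UTF-8 bytes as seen by a terminal assuming Latin-1.
text = "Jos\u00e9"                      # hypothetical name, "Jose" with e-acute
utf8_bytes = text.encode("utf-8")       # e-acute becomes two bytes: C3 A9
misread = utf8_bytes.decode("latin-1")  # each byte shown as its own character

# Assumption: ISO 2022 mode-switch sequences (not verified on an xterm).
UTF8_ON = b"\x1b%G"    # ESC % G: enter UTF-8 mode
UTF8_OFF = b"\x1b%@"   # ESC % @: return to the default mode


def wrap_for_utf8_xterm(payload: bytes) -> bytes:
    """Hypothetical helper: bracket payload with the mode switches."""
    return UTF8_ON + payload + UTF8_OFF


# ascii() keeps this printable whatever the terminal's own encoding is.
print(ascii(text), "->", ascii(misread))
```

The single accented character turns into two unrelated
Latin-1 characters, which is the garbage seen in the terminal.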
dia --credits |sort
how is sort(1) supposed to know what's incoming? It doesn't
guess; it assumes, and unless the answer is 7-bit ASCII, it
assumes wrong. Its only defense is, it's got a lot of good
company.
Good point. In this case I'm not going to worry, because the
names are not in surname, forename order anyway (which I
think is the conventional sort order in most locales), and
there is surrounding bumf too. But in the more general case
it is very important, I guess.
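
As a small illustration of the sorting point, this Python
sketch compares a plain byte-wise sort of UTF-8 strings with a
sort on an accent-stripped key. The names are invented, and
the fold() key is only a crude stand-in for real locale
collation, not what sort(1) does.

```python
import unicodedata

# Hypothetical names; escapes used so the source stays pure ASCII.
names = ["Zo\u00eb", "\u00c9mile", "Adam"]

# Byte-wise order: every non-ASCII lead byte sorts after all of ASCII.
byte_sorted = [
    raw.decode("utf-8")
    for raw in sorted(name.encode("utf-8") for name in names)
]


def fold(s: str) -> str:
    """Crude sort key: decompose, drop combining marks, fold case."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


folded_sorted = sorted(names, key=fold)

print([ascii(n) for n in byte_sorted])
print([ascii(n) for n in folded_sorted])
```

Byte-wise, the name starting with an accented E lands after
the one starting with Z; with the folded key it sorts between
A and Z, as a reader would probably expect.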
Interesting place. In particular, the -u8 option for xterm
does exactly what Andrew wants. We should get Akira and Xing
Wang to use their utf8 encodings for their names.
Yes, I guess so. I'll continue outputting in UTF-8 then: I'll
assume it's the responsibility of the user to sort out their
terminal if they want 'correct' output.
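
For what it's worth, the "always emit UTF-8" approach can be
sketched like this in Python. It is not Dia's code; the helper
name and the sample string are made up for illustration.

```python
import io


def write_utf8(stream, text: str) -> int:
    """Hypothetical helper: write text as UTF-8 bytes, ignoring the locale."""
    return stream.write(text.encode("utf-8"))


# Demonstrated on an in-memory stream; for real output one would pass
# sys.stdout.buffer, the binary layer underneath sys.stdout.
buf = io.BytesIO()
written = write_utf8(buf, "Zo\u00eb\n")
print(written, buf.getvalue())
```

Writing bytes rather than text means the output is the same
whatever the user's locale says; making the terminal display
it properly is then left to the user.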
Cheers for all that, guys,
Andrew.
--
Andrew Ferrier
email: andrew junk new-destiny co uk
web: http://www.new-destiny.co.uk/andrew/