Re: UTF-8 on stdout?



On Sat, 6 Jul 2002, James K. Lowden wrote:
> It's an interesting problem.  If you enter:
>
> dia --credits |sort
>
> how is sort(1) supposed to know what's incoming?  It doesn't guess; it
> assumes, and unless the answer is 7-bit ascii, it assumes wrong.  Its
> only defense is, it's got a lot of good company.

Actually, czyborra mentions as one of the strengths of UTF-8 that you can
run it through an old-style, byte-wise sort and get the same ordering as
you would by sorting the UCS-4 characters by code point.
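
To make that property concrete, here is a minimal Python sketch (my own
illustration, not taken from czyborra or from dia; the sample words are
arbitrary).  It sorts the same strings once by code point and once by their
raw UTF-8 bytes and compares the two results.  It only models a sort that
compares bytes, as sort(1) does under LC_ALL=C; a locale-aware collation
may order things differently.

```python
# Sample strings mixing 1-, 2-, 3- and 4-byte UTF-8 sequences (arbitrary picks).
words = ["zebra", "Hårdgrim", "apple", "éclair", "日本", "\U0001F600", "Zulu", "Ångström"]

# Order by Unicode code point, i.e. what comparing UCS-4 values would give.
by_code_point = sorted(words, key=lambda s: [ord(c) for c in s])

# Order by raw UTF-8 bytes, i.e. what a byte-oriented sort would see.
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

# UTF-8 is designed so that these two orderings coincide.
same_order = by_code_point == by_utf8_bytes
print(same_order)
```

Note that this is plain code-point order, not dictionary order: every
uppercase ASCII letter still sorts before every lowercase one, and the
accented letters come after "z".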

Czyborra's text is a very good run-through, though hard to read in places.
It finally rid me of confusing Unicode and UTF-8.

-Lars

-- 
Lars Clausen (http://shasta.cs.uiuc.edu/~lrclause)| Hårdgrim of Numenor
"I do not agree with a word that you say, but I   |----------------------------
will defend to the death your right to say it."   | Where are we going, and
    --Evelyn Beatrice Hall paraphrasing Voltaire  | what's with the handbasket?


