Re: detecting non-convertibility of characters



On Tue, 29 Jan 2002 21:41:08 +0100,
"CC" == Cyrille Chepelov <cyrille chepelov org> wrote:

CC> OK, here are the results:
CC>     - test.c is basically a stripped down, hardcoded-to-latin1 version
CC> of charconv.c (it's encoded in utf-8. I hope the test files you sent me
CC> weren't swear words <grin/> They looked definitely Japanese in my emacs21.)
CC> There are four strings: one latin1 (expected to convert), and three which
CC> are not expected to convert into latin1 (for various but obvious reasons).
CC>     - test.log is the result of the test, with 2>&1.

CC>     As you can see, unicode_iconv() just bails out (and sets errno) when
CC> the string is not convertible.

CC> I'm thinking about adding a try_charconv_utf8_to_local8() function (taking
CC> all code from charconv_utf8_to_local8() until before the test on the result
CC> of unicode_iconv()), and letting it return NULL (but silently!) if the input
CC> string can't be converted to the local charset. This should allow us to detect
CC> whether the « and » characters are convertible in the current encoding.

Hmm, please look at the attached file. Japanese has characters
similar to \xab and \xbb, for example (though I don't like
them much). My opinion is that this should be left to the
translators rather than having you write more code. Other
languages may well have similar characters too, though I'm not sure.
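
For reference, the silent check proposed above could look roughly like
the sketch below. It is only an illustration built on plain POSIX
iconv(), not the code in charconv.c: the name try_utf8_to_charset() and
the explicit charset argument are made up for the example, and the
output buffer size is a generous guess rather than an exact bound.
Because some iconv implementations substitute a placeholder character
instead of failing, a non-zero return value is also treated as "not
convertible".

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the real charconv.c code): returns a malloc'ed
 * copy of `utf8` converted to `charset`, or NULL, without printing
 * anything, when the string cannot be converted. */
char *try_utf8_to_charset(const char *utf8, const char *charset)
{
    iconv_t cd = iconv_open(charset, "UTF-8");
    if (cd == (iconv_t)-1)
        return NULL;                      /* unknown charset */

    size_t inleft = strlen(utf8);
    size_t outsize = inleft * 4 + 16;     /* assumed to be large enough */
    size_t outleft = outsize;
    char *out = malloc(outsize + 1);      /* +1 for the terminating NUL */
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *inp = (char *)utf8;             /* iconv() takes a non-const pointer */
    char *outp = out;

    /* (size_t)-1 means an error such as EILSEQ; a positive value is the
     * number of non-reversible (substituted) conversions. */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc == 0)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush shift state */
    iconv_close(cd);

    if (rc != 0) {
        free(out);
        return NULL;                      /* not convertible: fail silently */
    }
    *outp = '\0';
    return out;
}
```

A caller could then try the UTF-8 bytes for « ("\xc2\xab") against the
current charset and fall back to a plain ASCII substitute when the
result is NULL. The caller has to free() a non-NULL result.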

CC> Problem: I see there's an alternate implementation of
CC> charconv_utf8_to_local8, which basically delegates to glib1.3. Is this
CC> function silent when presented with "bad" input ? Or is it safe to assume
CC> we're going to either HAVE_ICONV or HAVE_UNICODE even in the glib1.3 case
CC> and use code derived from the older implementation of charconv_utf8_to_local8 ?

CC> Now that people are talking of C++0x, I'll probably write to Mr. Sutter so that
CC> the Powers That Be (and Who Talk To The C Committee) seriously plan on adding
CC> #mess, #beware, #horrible and #hell pre-processor directives.

--
Akira TAGOH  : tagoh gnome gr jp  / Japan GNOME Users Group
at gclab org : tagoh gnome-db org / GNOME-DB Project
             : tagoh redhat com   / Red Hat, Inc.
             : tagoh debian org   / Debian Project

Attachment: test.gz
Description: Binary data


