On Tue, 29 Jan 2002 21:41:08 +0100, "CC" == Cyrille Chepelov <cyrille chepelov org> wrote:

CC> OK, here are the results:
CC>  - test.c is basically a stripped down, hardcoded-to-latin1 version
CC> of charconv.c (it's encoded in utf-8. I hope the test files you sent me
CC> weren't swear words <grin/> They looked definitely Japanese in my emacs21.)
CC> There are four strings: one latin1 (expected to convert), and three which
CC> are not expected to convert into latin1 (for various but obvious reasons).
CC>  - test.log is the result of the test, with 2>&1.

CC> As you can see, unicode_iconv() just bails out (and sets errno) when
CC> the string is not convertible.

CC> I'm thinking about adding a try_charconv_utf8_to_local8() function (taking
CC> all code from charconv_utf8_to_local8() until before the test on the result
CC> of unicode_iconv()), and letting it return NULL (but silently !) if the input
CC> string can't be converted to local charset. This should make it possible to
CC> detect whether the « and » characters are convertible in the current encoding.

Hmm, please look at the attached file. Japanese has characters similar to
\xab and \xbb, for example (though I don't like them much). My opinion is
that this should be left to the translators rather than handled by writing
more code. I'm not sure whether other languages have similar characters too.

CC> Problem: I see there's an alternate implementation of
CC> charconv_utf8_to_local8, which basically delegates to glib1.3. Is this
CC> function silent when presented with "bad" input ? Or is it safe to assume
CC> we're going to either HAVE_ICONV or HAVE_UNICODE even in the glib1.3 case
CC> and use code derived from the older implementation of charconv_utf8_to_local8 ?

CC> Now people are talking of C++0x, I'll probably write to Mr. Sutter so that
CC> the Powers That Be (and Who Talk To The C Committee) seriously plan on adding
CC> #mess, #beware, #horrible and #hell pre-processor directives.

--
Akira TAGOH  : tagoh gnome gr jp  / Japan GNOME Users Group
at gclab org : tagoh gnome-db org / GNOME-DB Project
             : tagoh redhat com   / Red Hat, Inc.
             : tagoh debian org   / Debian Project
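
For reference, the "try, but fail silently" conversion discussed in the quoted text could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses POSIX iconv(3) directly instead of the unicode_iconv() wrapper in charconv.c, and the name try_utf8_to_charset() is hypothetical, not an existing function in the code base.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of charconv.c): convert a UTF-8 string to
 * `charset`. Returns a malloc()ed, NUL-terminated result, or NULL if the
 * string cannot be represented exactly. Nothing is printed on failure. */
char *try_utf8_to_charset(const char *utf8, const char *charset)
{
    assert(utf8 != NULL && charset != NULL);

    iconv_t cd = iconv_open(charset, "UTF-8");
    if (cd == (iconv_t)-1)
        return NULL;                    /* conversion pair not available */

    size_t inleft = strlen(utf8);
    size_t outsize = inleft * 4 + 16;   /* generous bound, room for escapes */
    char *out = malloc(outsize + 1);    /* +1 for the terminating NUL */
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *inp = (char *)utf8;           /* iconv() does not modify the input */
    char *outp = out;
    size_t outleft = outsize;

    /* A non-zero result means either an error ((size_t)-1, e.g. EILSEQ) or
     * that some characters were replaced by non-reversible substitutes. */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc == 0)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush shift state */
    iconv_close(cd);

    if (rc != 0) {
        free(out);
        return NULL;                    /* silent failure: caller decides */
    }
    *outp = '\0';
    return out;
}
```

A caller could then test whether try_utf8_to_charset("\xc2\xab", "ISO-8859-1") returns non-NULL before using « in a message, and fall back to plain ASCII quotes otherwise. The returned buffer has to be released with free().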
Attachment:
test.gz
Description: Binary data