Re: [gnumeric-list][PATCH to gnumeric] handle unicode and codepages for Import/Export of Excel files



 Hi, 

 Sorry for replying this way - I killed the message I'm replying to in my
mailbox accidently.

On Mon, Mar 12, 2001 at 12:26:13PM +0400, gnumeric-list-admin gnome org wrote:

 It would be nice if this patch was included in Gnumeric that will be shipped
with gnome-1.4.

I can not include it in the next gnumeric release 0.64.  If there is
another release before 1.4 then this might go in, but it is too
significant a change to make this late.  The majority of the
utilities (modulo minor nit-picks below) look good.  However, I want
to look over the changes to the string reading routine very
carefully.  That is too central a routine to mess with lightly and
the trouble we went through getting the SST reading correct are not
things to take lightly.

 OK, I understand your concerns. If all that you are unsure about is reading
sheet names, then let's leave the old code there (just replace "#if 0" with
"#if 1"). If you are unsure about the whole string reading routine - I tested
it carefully - everything works fine with huge .xls files, reading and
writing, any format. If conversion to local charset using iconv fails,
original text is used, so it's bulletproof IMO and shouldn't break anything.
 
Sorry.

Please supply a ChangeLog entry for your patches.

 Here are them:
For printing patch:

*  Make printing working correctly under any locale, not only under latin1
ones - patch from  Vlad Harchev <hvv hippo ru>

For themes patch (note that in 1st window, sheet names on tabs and text in
status line is still drawn in black - it seem they are drawn before
initialization or while some gtk style is pushed - subsequent windows - like
created by View->New unshared -  use proper colors):
* Use current gtk theme's text color instead of always black for drawing row
 and column headers, sheet names on tabs and in status line -  patch from
 Vlad Harchev <hvv hippo ru>

For Excel IO patch:
* Convert all strings to/from locale's charset when reading/writing .xls
files - patch from  Vlad Harchev <hvv hippo ru>

 
@@ -587,12 +594,23 @@
              ans->hidden = MS_BIFF_H_VISIBLE;
              break;
      }
+#if 0
      if (ver == MS_BIFF_V8) {
-             int slen = MS_OLE_GET_GUINT16 (q->data + 6);
+             int slen = MS_OLE_GET_GUINT16 (q->data + 6);
              ans->name = biff_get_text (q->data + 8, slen, NULL);
-     } else {
+     } else
+#endif
+     {
+             /*
+              * there are test files produced by non-latin1 Excel (e.g.
+              * russian version) that prove that branch above is
+              * incorrect. It seems test files that insured author of branch
+              * above were produced by latin1 version of Excel -
+              * in that case q->data[7] is always 0, so it can be attributed
+              * to length of sheet name or to the string header.
+              *                      - Vlad Harchev <hvv hippo ru>
+              */
              int slen = MS_OLE_GET_GUINT8 (q->data + 6);
-
              ans->name = biff_get_text (q->data + 7, slen, NULL);
      }
Please forward copies of these workbooks to me for inclusion in the
collection of test files.

 I will mail it to you privately. Though ANY .xls file produced by non-latin1
version of Excel97 or Excel 2k can be a sample. And also I should note
that excel 2000 doesn't allow sheet names longer than approximately 30
characters - yet another point that my changes are correct.

diff -ru -x po gnumeric-0.63~/plugins/excel/ms-excel-util.c gnumeric-0.63/plu
gins/excel/ms-excel-util.c
--- gnumeric-0.63~/plugins/excel/ms-excel-util.c      Tue Oct 31 20:21:05 200
0
+++ gnumeric-0.63/plugins/excel/ms-excel-util.c       Mon Mar 12 10:40:20 200
1

+/* comment out this if you don't have iconv available */
+#define HAVE_ICONV
+#ifdef HAVE_ICONV
+#include <iconv.h>
+#endif
 extern int ms_excel_read_debug;
This should be done via a configure.in test.

 OK, I'll try to add it this way. I plan to define HAVE_ICONV_H if
<iconv.h> exists - I hope this check will be sufficient.

+static char*
+get_locale_charset_name()
This routine does not belong in the XL plugin.  We can put it in
gnumeric or even better in GAL.

 glib-1.3 and higher already has such function - 

/* Returns TRUE if current locale uses UTF-8 charset.  If CHARSET is
 * not null, sets *CHARSET to the name of the current locale's
 * charset.  This value is statically allocated.
 */
gboolean g_get_charset (char **charset);

 Also current libunicode has such function too (it's stolen from glib-1.3) -
so I think this function can be moved only around gnumeric, but not
outside of it.

[...]
+     else {
+             char* locale = setlocale(LC_CTYPE,NULL);
+             char* lang_sep = strchr(locale,'_');
+             if (lang_sep)
+                     lang = g_strndup(locale,lang_sep-locale);
looks like a leak
+             else
+                     lang = locale;
+     }
+     lang = g_strdup(lang);
looks like another leak

 This code is executed only once - may be I just need to put a note 
near these calls so leak-hunters won't be bothered by them?

+     /*now search for that language in 'cyr_locales'*/
Please correct comment

 OK, will do.

 Best regards,
  -Vlad

My sig: I want to be hired by Ximian or Eazel and work on Gnome. Really.






[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]