Updated thai.c



Owen,

I updated all the comment in thai.c.
Could you check it out ?

Thanks,

Chookij V.


] Date: Tue, 14 Nov 2000 13:15:26 -0500
] From: Owen Taylor <otaylor redhat com>
] Subject: Re: Putback thai.c to CVS
] To: gtk-i18n-list gnome org
] Cc: Chookij Vanatham <chookij vanatham eng sun com>
] MIME-version: 1.0
] X-BeenThere: gtk-i18n-list gnome org
] Delivered-to: gtk-i18n-list gnome org
] User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7
] X-Mailman-Version: 2.0beta5
] List-Id: Internationalization and GTK+ <gtk-i18n-list.gnome.org>
] 
] 
] This looks good, but there are a few things I'd like to see fixed
] before it is committed to CVS.
] 
] I think it is very important that each module have a consistent
] identation style. This is for two reasons:
] 
]     - Improve legibility
]     - Prevent excessive diffs as each person editing the 
]       file reformats it to their own taste.
] 
] I'd strongly encourage using the Pango coding style for this
] (pango/docs/TEXT/coding-style) for this, but it is only mandatory for
] the Pango library and examples.
] 
] The thai.c module previously followed the Pango guidelines; if you
] need to use something else, please make the whole file use that
] consistently. Currently, there are places where within in a single
] block there are two different indentation levels...
] 
] 
] Additional detailed comments on parts of the file:
] 
]    * Copyright (C) 1999 Red Hat Software
] 
] Since you've added significant amounts of code, you should add the
] appropriate copyright notice here. ("Copyright 2000, Sun
] Microsystems", or whatever.) Adding or not adding the copyright notice
] does not change the copyright status of the code, but it makes it
] clear who the copyright holders of the code are.
] 
]   #include <iconv.h>
] [...]
]   #include <fribidi/fribidi.h>
] 
] The includes of iconv and fribidi were never necessary for thai.c,
] and actually now will cause compilation errors since I've removed the
] direct dependencies here.
] 
]  
]   #define	NoTailCons		_NC
]   #define	UpTailCons		_UC
]   #define	BotTailCons		_BC
]   #define	SpltTailCons		_SC
]   #define	Cons			
(NoTailCons|UpTailCons|BotTailCons|SpltTailCons)
]   #define AboveVowel		_AV
]   #define BelowVowel		_BV
]   #define	Tone			_TN
]   #define AboveDiac		_AD
]   #define BelowDiac		_BD
]   #define	SaraAm			_AM
] 
] It seems that these #defines are meant to be legible and the _NC-style
] short forms just meant for the table; I think it would make the code
] more legible if you used the long #defines everywhere but the table.
] 
]   /* Returns a structure with information we will use to rendering given the
]    * #PangoFont. This is computed once per font and cached for later 
retrieval.
]    */
]   static ThaiFontInfo *
]   get_font_info (PangoFont *font)
]   {
]     static const char *charsets[] = {
]       "xtis620.2529-1",
]       "xtis-0",
]       "tis620.2533-1",
]       "tis620.2529-1",
]       "iso8859-11",
]       "tis620-0",
]       "tis620-1",
]       "tis620-2",
]       "iso10646-1",
]     };
] 
] Is the ordering here still right? I would expect that tis620-1 and tis620-2
] should go at the top, since they should provide as good results as the 
] XTIS fonts, but with a more compact set of glyphs. They certainly will
] provide better results than the plain-tis fonts, so should be preferred
] to them.
] 
] (The reason I decided to do a XTIS shaper was mostly because it made a
] nice simple example, not because I really think people should have
] fully-precomposed XTIS fonts...)
] 
]         switch (num_chrs) {
]           case 1:
]             if (IsChrType(cluster[0], _BV|_BD|_AV|_AD|_TN)) {
]                 glyph_lists[0] =
]                           PANGO_X_MAKE_GLYPH (font_info->subfont, 0x7F);
]                 glyph_lists[1] =
]                           PANGO_X_MAKE_GLYPH (font_info->subfont,
]                                               ucs2tis(cluster[0]));
]                 return 2;
] 
] What do these fonts have in position 0x7f? If it is a character with negative
] advance width, I'm not sure Pango will handle this properly, but from
] the context it also seems like it might be a blank character of the width
] of the cell.
] 
]   /* Return the glyph code within the font for the given Unicode Thai 
]    * code pointer
]    */
]   get_glyph (ThaiFontInfo * font_info, gunichar wc)
] 
] The return value for this function seems to have been lost, but since
] you no longer are using it, it should just be removed.
] 
]   gboolean
]   IsWttComposible(gint cur_wc, gint nxt_wc)
] 
] This function should be static. Keeping exported symbols to a minimum
] prevents various problems.
] 
]   static char *
]   g_utf8_get_next_cluster(const char	*text,
]                           gint		length,
]                           gunichar	**cluster,
]                           gint		*num_chrs)
] 
] I think this name is less than ideal - since it is a static function it 
doesn't
] need to be namespaced. It definitely shouldn't be namespaced within
] the namespace of GLib UTF-8 manipulation functions. get_next_cluster()
] would be fine.
] 
]     if ((text == NULL) || (length == 0) ||
]         ((text) && (*text == '\0')) ) {
]         if (*num_chrs)
]             *num_chrs = 0;
]         if (cluster)
]             cluster[0] = 0;
]         return (char *)NULL;
]     }
] 
] If this check was ever hit, it would cause really bad results, since you
] are running this function within a loop
] 
]     while (p < text + length)
]       {
]         gunichar cluster[MAX_CLUSTER_CHRS];
]         gint num_chrs;
] 
]         log_cluster = p;
]         p = g_utf8_get_next_cluster(p, text + length - p, &cluster, 
&num_chrs);
]         add_cluster(font_info, glyphs, log_cluster - text, cluster, num_chrs);
]       }
] 
] So having get_next_cluster return NULL is not a good idea. However,
] the check will never be hit since a shaper is allowed to assume that
] it will be called with a non-NULL text with an accurate length. 
] 
] In general, I would tend to avoid this sort of "something went wrong,
] let me silently ignore it" check. Either you should simply assume that
] your arguments or valid, or if you check their validity you should do
] so in a way that you get useful debugging information if the check
] fails - the g_assert() and g_return_if_fail() macros are useful for
] this.
] 
]     p = text;
]     if (cluster)
]         cluster[0] = g_utf8_get_char(p);
] 
] Again, there is no point in trying to behave properly if someone calls
] you with cluster == NULL. Either assume that the arguments are right
] (after all, the function is called only once from immediately below)
] or put at the top of the function:
] 
]  g_assert (cluster != NULL);
] 
] What I expect you were copying here is that for public functions we
] generally allow the pointer passed in for out arguments to be NULL;
] however, that doesn't really apply to a static "helper" function that
] occurs only in one place.
] 
]     p = g_utf8_next_char(p);
]     do {
]       if (p >= text + length)
]           break;
]       if (cluster)
]           cluster[nChrs] = g_utf8_get_char(p);
]       if (IsWttComposible(cluster[nChrs - 1], cluster[nChrs])) {
]           nChrs++;
]           if (nChrs == 3)
]               ClusterNotFound = FALSE;
]           p = g_utf8_next_char(p);
]       } else {
]           if ((nChrs == 1) &&
]               IsChrType(cluster[nChrs - 1], Cons) &&
]               IsChrType(cluster[nChrs], SaraAm) ) {
]               nChrs = 2;
]               p = g_utf8_next_char(p);
]           } else if ((nChrs == 2) &&
]                      IsChrType(cluster[nChrs - 2], Cons) &&
]                      IsChrType(cluster[nChrs - 1], Tone) &&
]                      IsChrType(cluster[nChrs], SaraAm) ) {
]                      nChrs = 3;
]                      p = g_utf8_next_char(p);
]           }
]           ClusterNotFound = FALSE;
]       }
]     } while (ClusterNotFound);
] 
] I find the above a little confusing with the update of p and nChrs all
] over the place; can't you do something like the following?
] 
]   while (p < text + length && n_chars < 3)  
]     {
]       gunichar current = g_utf8_get_char (p);
]       
]       if (n_chars = 0 ||
] 	  IsWttComposible (cluster[n_chars - 1], current) ||
] 	  (nChrs == 1 &&
] 	   IsChrType (0, Cons) && 
] 	   IsChrType (current, SaraAm)) ||
] 	  (nChrs == 2 &&
] 	   IsChrType (cluster[0], Cons) &&
] 	   IsChrType (cluster[1], Tone) &&
] 	   IsChrType (current, SaraAm)))
] 	{
] 	  cluster[n_chars++] = current;
] 	  p = g_utf8_next_char (p);
] 	}
]       else
] 	break;
]     }
] 
] [...]
] 
]   /* Load a particular engine given the ID for the engine
]    */
]   PangoEngine *
]   MODULE_ENTRY(script_engine_load) (const char *id)
]   {
]     if (!strcmp (id, "ThaiScriptEngineLang")) {
]       return thai_engine_lang_new ();
]     } else if (!strcmp (id, "ThaiScriptEngineX")) {
]       return thai_engine_x_new ();
]     } else {
]       return NULL;
]     }
]   }
] 
] As a minor quibble, I'm not sure what the motivation was for adding
] the braces here; generally keeping non-significant changes to a
] minimum helps keep code review simpler.
] 
] 
] Anyways, the changes look sensible, though I certainly haven't
] verified the MAC/WIN shaping rules in detail. With a few cleanups as
] noted above, I don't see any problem with replacing the existing
] thai.c with it.
] 
] Regards,
]                                         Owen
] 
] _______________________________________________
] gtk-i18n-list mailing list
] gtk-i18n-list gnome org
] http://mail.gnome.org/mailman/listinfo/gtk-i18n-list
/* pANGO
 * thai.c:
 *
 * Copyright (C) 1999 Red Hat Software
 * Author: Owen Taylor <otaylor redhat com>
 *
 * Software and Language Engineering Laboratory, NECTEC
 * Author: Theppitak Karoonboonyanan <thep links nectec or th>
 *
 * Copyright (c) 1996-2000 by Sun Microsystems, Inc.
 * Author: Chookij Vanatham <Chookij Vanatham Eng Sun COM>
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Library General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
 * Library General Public License for more details.
 *
 * You should have received a copy of the GNU Library General Public
 * License along with this library; if not, write to the
 * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
 * Boston, MA 02111-1307, USA.
 */


#include <glib.h>
#include "pango.h"
#include "pangox.h"

#define ucs2tis(wc)		(unsigned int)((unsigned int)(wc) - 0x0E00 + 0xA0)
#define tis2uni(c)		((gunichar)(c) - 0xA0 + 0x0E00)

#define MAX_CLUSTER_CHRS	256
#define MAX_GLYPHS		256

/* Define TACTIS character classes */
#define CTRL			0
#define NON			1
#define CONS			2
#define LV			3
#define FV1			4
#define FV2			5
#define FV3			6
#define BV1			7
#define BV2			8
#define BD			9
#define TONE			10
#define AD1			11
#define AD2			12
#define AD3			13
#define AV1			14
#define AV2			15
#define AV3			16

#define _ND			0
#define _NC			1
#define _UC			(1<<1)
#define _BC			(1<<2)
#define _SC			(1<<3)
#define _AV			(1<<4)
#define _BV			(1<<5)
#define _TN			(1<<6)
#define _AD			(1<<7)
#define _BD			(1<<8)
#define _AM			(1<<9)

#define NoTailCons		_NC
#define UpTailCons		_UC
#define BotTailCons		_BC
#define SpltTailCons		_SC
#define Cons			(NoTailCons|UpTailCons|BotTailCons|SpltTailCons)
#define AboveVowel		_AV
#define BelowVowel		_BV
#define Tone			_TN
#define AboveDiac		_AD
#define BelowDiac		_BD
#define SaraAm			_AM

#define char_class(wc)		TAC_char_class[(unsigned int)(wc)]
#define is_char_type(wc, mask)	(char_type_table[ucs2tis ((wc))] & (mask))

/* We handle the range U+0e01 to U+0e5b exactly
 */
static PangoEngineRange thai_ranges[] = {
  { 0x0e01, 0x0e5b, "*" },  /* Thai */
};

static PangoEngineInfo script_engines[] = {
  {
    "ThaiScriptEngineLang",
    PANGO_ENGINE_TYPE_LANG,
    PANGO_RENDER_TYPE_NONE,
    thai_ranges, G_N_ELEMENTS(thai_ranges)
  },
  {
    "ThaiScriptEngineX",
    PANGO_ENGINE_TYPE_SHAPE,
    PANGO_RENDER_TYPE_X,
    thai_ranges, G_N_ELEMENTS(thai_ranges)
  }
};

/*
 * Language script engine
 */

static void 
thai_engine_break (const char     *text,
		    gint            len,
		    PangoAnalysis  *analysis,
		    PangoLogAttr   *attrs)
{
}

static PangoEngine *
thai_engine_lang_new ()
{
  PangoEngineLang *result;
  
  result = g_new (PangoEngineLang, 1);

  result->engine.id = "ThaiScriptEngine";
  result->engine.type = PANGO_ENGINE_TYPE_LANG;
  result->engine.length = sizeof (result);
  result->script_break = thai_engine_break;

  return (PangoEngine *)result;
}

/*
 * X window system script engine portion
 */

typedef struct _ThaiFontInfo ThaiFontInfo;

/* The type of encoding that we will use
 */
typedef enum {
  THAI_FONT_NONE,
  THAI_FONT_XTIS,
  THAI_FONT_TIS,
  THAI_FONT_TIS_MAC,
  THAI_FONT_TIS_WIN,
  THAI_FONT_ISO10646
} ThaiFontType;

struct _ThaiFontInfo
{
  PangoFont   *font;
  ThaiFontType type;
  PangoXSubfont subfont;
};

/* All combining marks for Thai fall in the range U+0E30-U+0E50,
 * so we confine our data tables to that range, and use
 * default values for characters outside those ranges.
 */

/* Map from code point to group used for rendering with XTIS fonts
 * (0 == base character)
 */
static const char groups[32] = {
  0, 1, 0, 0, 1, 1, 1, 1,
  1, 1, 1, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 2,
  2, 2, 2, 2, 2, 2, 1, 0
};

/* Map from code point to index within group 1
 * (0 == no combining mark from group 1)
 */   
static const char group1_map[32] = {
  0, 1, 0, 0, 2, 3, 4, 5,
  6, 7, 8, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0
};

/* Map from code point to index within group 2
 * (0 == no combining mark from group 2)
 */   
static const char group2_map[32] = {
  0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 0,
  0, 0, 0, 0, 0, 0, 0, 1,
  2, 3, 4, 5, 6, 7, 1, 0
};

static const gint char_type_table[256] = {
  /*       0,   1,   2,   3,   4,   5,   6,   7,
           8,   9,   A,   B,   C,   D,   E,   F  */

  /*00*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*10*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*20*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*30*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*40*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*50*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*60*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*70*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*80*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
  /*90*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
		
  /*A0*/ _ND, _NC, _NC, _NC, _NC, _NC, _NC, _NC,
         _NC, _NC, _NC, _NC, _NC, _SC, _BC, _BC,
  /*B0*/ _SC, _NC, _NC, _NC, _NC, _NC, _NC, _NC,
         _NC, _NC, _NC, _UC, _NC, _UC, _NC, _UC,
  /*C0*/ _NC, _NC, _NC, _NC, _ND, _NC, _ND, _NC,
         _NC, _NC, _NC, _NC, _UC, _NC, _NC, _ND,
  /*D0*/ _ND, _AV, _ND, _AM, _AV, _AV, _AV, _AV,
         _BV, _BV, _BD, _ND, _ND, _ND, _ND, _ND,
  /*E0*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _AD,
         _TN, _TN, _TN, _TN, _AD, _AD, _AD, _ND,
  /*F0*/ _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
         _ND, _ND, _ND, _ND, _ND, _ND, _ND, _ND,
};

static const gint TAC_char_class[256] = {
  /*	   0,   1,   2,   3,   4,   5,   6,   7,
           8,   9,   A,   B,   C,   D,   E,   F  */

  /*00*/ CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,
         CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,
  /*10*/ CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,
         CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,
  /*20*/  NON, NON, NON, NON, NON, NON, NON, NON,
          NON, NON, NON, NON, NON, NON, NON, NON,
  /*30*/  NON, NON, NON, NON, NON, NON, NON, NON,
          NON, NON, NON, NON, NON, NON, NON, NON,
  /*40*/  NON, NON, NON, NON, NON, NON, NON, NON,
          NON, NON, NON, NON, NON, NON, NON, NON,
  /*50*/  NON, NON, NON, NON, NON, NON, NON, NON,
          NON, NON, NON, NON, NON, NON, NON, NON,
  /*60*/  NON, NON, NON, NON, NON, NON, NON, NON,
          NON, NON, NON, NON, NON, NON, NON, NON,
  /*70*/  NON, NON, NON, NON, NON, NON, NON, NON,
          NON, NON, NON, NON, NON, NON, NON,CTRL,
  /*80*/ CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,
         CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,
  /*90*/ CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,
         CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,CTRL,
  /*A0*/  NON,CONS,CONS,CONS,CONS,CONS,CONS,CONS,
         CONS,CONS,CONS,CONS,CONS,CONS,CONS,CONS,
  /*B0*/ CONS,CONS,CONS,CONS,CONS,CONS,CONS,CONS,
         CONS,CONS,CONS,CONS,CONS,CONS,CONS,CONS,
  /*C0*/ CONS,CONS,CONS,CONS, FV3,CONS, FV3,CONS,
         CONS,CONS,CONS,CONS,CONS,CONS,CONS, NON,
  /*D0*/  FV1, AV2, FV1, FV1, AV1, AV3, AV2, AV3,
          BV1, BV2,  BD, NON, NON, NON, NON, NON,
  /*E0*/   LV,  LV,  LV,  LV,  LV, FV2, NON, AD2,
         TONE,TONE,TONE,TONE, AD1, AD1, AD3, NON,
  /*F0*/  NON, NON, NON, NON, NON, NON, NON, NON,
          NON, NON, NON, NON, NON, NON, NON,CTRL,
};

static const gchar TAC_compose_and_input_check_type_table[17][17] = {
      /* Cn */ /* 0,   1,   2,   3,   4,   5,   6,   7,
                  8,   9,   A,   B,   C,   D,   E,   F       */
  /* Cn-1 00 */	'X', 'A', 'A', 'A', 'A', 'A', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* 10 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* 20 */      'X', 'A', 'A', 'A', 'A', 'S', 'A', 'C',
                'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C',
  /* 30 */      'X', 'S', 'A', 'S', 'S', 'S', 'S', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* 40 */      'X', 'S', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* 50 */      'X', 'A', 'A', 'A', 'A', 'S', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* 60 */      'X', 'A', 'A', 'A', 'S', 'A', 'S', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* 70 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'C', 'C', 'R', 'R', 'R', 'R', 'R',
  /* 80 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'C', 'R', 'R', 'R', 'R', 'R', 'R',
  /* 90 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* A0 */      'X', 'A', 'A', 'A', 'A', 'A', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* B0 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* C0 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* D0 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R', 'R',
  /* E0 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'C', 'C', 'R', 'R', 'R', 'R', 'R',
  /* F0 */      'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'C', 'R', 'R', 'R', 'R', 'R', 'R',
                'X', 'A', 'A', 'A', 'S', 'S', 'A', 'R',
                'R', 'R', 'C', 'R', 'C', 'R', 'R', 'R', 'R'
};

typedef struct {
  gint ShiftDown_TONE_AD[8];
  gint ShiftDownLeft_TONE_AD[8];
  gint ShiftLeft_TONE_AD[8];
  gint ShiftLeft_AV[7];
  gint ShiftDown_BV_BD[3];
  gint TailCutCons[4];
} ThaiShapeTable;

#define shiftdown_tone_ad(c,tbl)      ((tbl)->ShiftDown_TONE_AD[(c)-0xE7])
#define shiftdownleft_tone_ad(c,tbl)  ((tbl)->ShiftDownLeft_TONE_AD[(c)-0xE7])
#define shiftleft_tone_ad(c,tbl)      ((tbl)->ShiftLeft_TONE_AD[(c)-0xE7])
#define shiftleft_av(c,tbl)           ((tbl)->ShiftLeft_AV[(c)-0xD1])
#define shiftdown_bv_bd(c,tbl)        ((tbl)->ShiftDown_BV_BD[(c)-0xD8])
#define tailcutcons(c,tbl)            ((tbl)->TailCutCons[(c)-0xAD])

/* Macintosh
 */
static const ThaiShapeTable Mac_shape_table = {
  { 0xE7, 0x88, 0x89, 0x8A, 0x8B, 0x8C, 0xED, 0xEE },
  { 0xE7, 0x83, 0x84, 0x85, 0x86, 0x87, 0x8F, 0xEE },
  { 0x93, 0x98, 0x99, 0x9A, 0x9B, 0x9C, 0x8F, 0xEE },
  { 0x92, 0x00, 0x00, 0x94, 0x95, 0x96, 0x97 },
  { 0xD8, 0xD9, 0xDA },
  { 0xAD, 0x00, 0x00, 0xB0 }
};

/* Microsoft Window
 */
static const ThaiShapeTable Win_shape_table = {
    { 0xE7, 0x8B, 0x8C, 0x8D, 0x8E, 0x8F, 0xED, 0xEE },
    { 0xE7, 0x86, 0x87, 0x88, 0x89, 0x8A, 0x99, 0xEE },
    { 0x9A, 0x9B, 0x9C, 0x9D, 0x9E, 0x9F, 0x99, 0xEE },
    { 0x98, 0x00, 0x00, 0x81, 0x82, 0x83, 0x84 },
    { 0xFC, 0xFD, 0xFE },
    { 0x90, 0x00, 0x00, 0x80 }
};

/* Returns a structure with information we will use to rendering given the
 * #PangoFont. This is computed once per font and cached for later retrieval.
 */
static ThaiFontInfo *
get_font_info (PangoFont *font)
{
  static const char *charsets[] = {
    "tis620-2",
    "tis620-1",
    "tis620-0",
    "xtis620.2529-1",
    "xtis-0",
    "tis620.2533-1",
    "tis620.2529-1",
    "iso8859-11",
    "iso10646-1",
  };

  static const int charset_types[] = {
    THAI_FONT_TIS_WIN,
    THAI_FONT_TIS_MAC,
    THAI_FONT_TIS,
    THAI_FONT_XTIS,
    THAI_FONT_XTIS,
    THAI_FONT_TIS,
    THAI_FONT_TIS,
    THAI_FONT_TIS,
    THAI_FONT_ISO10646
  };
  
  ThaiFontInfo *font_info;
  GQuark info_id = g_quark_from_string ("thai-font-info");
  
  font_info = g_object_get_qdata (G_OBJECT (font), info_id);

  if (!font_info)
    {
      /* No cached information not found, so we need to compute it
       * from scratch
       */
      PangoXSubfont *subfont_ids;
      gint *subfont_charsets;
      gint n_subfonts, i;

      font_info = g_new (ThaiFontInfo, 1);
      font_info->font = font;
      font_info->type = THAI_FONT_NONE;
      
      g_object_set_qdata_full (G_OBJECT (font), info_id, font_info, (GDestroyNotify)g_free);
      
      n_subfonts = pango_x_list_subfonts (font, (char **)charsets, G_N_ELEMENTS (charsets),
					  &subfont_ids, &subfont_charsets);

      for (i=0; i < n_subfonts; i++)
	{
	  ThaiFontType font_type = charset_types[subfont_charsets[i]];
	  
	  if (font_type != THAI_FONT_ISO10646 ||
	      pango_x_has_glyph (font, PANGO_X_MAKE_GLYPH (subfont_ids[i], 0xe01)))
	    {
	      font_info->type = font_type;
	      font_info->subfont = subfont_ids[i];
	      
	      break;
	    }
	}

      g_free (subfont_ids);
      g_free (subfont_charsets);
    }

  return font_info;
}

static void
add_glyph (ThaiFontInfo     *font_info, 
	   PangoGlyphString *glyphs, 
	   gint              cluster_start, 
	   PangoGlyph        glyph,
	   gboolean          combining)
{
  PangoRectangle ink_rect, logical_rect;
  gint index = glyphs->num_glyphs;

  pango_glyph_string_set_size (glyphs, index + 1);
  
  glyphs->glyphs[index].glyph = glyph;
  glyphs->glyphs[index].attr.is_cluster_start = combining ? 0 : 1;
  
  glyphs->log_clusters[index] = cluster_start;

  pango_font_get_glyph_extents (font_info->font,
				glyphs->glyphs[index].glyph, &ink_rect, &logical_rect);

  if (combining)
    {
      if (font_info->type == THAI_FONT_TIS ||
	  font_info->type == THAI_FONT_TIS_MAC ||
	  font_info->type == THAI_FONT_TIS_WIN)
	{
          glyphs->glyphs[index].geometry.width =
		logical_rect.width + glyphs->glyphs[index - 1].geometry.width;
          if (logical_rect.width > 0)
	      glyphs->glyphs[index].geometry.x_offset = glyphs->glyphs[index - 1].geometry.width;
          else
	      glyphs->glyphs[index].geometry.x_offset = glyphs->glyphs[index].geometry.width;
	  glyphs->glyphs[index - 1].geometry.width = 0;
	}
      else
        {
	  glyphs->glyphs[index].geometry.width =
		MAX (logical_rect.width, glyphs->glyphs[index - 1].geometry.width);
	  glyphs->glyphs[index - 1].geometry.width = 0;
	  glyphs->glyphs[index].geometry.x_offset = 0;
        }
    }
  else
    {
      glyphs->glyphs[index].geometry.x_offset = 0;
      glyphs->glyphs[index].geometry.width = logical_rect.width;
    }
  
  glyphs->glyphs[index].geometry.y_offset = 0;
}

static gint
get_adjusted_glyphs_list (ThaiFontInfo *font_info,
		                  gunichar *cluster,
		                  gint num_chrs,
		                  PangoGlyph **glyph_lists,
				  const ThaiShapeTable *shaping_table)
{
  switch (num_chrs)
    {
      case 1:
        if (is_char_type (cluster[0], BelowVowel|BelowDiac|AboveVowel|AboveDiac|Tone))
	  {
            glyph_lists[0] = PANGO_X_MAKE_GLYPH (font_info->subfont, 0x7F);
            glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
            return 2;
          }
	else
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
            return 1;
          }
        break;
        
      case 2:
        if (is_char_type (cluster[0], NoTailCons|BotTailCons|SpltTailCons) &&
            is_char_type (cluster[1], SaraAm))
	  {
            glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
            glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont, 0xED);
            glyph_lists[2] = PANGO_X_MAKE_GLYPH (font_info->subfont, 0xD2);
            return 3;
          }
	else if (is_char_type (cluster[0], UpTailCons) &&
          	 is_char_type (cluster[1], SaraAm))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
            glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont,
					shiftleft_tone_ad (0xED, shaping_table));
            glyph_lists[2] = PANGO_X_MAKE_GLYPH (font_info->subfont, 0xD2);
            return 3;
          }
	else if (is_char_type (cluster[0], NoTailCons|BotTailCons|SpltTailCons) &&
          	 is_char_type (cluster[1], AboveVowel))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
	    return 2;
          }
	else if (is_char_type (cluster[0], NoTailCons|BotTailCons|SpltTailCons) &&
          	 is_char_type (cluster[1], AboveDiac|Tone))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont,
		    	shiftdown_tone_ad (ucs2tis (cluster[1]), shaping_table));
	    return 2;
	  }
	else if (is_char_type (cluster[0], UpTailCons) &&
          	 is_char_type (cluster[1], AboveVowel))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont,
		    	shiftleft_av (ucs2tis (cluster[1]), shaping_table));
	    return 2;
          }
	else if (is_char_type (cluster[0], UpTailCons) &&
          	 is_char_type (cluster[1], AboveDiac|Tone))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont,
			shiftdownleft_tone_ad (ucs2tis (cluster[1]), shaping_table));
	    return 2;
	  }
	else if (is_char_type (cluster[0], NoTailCons|UpTailCons) &&
          	 is_char_type (cluster[1], BelowVowel|BelowDiac))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
	    return 2;
          }
	else if (is_char_type (cluster[0], BotTailCons) &&
		 is_char_type (cluster[1], BelowVowel|BelowDiac))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont,
			shiftdown_bv_bd (ucs2tis (cluster[1]), shaping_table));
	    return 2;
	  }
	else if (is_char_type (cluster[0], SpltTailCons) &&
          	 is_char_type (cluster[1], BelowVowel|BelowDiac))
	  {
	    glyph_lists[0] = PANGO_X_MAKE_GLYPH (font_info->subfont,
		    		tailcutcons (ucs2tis (cluster[0]), shaping_table));
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
	    return 2;
	  }
	else
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, 0x7F);
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
	    glyph_lists[2] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[2]));
	    return 3;
	  }
        break;
          
      case 3:
        if (is_char_type (cluster[0], NoTailCons|BotTailCons|SpltTailCons) &&
            is_char_type (cluster[1], Tone) &&
            is_char_type (cluster[2], SaraAm))
	  {
            glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
            glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont, 0xED);
            glyph_lists[2] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
            glyph_lists[3] = PANGO_X_MAKE_GLYPH (font_info->subfont, 0xD2);
            return 4;
          }
	else if (is_char_type (cluster[0], UpTailCons) &&
		 is_char_type (cluster[1], Tone) &&
		 is_char_type (cluster[2], SaraAm))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont,
				shiftleft_tone_ad (0xED, shaping_table));
	    glyph_lists[2] = PANGO_X_MAKE_GLYPH (font_info->subfont,
		    		shiftleft_tone_ad (ucs2tis (cluster[1]), shaping_table));
	    glyph_lists[3] = PANGO_X_MAKE_GLYPH (font_info->subfont, 0xD2);
	    return 4;
	  }
	else if (is_char_type (cluster[0], UpTailCons) &&
		 is_char_type (cluster[1], AboveVowel) &&
		 is_char_type (cluster[2], AboveDiac|Tone))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont,
				shiftleft_av (ucs2tis (cluster[1]), shaping_table));
	    glyph_lists[2] = PANGO_X_MAKE_GLYPH (font_info->subfont,
		    		shiftleft_tone_ad (ucs2tis (cluster[2]), shaping_table));
	    return 3;
	  }
	else if (is_char_type (cluster[0], UpTailCons) &&
		 is_char_type (cluster[1], BelowVowel) &&
		 is_char_type (cluster[2], AboveDiac|Tone))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
	    glyph_lists[2] = PANGO_X_MAKE_GLYPH (font_info->subfont,
			shiftdownleft_tone_ad (ucs2tis (cluster[2]), shaping_table));
	    return 3;
	  }
	else if (is_char_type (cluster[0], NoTailCons) &&
		 is_char_type (cluster[1], BelowVowel) &&
		 is_char_type (cluster[2], AboveDiac|Tone))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
	    glyph_lists[2] =
		PANGO_X_MAKE_GLYPH (font_info->subfont,
			shiftdown_tone_ad (ucs2tis (cluster[2]), shaping_table));
	    return 3;
	  }
	else if (is_char_type (cluster[0], SpltTailCons) &&
		 is_char_type (cluster[1], BelowVowel) &&
		 is_char_type (cluster[2], AboveDiac|Tone))
	  {
	    glyph_lists[0] = PANGO_X_MAKE_GLYPH (font_info->subfont,
		    		tailcutcons (ucs2tis (cluster[0]), shaping_table));
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
	    glyph_lists[2] = PANGO_X_MAKE_GLYPH (font_info->subfont,
		    		shiftdown_tone_ad (ucs2tis (cluster[2]), shaping_table));
	    return 3;
	  }
	else if (is_char_type (cluster[0], BotTailCons) &&
		 is_char_type (cluster[1], BelowVowel) &&
		 is_char_type (cluster[2], AboveDiac|Tone))
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] = PANGO_X_MAKE_GLYPH (font_info->subfont,
		    		shiftdown_bv_bd (ucs2tis (cluster[1]), shaping_table));
	    glyph_lists[2] = PANGO_X_MAKE_GLYPH (font_info->subfont,
				shiftdown_tone_ad (ucs2tis (cluster[2]), shaping_table));
	    return 3;
	  }
	else
	  {
	    glyph_lists[0] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[0]));
	    glyph_lists[1] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[1]));
	    glyph_lists[2] =
		PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[2]));
	    return 3;
	  }
      break;
    }

    return 0;
}

static gint
get_glyphs_list (ThaiFontInfo	*font_info,
		 gunichar	*cluster,
		 gint		num_chrs,
		 PangoGlyph	**glyph_lists)
{
  PangoGlyph glyph;
  gint xtis_index;
  gint i;

  switch (font_info->type)
    {
      case THAI_FONT_NONE:
        for (i=0; i < num_chrs; i++)
	  glyph_lists[i] = pango_x_get_unknown_glyph (font_info->font);
        return num_chrs;

      case THAI_FONT_XTIS:
        /* If we are rendering with an XTIS font, we try to find a precomposed
         * glyph for the cluster.
         */
        xtis_index = 0x100 * (cluster[0] - 0xe00 + 0x20) + 0x30;
        if (cluster[1])
	    xtis_index +=8 * group1_map[cluster[1] - 0xe30];
        if (cluster[2])
	    xtis_index += group2_map[cluster[2] - 0xe30];
        glyph = PANGO_X_MAKE_GLYPH (font_info->subfont, xtis_index);
        if (pango_x_has_glyph (font_info->font, glyph)) {
            glyph_lists[0] = glyph;
            return 1;
        }
        for (i=0; i < num_chrs; i++)
	  glyph_lists[i] =
	      PANGO_X_MAKE_GLYPH (font_info->subfont,
					0x100 * (cluster[i] - 0xe00 + 0x20) + 0x30);
        return num_chrs;
      
      case THAI_FONT_TIS:
        for (i=0; i < num_chrs; i++)
          glyph_lists[i] =
              PANGO_X_MAKE_GLYPH (font_info->subfont, ucs2tis (cluster[i]));
        return num_chrs;
      
      case THAI_FONT_TIS_MAC:
	/* MacIntosh Extension
	 */
        return get_adjusted_glyphs_list (font_info, cluster,
			num_chrs, glyph_lists, &Mac_shape_table);
      
      case THAI_FONT_TIS_WIN:
	/* Microsoft Extension
	 */
        return get_adjusted_glyphs_list (font_info, cluster,
			num_chrs, glyph_lists, &Win_shape_table);
      
      case THAI_FONT_ISO10646:
        for (i=0; i < num_chrs; i++)
          glyph_lists[i] = PANGO_X_MAKE_GLYPH (font_info->subfont, cluster[i]);
        return num_chrs;
    }

  return 0;			/* Quiet GCC */
}

static void
add_cluster (ThaiFontInfo	*font_info,
	     PangoGlyphString	*glyphs,
	     gint		cluster_start,
	     gunichar		*cluster,
	     gint		num_chrs)
	     
{
  PangoGlyph glyphs_list[MAX_GLYPHS];
  gint num_glyphs;
  gint i;
  
  num_glyphs = get_glyphs_list(font_info, cluster, num_chrs, &glyphs_list);
  for (i=0; i<num_glyphs; i++)
       add_glyph (font_info, glyphs, cluster_start, glyphs_list[i],
	    		i == 0 ? FALSE : TRUE);
}

static gboolean
is_wtt_composible (gunichar cur_wc, gunichar nxt_wc)
{
  switch (TAC_compose_and_input_check_type_table[char_class (ucs2tis (cur_wc))]
						[char_class (ucs2tis (nxt_wc))])
    {
      case 'A':
      case 'S':
      case 'R':
      case 'X':
        return FALSE;

      case 'C':
        return TRUE;
    }
}

static const char *
get_next_cluster(const char	*text,
		 gint		length,
		 gunichar	**cluster,
		 gint		*num_chrs)
{  
  const char *p;
  gint n_chars = 0;
  
  p = text;
  while (p < text + length && n_chars < 3)  
    {
      gunichar current = g_utf8_get_char (p);
      
      if (n_chars == 0 ||
	  is_wtt_composible ((gunichar)(cluster[n_chars - 1]), current) ||
	  (n_chars == 1 &&
	   is_char_type (cluster[0], Cons) && 
	   is_char_type (current, SaraAm)) ||
	  (n_chars == 2 &&
	   is_char_type (cluster[0], Cons) &&
	   is_char_type (cluster[1], Tone) &&
	   is_char_type (current, SaraAm)))
	{
	  cluster[n_chars++] = current;
	  p = g_utf8_next_char (p);
	}
      else
	break;
    }

  *num_chrs = n_chars;
  return p;
}

static void 
thai_engine_shape (PangoFont        *font,
		   const char       *text,
		   gint              length,
		   PangoAnalysis    *analysis,
		   PangoGlyphString *glyphs)
{
  ThaiFontInfo *font_info;
  const char *p;
  const char *log_cluster;
  gunichar cluster[MAX_CLUSTER_CHRS];
  gint num_chrs;

  pango_glyph_string_set_size (glyphs, 0);

  font_info = get_font_info (font);

  p = text;
  while (p < text + length)
    {
	log_cluster = p;
	p = get_next_cluster (p, text + length - p, &cluster, &num_chrs);
	add_cluster (font_info, glyphs, log_cluster - text, cluster, num_chrs);
    }
}

static PangoCoverage *
thai_engine_get_coverage (PangoFont  *font,
			   const char *lang)
{
  PangoCoverage *result = pango_coverage_new ();
  
  ThaiFontInfo *font_info = get_font_info (font);
  
  if (font_info->type != THAI_FONT_NONE)
    {
      gunichar wc;
      
      for (wc = 0xe01; wc <= 0xe3a; wc++)
	pango_coverage_set (result, wc, PANGO_COVERAGE_EXACT);
      for (wc = 0xe3f; wc <= 0xe5b; wc++)
	pango_coverage_set (result, wc, PANGO_COVERAGE_EXACT);
    }

  return result;
}

static PangoEngine *
thai_engine_x_new ()
{
  PangoEngineShape *result;
  
  result = g_new (PangoEngineShape, 1);

  result->engine.id = "ThaiScriptEngine";
  result->engine.type = PANGO_ENGINE_TYPE_SHAPE;
  result->engine.length = sizeof (result);
  result->script_shape = thai_engine_shape;
  result->get_coverage = thai_engine_get_coverage;

  return (PangoEngine *)result;
}

/* The following three functions provide the public module API for
 * Pango. If we are compiling it is a module, then we name the
 * entry points script_engine_list, etc. But if we are compiling
 * it for inclusion directly in Pango, then we need them to
 * to have distinct names for this module, so we prepend
 * _pango_thai_
 */
#ifdef MODULE_PREFIX
#define MODULE_ENTRY(func) _pango_thai_##func
#else
#define MODULE_ENTRY(func) func
#endif

/* List the engines contained within this module
 */
void 
MODULE_ENTRY(script_engine_list) (PangoEngineInfo **engines, gint *n_engines)
{
  *engines = script_engines;
  *n_engines = G_N_ELEMENTS (script_engines);
}

/* Load a particular engine given the ID for the engine
 */
PangoEngine *
MODULE_ENTRY(script_engine_load) (const char *id)
{
  if (!strcmp (id, "ThaiScriptEngineLang"))
    return thai_engine_lang_new ();
  else if (!strcmp (id, "ThaiScriptEngineX"))
    return thai_engine_x_new ();
  else
    return NULL;
}

void 
MODULE_ENTRY(script_engine_unload) (PangoEngine *engine)
{
}



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]