Re: Putback thai.c to CVS
- From: Chookij Vanatham <chookij vanatham eng sun com>
- To: gtk-i18n-list gnome org, otaylor redhat com
- Cc: chookij vanatham eng sun com
- Subject: Re: Putback thai.c to CVS
- Date: Tue, 14 Nov 2000 12:13:21 -0800 (PST)
Hi Owen,
Thanks for the review.
I'll update the fix.
Regarding the main question about 0x7F glyph, it's the black box glyph which
defines from Thai microsoft window for the case of illegal sequence.
It's zero-width and negative advance witdth. Then, in Thai MS, what they
want to display is just displaying this black box glyph overwrite or on top
of the previous cluster (cell).
We don't think this is the good idea to display the Black box like that,
so, we change the glyph to be like "blank or space" with positive advance width
and non-zero width. The way, the display will look better.
Not quite sure about what your concern is. Let me know more detail and I'll
check it out.
Thanks,
Chookij V.
] What do these fonts have in position 0x7f? If it is a character with negative
] advance width, I'm not sure Pango will handle this properly, but from
] the context it also seems like it might be a blank character of the width
] of the cell.
] Date: Tue, 14 Nov 2000 13:15:26 -0500
] From: Owen Taylor <otaylor redhat com>
] Subject: Re: Putback thai.c to CVS
] To: gtk-i18n-list gnome org
] Cc: Chookij Vanatham <chookij vanatham eng sun com>
] MIME-version: 1.0
] User-Agent: Gnus/5.0807 (Gnus v5.8.7) Emacs/20.7
]
]
] This looks good, but there are a few things I'd like to see fixed
] before it is committed to CVS.
]
] I think it is very important that each module have a consistent
] identation style. This is for two reasons:
]
] - Improve legibility
] - Prevent excessive diffs as each person editing the
] file reformats it to their own taste.
]
] I'd strongly encourage using the Pango coding style for this
] (pango/docs/TEXT/coding-style) for this, but it is only mandatory for
] the Pango library and examples.
]
] The thai.c module previously followed the Pango guidelines; if you
] need to use something else, please make the whole file use that
] consistently. Currently, there are places where within in a single
] block there are two different indentation levels...
]
]
] Additional detailed comments on parts of the file:
]
] * Copyright (C) 1999 Red Hat Software
]
] Since you've added significant amounts of code, you should add the
] appropriate copyright notice here. ("Copyright 2000, Sun
] Microsystems", or whatever.) Adding or not adding the copyright notice
] does not change the copyright status of the code, but it makes it
] clear who the copyright holders of the code are.
]
] #include <iconv.h>
] [...]
] #include <fribidi/fribidi.h>
]
] The includes of iconv and fribidi were never necessary for thai.c,
] and actually now will cause compilation errors since I've removed the
] direct dependencies here.
]
]
] #define NoTailCons _NC
] #define UpTailCons _UC
] #define BotTailCons _BC
] #define SpltTailCons _SC
] #define Cons
(NoTailCons|UpTailCons|BotTailCons|SpltTailCons)
] #define AboveVowel _AV
] #define BelowVowel _BV
] #define Tone _TN
] #define AboveDiac _AD
] #define BelowDiac _BD
] #define SaraAm _AM
]
] It seems that these #defines are meant to be legible and the _NC-style
] short forms just meant for the table; I think it would make the code
] more legible if you used the long #defines everywhere but the table.
]
] /* Returns a structure with information we will use to rendering given the
] * #PangoFont. This is computed once per font and cached for later
retrieval.
] */
] static ThaiFontInfo *
] get_font_info (PangoFont *font)
] {
] static const char *charsets[] = {
] "xtis620.2529-1",
] "xtis-0",
] "tis620.2533-1",
] "tis620.2529-1",
] "iso8859-11",
] "tis620-0",
] "tis620-1",
] "tis620-2",
] "iso10646-1",
] };
]
] Is the ordering here still right? I would expect that tis620-1 and tis620-2
] should go at the top, since they should provide as good results as the
] XTIS fonts, but with a more compact set of glyphs. They certainly will
] provide better results than the plain-tis fonts, so should be preferred
] to them.
]
] (The reason I decided to do a XTIS shaper was mostly because it made a
] nice simple example, not because I really think people should have
] fully-precomposed XTIS fonts...)
]
] switch (num_chrs) {
] case 1:
] if (IsChrType(cluster[0], _BV|_BD|_AV|_AD|_TN)) {
] glyph_lists[0] =
] PANGO_X_MAKE_GLYPH (font_info->subfont, 0x7F);
] glyph_lists[1] =
] PANGO_X_MAKE_GLYPH (font_info->subfont,
] ucs2tis(cluster[0]));
] return 2;
]
] What do these fonts have in position 0x7f? If it is a character with negative
] advance width, I'm not sure Pango will handle this properly, but from
] the context it also seems like it might be a blank character of the width
] of the cell.
]
] /* Return the glyph code within the font for the given Unicode Thai
] * code pointer
] */
] get_glyph (ThaiFontInfo * font_info, gunichar wc)
]
] The return value for this function seems to have been lost, but since
] you no longer are using it, it should just be removed.
]
] gboolean
] IsWttComposible(gint cur_wc, gint nxt_wc)
]
] This function should be static. Keeping exported symbols to a minimum
] prevents various problems.
]
] static char *
] g_utf8_get_next_cluster(const char *text,
] gint length,
] gunichar **cluster,
] gint *num_chrs)
]
] I think this name is less than ideal - since it is a static function it
doesn't
] need to be namespaced. It definitely shouldn't be namespaced within
] the namespace of GLib UTF-8 manipulation functions. get_next_cluster()
] would be fine.
]
] if ((text == NULL) || (length == 0) ||
] ((text) && (*text == '\0')) ) {
] if (*num_chrs)
] *num_chrs = 0;
] if (cluster)
] cluster[0] = 0;
] return (char *)NULL;
] }
]
] If this check was ever hit, it would cause really bad results, since you
] are running this function within a loop
]
] while (p < text + length)
] {
] gunichar cluster[MAX_CLUSTER_CHRS];
] gint num_chrs;
]
] log_cluster = p;
] p = g_utf8_get_next_cluster(p, text + length - p, &cluster,
&num_chrs);
] add_cluster(font_info, glyphs, log_cluster - text, cluster, num_chrs);
] }
]
] So having get_next_cluster return NULL is not a good idea. However,
] the check will never be hit since a shaper is allowed to assume that
] it will be called with a non-NULL text with an accurate length.
]
] In general, I would tend to avoid this sort of "something went wrong,
] let me silently ignore it" check. Either you should simply assume that
] your arguments or valid, or if you check their validity you should do
] so in a way that you get useful debugging information if the check
] fails - the g_assert() and g_return_if_fail() macros are useful for
] this.
]
] p = text;
] if (cluster)
] cluster[0] = g_utf8_get_char(p);
]
] Again, there is no point in trying to behave properly if someone calls
] you with cluster == NULL. Either assume that the arguments are right
] (after all, the function is called only once from immediately below)
] or put at the top of the function:
]
] g_assert (cluster != NULL);
]
] What I expect you were copying here is that for public functions we
] generally allow the pointer passed in for out arguments to be NULL;
] however, that doesn't really apply to a static "helper" function that
] occurs only in one place.
]
] p = g_utf8_next_char(p);
] do {
] if (p >= text + length)
] break;
] if (cluster)
] cluster[nChrs] = g_utf8_get_char(p);
] if (IsWttComposible(cluster[nChrs - 1], cluster[nChrs])) {
] nChrs++;
] if (nChrs == 3)
] ClusterNotFound = FALSE;
] p = g_utf8_next_char(p);
] } else {
] if ((nChrs == 1) &&
] IsChrType(cluster[nChrs - 1], Cons) &&
] IsChrType(cluster[nChrs], SaraAm) ) {
] nChrs = 2;
] p = g_utf8_next_char(p);
] } else if ((nChrs == 2) &&
] IsChrType(cluster[nChrs - 2], Cons) &&
] IsChrType(cluster[nChrs - 1], Tone) &&
] IsChrType(cluster[nChrs], SaraAm) ) {
] nChrs = 3;
] p = g_utf8_next_char(p);
] }
] ClusterNotFound = FALSE;
] }
] } while (ClusterNotFound);
]
] I find the above a little confusing with the update of p and nChrs all
] over the place; can't you do something like the following?
]
] while (p < text + length && n_chars < 3)
] {
] gunichar current = g_utf8_get_char (p);
]
] if (n_chars = 0 ||
] IsWttComposible (cluster[n_chars - 1], current) ||
] (nChrs == 1 &&
] IsChrType (0, Cons) &&
] IsChrType (current, SaraAm)) ||
] (nChrs == 2 &&
] IsChrType (cluster[0], Cons) &&
] IsChrType (cluster[1], Tone) &&
] IsChrType (current, SaraAm)))
] {
] cluster[n_chars++] = current;
] p = g_utf8_next_char (p);
] }
] else
] break;
] }
]
] [...]
]
] /* Load a particular engine given the ID for the engine
] */
] PangoEngine *
] MODULE_ENTRY(script_engine_load) (const char *id)
] {
] if (!strcmp (id, "ThaiScriptEngineLang")) {
] return thai_engine_lang_new ();
] } else if (!strcmp (id, "ThaiScriptEngineX")) {
] return thai_engine_x_new ();
] } else {
] return NULL;
] }
] }
]
] As a minor quibble, I'm not sure what the motivation was for adding
] the braces here; generally keeping non-significant changes to a
] minimum helps keep code review simpler.
]
]
] Anyways, the changes look sensible, though I certainly haven't
] verified the MAC/WIN shaping rules in detail. With a few cleanups as
] noted above, I don't see any problem with replacing the existing
] thai.c with it.
]
] Regards,
] Owen
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]