Re: [g-a-devel] Happy patch bonanza (more patch bonanza)



Hi All:

After a couple cups of coffee on a jet-lagged Saturday morning, I've read and digested all the e-mail on this subject. Since I'm one of those damn self-centered ASCII Americans, I don't always completely understand the full impact of all the internationalization and localization problems, especially when it comes to this particular mix of various processes and libraries.

Let me make sure I understand the proposal here:

1) IN FESTIVAL: Rely on a convention to optionally extend the programmatic description of Festival voices directly in the Festival voice data itself (i.e., not in gnome-speech). Based upon the precedent set by the festival-freebsoft-utils folks, this extension adds a "coding" attribute to define the character encoding of the voice. The "coding" attribute is a string acceptable for passing directly to g_io_channel_set_encoding. ISO-8859-1 is implied if "coding" is absent. See the "current-voice-coding" description at <http://www.freebsoft.org/doc/festival-freebsoft-utils/festival-freebsoft-utils_13.html> for more information on the "coding" attribute.

2) IN GNOME-SPEECH: Patch the gnome-speech festival synthesis driver to check for the "coding" attribute of a voice description. If the attribute is defined, call g_io_channel_set_encoding with its value. If it is not defined, default to ISO-8859-1.
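To make sure I have the lookup-and-default logic right, here is a minimal sketch of it in Python rather than in the driver's C. It is only an illustration: the function names (voice_coding, encode_for_voice) and the sample description strings are made up for this example and are not taken from gnome-speech or from the patch, and the real driver would hand the resulting name to g_io_channel_set_encoding instead of encoding the text itself.

```python
import re

# Implied when a voice description carries no "coding" attribute.
DEFAULT_CODING = "ISO-8859-1"

# Matches (coding NAME) or (coding "NAME") inside a voice description.
# The description syntax handled here is an assumption for illustration.
_CODING_RE = re.compile(r'\(\s*coding\s+"?([A-Za-z0-9_.:-]+)"?\s*\)')


def voice_coding(description: str) -> str:
    """Return the coding declared by a voice, or the ISO-8859-1 default."""
    match = _CODING_RE.search(description)
    return match.group(1) if match else DEFAULT_CODING


def encode_for_voice(text: str, description: str) -> bytes:
    """Encode text in the coding the voice expects before sending it on."""
    return text.encode(voice_coding(description))
```

With a description such as "((language telugu) (coding UTF-8))" the sketch picks UTF-8; with no "coding" entry at all it falls back to ISO-8859-1, which is the behavior described in step 2.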

This sounds simple enough to me. I may be misunderstanding something in one of the threads on this topic, but it seems to be implied that the user will set the character encoding for their desktop to be the same as that of their synthesis engine/voice, and vice versa. Should some sort of transcoding/conversion be attempted if a mismatch is detected, or is this handled automatically by the g_io infrastructure?
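For what it's worth, my understanding of the GLib documentation is that GIOChannel keeps text as UTF-8 internally and treats the value given to g_io_channel_set_encoding as the external encoding, so a conversion would happen on write. I have not verified that against the driver. The sketch below, again in Python and with a made-up function name (transcode), only shows what an explicit conversion between a desktop coding and a voice coding would look like, including what happens to characters the voice coding cannot represent.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from the desktop's coding into the voice's coding."""
    # Decode using the coding the desktop produced the bytes in.
    text = data.decode(source)
    # Characters the target coding cannot represent become "?" instead of
    # raising an error; a real driver might prefer to report the mismatch.
    return text.encode(target, errors="replace")
```

The replacement policy is just one choice for the sketch. Text that is mostly outside the voice's repertoire (Telugu sent to an ISO-8859-1 voice, say) would come out as a run of question marks, which is why the voices need to declare the right "coding" in the first place.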

In addition, the obvious impact here is on our Telugu (festival-te.sf.net) and other UTF-8 language friends - they would need to extend the relevant festival voices to set the "coding" attribute to UTF-8 and also help test this.

Please let me know if I'm understanding this correctly. In addition, many many thanks to both Enrico and Milan for their understanding and diligence in this matter. You definitely help define what "community" means.

Will

PS - I'm out of the office for the next several days, but I will release a new gnome-speech tarball for the next GNOME 2.15 deadline (12-July) if we can quickly reach closure on this.

On Jun 29, 2006, at 12:27 PM, Enrico Zini wrote:

On Thu, Jun 29, 2006 at 05:53:49PM +0200, Milan Zamazal wrote:

    EZ>      (coding "ISO-8859-1")))
Yes, except that it's probably better to specify the coding without
double quotes: (coding ISO-8859-1)

Done.  I'm not proficient with LISP: what is the difference?

EZ> Because if we're inventing it right now, then I think I'd prefer
EZ> "encoding".
Well, I'm not sure which of the two English terms better fits the
context.

'coding' is ok with me, if it's already used somewhere.

Yes, this is the right thing to do.

Good!
People, please review the attached patch for gnome-speech to take
advantage of the 'coding' attribute.


Ciao,

Enrico

--
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini <enrico debian org>
<recode1.patch>