Re: [g-a-devel] Happy patch bonanza
- From: Milan Zamazal <pdm brailcom org>
- To: Gnome-accessibility-devel gnome org
- Cc: ubuntu-accessibility lists ubuntu com
- Subject: Re: [g-a-devel] Happy patch bonanza
- Date: Thu, 29 Jun 2006 12:53:55 +0200
>>>>> "BH" == Bill Haneman <Bill Haneman Sun COM> writes:
BH> So it seems a more general/robust method is needed for
BH> determining the correct encoding for the output channel. For
BH> some voices it's apparently UTF-8, whereas for most european
BH> voices it's "latin 1". Presumably some languages may need
BH> latin2, etc. instead...
Yes. IMO a reasonable approach is to use the coding declared by the
voice and to use ISO-8859-1 if the voice doesn't declare its coding.
This is what festival-freebsoft-utils does.
Preferably all voices should declare their coding. There's no standard
way to do that in Festival; festival-freebsoft-utils simply adds one
more item, called `coding', to the voice declaration for that purpose.
It's trivial to add and IMHO better than introducing new configuration
options in all the Festival frontends.
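For illustration, a voice declaration carrying the extra item might look
roughly like the sketch below. The voice name and the property values are
made up for the example; only the proclaim_voice call and the `coding'
item name are taken from Festival and festival-freebsoft-utils.

```scheme
;; Hypothetical voice declaration; `my_czech_voice' and its values are
;; placeholders, not a real voice.
(proclaim_voice
 'my_czech_voice
 '((language czech)
   (gender male)
   (dialect nil)
   ;; The extra item read by festival-freebsoft-utils.
   (coding ISO-8859-2)
   (description "Example voice that declares its input coding.")))
```

A frontend can then look the item up in the description of the current
voice instead of needing a configuration option of its own.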
The festival-freebsoft-utils current-voice-coding function is trivial:
  (or (cadr (assoc 'coding (cadr (voice.description current-voice))))
      'ISO-8859-1)
If all you need from festival-freebsoft-utils is this function then
there's no need to require the whole festival-freebsoft-utils package to
be able to figure out the voice coding.
BH> Actually, festival _is_ UTF-8 capable, at least for some voices.
It is not. The UTF-8 voices handle UTF-8 input as a sequence of
8-bit characters. Of course this is far from convenient, and many
standard Festival functions can't be used on such input. So UTF-8
is used in Festival only for languages whose character set can't be
represented in an 8-bit coding.
Of course, the best way would be to make Festival work with Unicode
characters. But I think this is a non-trivial task and apparently
nobody is working on it. So for now I'd suggest using the `coding'
voice property workaround described above.
BH> I still think ISO-8859-1 might be a better 'default' for the
BH> festival driver than UTF-8, since as far as I know none of the
BH> european voices expect UTF-8 input.