Re: Fixing gnome-speech



Hello,

I'd like to address a few points.

* First, as we discussed on accessibility freedesktop org (if you are
not subscribed, you are welcome to join), we want to create a new API
for accessing speech synthesis. This shouldn't be seen as "yet another"
speech API. Rather, we built prototypes in Gnome Speech, Speech
Dispatcher and KTTSD and found some dead ends and some new requirements.
In Speech Dispatcher especially, we found that the cleanest way
forward is to split it into two separate parts: one handling messages
and their prioritization, the other interfacing with the speech
engines. So, with a fresh mind, several people have been working on
writing down our common (Brailcom projects, Speakup, Gnome, KDE)
requirements for such a speech API. This document is fairly complete by
now and we are at the point where we are starting the implementation.

The most beneficial way to contribute to speech synthesis support
right now is to help with the TTS API and, once the infrastructure is
in place, to develop modules for it. (Not that we will rewrite all
modules; I already know the existing Dispatcher modules will require
only minor modifications in the short term.)

This doesn't address the problem of how Gnome applications should
interface with the TTS API. Either Gnome Speech could be modified to use
it, or the applications could go through some other tool like Speech
Dispatcher. I think an Orca module for Speech Dispatcher makes a lot of
sense.

An important thing is that both projects are desktop independent.

* Another thing several people asked about was dates. As for the Orca
module for Dispatcher, Tomas already answered the question. We hope to
have the TTS API implementation finished in time for the KDE developers
to connect KTTS with Speech Dispatcher for KDE4. Also, the next major
Speech Dispatcher release will already work on top of the TTS API, and
several major improvements will be made to its interface (SSIP). Of
course, these are all just hopes.

* Enrico suggested we should use the Festival C API instead of talking
to it via TCP. Olivier also mentioned that the whole chain is too long
and a source of trouble. However, I suspect the problem is not so much
that the chain is too long as that both Festival and Gnome Speech lack
proper detailed logs.

In Speech Dispatcher we also use Festival via TCP (actually Gnome Speech
doesn't, it runs the binary) and in my experience, this is a real
advantage for debugging. It is very easy to log the communication
with Festival, so the developer can easily see what went wrong if
something does. It is also easy to send very informative bug reports.
Also, we have found that the connection randomly crashes for no apparent
reason. It is far better if we can just detect it, log it,
create a new connection and reset the parameters automatically (as we do
now) than to have such a crash bring down the whole module (as it would
if we were using the C API) for no clear reason. (Another example: in
the current version of Dispatcher, a very mysterious segfault sometimes
happens. I suspect this has something to do with ALSA, but it is very
hard to tell, as we link ALSA directly and the crash is not reproducible
under test conditions...)
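
To illustrate the "detect it, log it, reconnect and reset the
parameters" approach described above, here is a minimal sketch in
Python. It is not the actual Speech Dispatcher code. The class name
ReconnectingClient, the connect() factory and the "SET name value"
command strings are all hypothetical placeholders; they are not
Festival's real protocol.

```python
import logging

log = logging.getLogger("tts.reconnect")


class ReconnectingClient:
    """Log all traffic; on a dropped connection, reconnect and replay settings.

    `connect` is a hypothetical factory returning an object with a
    send(command: str) -> str method that raises OSError on failure.
    """

    def __init__(self, connect, max_retries=3):
        self._connect = connect
        self._max_retries = max_retries
        self._params = {}   # settings to replay after a reconnect
        self._conn = None

    def _raw_send(self, command):
        # Logging both directions is what makes bug reports informative.
        log.debug("-> %s", command)
        reply = self._conn.send(command)
        log.debug("<- %s", reply)
        return reply

    def _ensure_connected(self):
        if self._conn is None:
            self._conn = self._connect()
            log.info("connection opened")
            # Reset the parameters the previous connection had.
            for name, value in self._params.items():
                self._raw_send(f"SET {name} {value}")

    def request(self, command):
        for attempt in range(1, self._max_retries + 1):
            try:
                self._ensure_connected()
                return self._raw_send(command)
            except OSError as exc:
                # Detect and log the failure, then retry on a fresh connection.
                log.warning("connection lost (attempt %d): %s", attempt, exc)
                self._conn = None
        raise ConnectionError(f"giving up after {self._max_retries} attempts")

    def set_param(self, name, value):
        reply = self.request(f"SET {name} {value}")  # placeholder syntax
        self._params[name] = value  # remember only once it was accepted
        return reply
```

The point of the design is that a failure on the other side of the
socket shows up as an exception we can handle, rather than as a crash
inside our own process.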

Now, one of the big problems is that Festival doesn't offer proper logs.
It will often refuse a connection because of a stupid typo in the
configuration file and give the user no clue as to why. This is
something which should be fixed.

* The generic output module has proved to be very useful. But I must
object to the claim that it can do almost everything. It can talk and
provide the basic level of synchronization information necessary for
screen readers, but it doesn't support anything more advanced. Users
don't notice with Speakup because Speakup doesn't use the more advanced
capabilities of the synthesizer, but you would very soon notice
the difference with Festival in clients like speechd-el, which use its
full power. A native module is much better if someone writes one. But
the TTS API will provide a generic module too, and I already have a list
of things which I'd like to improve.

* I hope the Cepstral/Swift and FreeTTS modules will be ported to the
TTS API eventually. At least this was the intention: to have an API that
doesn't limit anyone and to move everything there, into a common code
base which we can maintain together.

Thanks for your attention. I apologize for the long post.

With regards,
Hynek Hanke



