Hi Luke:
First of all, I say "Hear, hear!" The audio windmill is something
people have been charging at for a long time. Users who rely upon
speech synthesis working correctly and integrating well with the rest of
their environment are among those who need reliable audio support most
critically.
I see two main proposals below:
1) Modify gnome-speech drivers to obtain samples from their
speech engines and then handle the audio playing themselves.
This is different from the current state where the
gnome-speech driver expects the speech engine to do all the
audio management.
This sounds like an interesting proposal. I can tell you
for sure, though, that the current gnome-speech maintainer
has his hands full with other things (e.g., leading Orca).
So, the work would need to come from the community.
2) As part of #1, move to an API that is pervasive on the system.
The proposed API is GStreamer.
Moving to a pervasive API is definitely very interesting, and
I would encourage looking at a large set of platforms: Linux
to Solaris, GNOME to KDE, etc. An API of recent interest is
PulseAudio (https://wiki.ubuntu.com/PulseAudio), which might
be worth watching. I believe there might be many significant
improvements in the works for OSS as well.
In the bigger scheme of things, however, there is discussion of
deprecating Bonobo. Bonobo is used by gnome-speech to activate
gnome-speech drivers. As such, one might consider alternatives to
gnome-speech. For example, SpeechDispatcher
(http://www.freebsoft.org/speechd) or TTSAPI
(http://www.freebsoft.org/tts-api-provider) might be something to
consider. They are not without issues, however, such as cumbersome
configuration and reliability problems. I believe those are all
solvable with work. The harder issue in my mind is that they would
introduce an external dependency for things like GNOME. I have also
not looked at what their licensing scheme is.
Will