Re: Thoughts on speech
- From: Willie Walker <William Walker Sun COM>
- To: Steve Lee <steve fullmeasure co uk>
- Cc: David Bolter <dtb gnome org>, Gnome Accessibility List <gnome-accessibility-list gnome org>
- Subject: Re: Thoughts on speech
- Date: Fri, 07 Mar 2008 09:05:49 -0500
Hi Steve:
The IDL for the main speaker interface is here (it's pretty short/simple):
http://svn.gnome.org/svn/gnome-speech/trunk/idl/GNOME_Speech_Speaker.idl
The main driver interface, which is basically what you discover when
using Bonobo, is here:
http://svn.gnome.org/svn/gnome-speech/trunk/idl/GNOME_Speech_SynthesisDriver.idl
"Extra" stuff, such as pitch, rate, punctuation, etc., is buried in the
parameter information. You need to hunt around to find it, and it's all
handled by convention rather than a real spec. An example of a more complete
set of parameters can be found at the end of this file:
http://svn.gnome.org/svn/gnome-speech/trunk/drivers/viavoice/viavoicespeaker.c
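To illustrate what "handled by convention" can mean in practice, here is a minimal Python sketch. It is not the gnome-speech API: the `Parameter` and `FakeSpeaker` classes, the method names, and the parameter names and ranges are all hypothetical stand-ins, and the real interface is whatever the IDL files above define. The point is only that a client has to ask for a parameter by its conventional name and cope when a driver does not offer it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Parameter:
    """Hypothetical description of one tunable speaker parameter."""
    name: str
    minimum: float
    maximum: float
    current: float


class FakeSpeaker:
    """Stand-in for a speaker object; NOT the real gnome-speech interface."""

    def __init__(self, parameters: List[Parameter]) -> None:
        self._parameters: Dict[str, Parameter] = {p.name: p for p in parameters}

    def supported_parameters(self) -> List[Parameter]:
        """List whatever parameters this (imaginary) driver exposes."""
        return list(self._parameters.values())

    def set_parameter(self, name: str, value: float) -> bool:
        """Set a parameter by name, clamping the value to its range.

        Returns False when the driver does not expose the name, which is
        the failure mode of relying on convention instead of a spec.
        """
        param = self._parameters.get(name)
        if param is None:
            return False
        param.current = max(param.minimum, min(param.maximum, value))
        return True


# Assumed parameter names and ranges, chosen only for illustration.
speaker = FakeSpeaker([
    Parameter("rate", 50, 400, 180),
    Parameter("pitch", 0, 100, 50),
])

rate_ok = speaker.set_parameter("rate", 1000)             # clamped to 400
punct_ok = speaker.set_parameter("punctuation mode", 1)   # not offered here
current_rate = {p.name: p.current for p in speaker.supported_parameters()}["rate"]
```

A client written this way has to check the return value (or the supported list) for every parameter it cares about, since nothing guarantees that two drivers use the same names.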
Hope this helps!
Will
Steve Lee wrote:
Brilliant, Will. I've just learnt a whole lot, thanks.
What is the main API for speech? Something like speak( sText ) ?
Steve
On 07/03/2008, Willie Walker <William Walker sun com> wrote:
OK - well....let's see. I'll fill in what I know, but I need the Speech
Dispatcher folks to fill in what they know.
GNOME Speech:
Just a thin wrapper over a TTS engine.
Doesn't do audio management - leaves that to the TTS engine.
Drivers for Festival, FreeTTS, DECtalk, IBMTTS/ViaVoice,
Loquendo, eSpeak, Cepstral/Swift, Eloquence, and even a
wrapper for SpeechDispatcher. No support for DECtalk
Express.
At a minimum, callbacks are supported at the utterance level,
where an utterance is the chunk of text tossed at it via
a single speak command. Callbacks are also supported at
the word progress level if the engine supports it.
Mostly just sends text off to the speech synthesis engine
for speaking. The only 'extra' stuff that's really done
is adding index marks to text strings to be notified of
speech progress at the word level for those TTS engines
that support it.
No real support for SSML.
Audio is controlled by the speech synthesis engine.
Bonobo/CORBA based, essentially locking it to GNOME
for all intents and purposes.
Speech services are discoverable and activatable as
system services (via Bonobo Activation).
Those skilled in the art and with knowledge of the TTS
engine's API can write a driver in a day. It's much more
difficult for those not skilled in the art. ;-)
Difficult to debug.
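
The index-mark idea in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the gnome-speech implementation: the `[[index:N]]` mark syntax and the `add_index_marks` helper are hypothetical, since each TTS engine has its own mark format. The sketch prefixes every word with a numbered mark and remembers which word each mark belongs to, so that an engine reporting "reached mark N" can be translated back into word-level progress.

```python
from typing import Dict, Tuple


def add_index_marks(
    text: str, mark_template: str = "[[index:{}]]"
) -> Tuple[str, Dict[int, Tuple[int, str]]]:
    """Prefix each whitespace-separated word with a numbered index mark.

    Returns the marked-up text plus a map from mark id to
    (character offset of the word in the original text, the word).
    The default mark syntax is a made-up placeholder.
    """
    pieces = []
    marks: Dict[int, Tuple[int, str]] = {}
    offset = 0
    for mark_id, word in enumerate(text.split()):
        # Find this word at or after the end of the previous one.
        offset = text.index(word, offset)
        marks[mark_id] = (offset, word)
        pieces.append(mark_template.format(mark_id) + word)
        offset += len(word)
    return " ".join(pieces), marks


marked, marks = add_index_marks("Hello big world")


def on_index_reached(mark_id: int) -> Tuple[int, str]:
    """What a word-progress callback might do with a reported mark id."""
    return marks[mark_id]
```

For an engine without index-mark support, a caller would send the plain text and fall back to utterance-level callbacks only.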
Will
David Bolter wrote:
> Will,
>
> That sounds very reasonable to me. Can you start it? :)
>
> cheers,
> D
> Willie Walker wrote:
>> Hi All:
>>
>> This speech issue is obviously one filled with passion and high
>> expectations. I think our ultimate end goal here is to find a
>> solution that works well and fits within the various constraints.
>>
>> The two solutions we've been talking about, gnome-speech and Speech
>> Dispatcher, both have their strengths and weaknesses, and I'm not sure
>> we all understand what they are. Nor do I think we all understand
>> what "works well" means and what the constraints are.
>>
>> As an exercise, what do you all think of us having a somewhat
>> impassioned and pragmatic discussion about the various features and
>> the current state of gnome-speech and Speech Dispatcher?
>>
>> Will
>> _______________________________________________
>> gnome-accessibility-list mailing list
>> gnome-accessibility-list gnome org
>> http://mail.gnome.org/mailman/listinfo/gnome-accessibility-list
>>
>