Re: Thoughts on speech



Hi Steve:

The IDL for the main speaker interface is here (it's pretty short/simple):

http://svn.gnome.org/svn/gnome-speech/trunk/idl/GNOME_Speech_Speaker.idl

The main driver interface, which is basically what you discover when using Bonobo, is here:

http://svn.gnome.org/svn/gnome-speech/trunk/idl/GNOME_Speech_SynthesisDriver.idl

"Extra" stuff, such as pitch, rate, punctuation, etc., is buried in the parameter information. You need to hunt around to find it, and it's all handled by convention rather than by a real spec. An example of a more complete set of parameters can be found at the end of this file:

http://svn.gnome.org/svn/gnome-speech/trunk/drivers/viavoice/viavoicespeaker.c
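
To make the "by convention" point concrete, here is a minimal Python sketch. It is not gnome-speech code. It assumes each parameter is described by a name plus a minimum, current, and maximum value, and that a client finds rate or pitch by looking up a conventional name. The class, the function names, the parameter names, and the numeric ranges are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Parameter:
    """Hypothetical stand-in for one entry in a speaker's parameter list."""
    name: str
    min_value: float
    current: float
    max_value: float


def find_parameter(params: List[Parameter], name: str) -> Optional[Parameter]:
    """Look a parameter up by its conventional name; None if the driver lacks it."""
    for p in params:
        if p.name == name:
            return p
    return None


def set_clamped(params: List[Parameter], name: str, value: float) -> bool:
    """Set a parameter, clamping the value to the advertised range.

    Returns False when the driver does not expose that name, which is the
    case a client must handle when only convention defines the set.
    """
    p = find_parameter(params, name)
    if p is None:
        return False
    p.current = max(p.min_value, min(p.max_value, value))
    return True


# Illustrative values only; a real driver advertises its own names and ranges.
params = [
    Parameter("rate", 50.0, 200.0, 400.0),
    Parameter("pitch", 40.0, 100.0, 400.0),
]
```

With no spec behind the names, the client has to probe for each one and cope with its absence, which is why the sketch returns a boolean instead of raising.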

Hope this helps!

Will

Steve Lee wrote:
Brilliant, Will. I've just learnt a whole lot, thanks.
What is the main API for speech? Something like speak( sText ) ?

Steve

On 07/03/2008, Willie Walker <William Walker sun com> wrote:
OK - well....let's see.  I'll fill in what I know, but I need the Speech
 Dispatcher folks to fill in what they know.

 GNOME Speech:

   Just a thin wrapper over a TTS engine.

   Doesn't do audio management - leaves that to the TTS engine.

   Drivers for Festival, FreeTTS, DECtalk, IBMTTS/ViaVoice,
   Loquendo, eSpeak, Cepstral/Swift, Eloquence, and even a
   wrapper for SpeechDispatcher.  No support for DECtalk
   Express.

   At a minimum, callbacks supported at the utterance level,
   where an utterance is the chunk of text tossed at it via
   a single speak command.  Callbacks are also supported at
   the word progress level if the engine supports it.

   Mostly just sends text off to the speech synthesis engine
   for speaking.  The only 'extra' stuff that's really done
   is adding index marks to text strings to be notified of
   speech progress at the word level for those TTS engines
   that support it.

   No real support for SSML.

   Audio is controlled by the speech synthesis engine.

   Bonobo/CORBA based, essentially locking it to GNOME
   for all intents and purposes.

   Speech services are discoverable and activatable as
   system services (via Bonobo Activation).

   Those skilled in the art and with knowledge of the TTS
   engine's API can write a driver in a day.  It's much more
   difficult for those not skilled in the art.  ;-)

   Difficult to debug.
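
The index-mark technique described above (marks added to the text so the engine can report word-level progress) can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the gnome-speech implementation: the "<mark N>" syntax is made up, since each TTS engine has its own index-mark escape, and the names add_index_marks, WordProgress, and index_mark_reached are hypothetical.

```python
import re


def add_index_marks(text, mark_format="<mark {}>"):
    """Put a numbered index mark in front of every word.

    Returns the marked-up string to send to the engine and a list that maps
    each mark number to the (start, end) offsets of its word in the
    original text. Runs of whitespace are collapsed to single spaces.
    """
    pieces = []
    offsets = []
    for number, match in enumerate(re.finditer(r"\S+", text)):
        pieces.append(mark_format.format(number) + match.group())
        offsets.append((match.start(), match.end()))
    return " ".join(pieces), offsets


class WordProgress:
    """Translate 'index mark N reached' events into word-level callbacks."""

    def __init__(self, text, on_word):
        self.text = text
        self.on_word = on_word  # called as on_word(start, end, word)
        self.marked_text, self.offsets = add_index_marks(text)

    def index_mark_reached(self, number):
        # Assumed to be invoked by the driver when the engine reports a mark.
        start, end = self.offsets[number]
        self.on_word(start, end, self.text[start:end])


if __name__ == "__main__":
    progress = WordProgress("Hello big world", lambda s, e, w: print(s, e, w))
    print(progress.marked_text)
    # Simulate the engine reporting each mark in order.
    for n in range(len(progress.offsets)):
        progress.index_mark_reached(n)
```

Keeping the offset table on the client side means the engine only has to echo back a mark number; engines without index-mark support simply get the plain text and report at the utterance level.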

 Will


 David Bolter wrote:
 > Will,
 >
 > That sounds very reasonable to me.  Can you start it?  :)
 >
 > cheers,
 > D
 > Willie Walker wrote:
 >> Hi All:
 >>
 >> This speech issue is obviously one filled with passion and high
 >> expectations.  I think our ultimate end goal here is to find a
 >> solution that works well and fits within the various constraints.
 >>
 >> The two solutions we've been talking about, gnome-speech and Speech
 >> Dispatcher, both have their strengths and weaknesses, and I'm not sure
 >> we all understand what they are.  Nor do I think we all understand
 >> what "works well" means and what the constraints are.
 >>
 >> As an exercise, what do you all think of us having a somewhat
 >> impassioned and pragmatic discussion about the various features and
 >> the current state of gnome-speech and Speech Dispatcher?
 >>
 >> Will
 >> _______________________________________________
 >> gnome-accessibility-list mailing list
 >> gnome-accessibility-list gnome org
 >> http://mail.gnome.org/mailman/listinfo/gnome-accessibility-list
 >>
 >






