[g-a-devel] Another draft of gnome-speech IDL



Hi All,

Thanks very much to everyone who provided feedback on my draft and Draghi's draft of the gnome-speech IDL. Here is a new draft, which I hope incorporates all of that feedback.

1. I've added event callbacks. Note that I have encapsulated speech
events in a structure which currently contains an event type and a
long integer detail. This currently covers speech start, stop,
and text offset events. Do we need event time stamps? I thought
encapsulating the event in a struct would be a good approach moving
forward, so we can easily add more event info if we need it.

2. I've added a few other functions for getting the voices of an
engine. You can now get all voices, only the default voices (Perfect
Paul, Beautiful Betty, etc.), the voices which match a parameter,
or only the user defined voices.  The idea behind getting voices
based on a parameter was an initial attempt to provide the
functionality which Paul insightfully suggested of selecting voices
based on gender, age, language, etc.  Does this seem like a reasonable
approach?

3. More feedback from Paul resulted in the addition of the pause and
resume functions for pausing and resuming speech.

4. There is also now a function to delete voices.

5. To complement the driverInit function, there is now a
driverShutdown function.  Does this seem useful?

6. I have not yet addressed the issue of how audio should be handled.
That is, how should one set the audio format, output device, etc.?
Since we plan to support external synthesizers, is there a good
general way to support this? Comments welcome.

7. Paul raises a good issue concerning text normalizations. Should
the say function perform any standard ones, for example, numeric
processing? Should we also provide a dictionary lookup?

8. Added a sayURL command to speak a large amount of prepared text, as
per Paul's suggestion.
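
As a rough illustration of item 1, the event structure and callback registration might look like the Python sketch below. This is not the attached IDL: the names SpeechEventType, SpeechEvent, EventSource, add_listener, and emit are hypothetical, and only the event type and the long integer detail come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class SpeechEventType(Enum):
    # The three event kinds named in item 1 (identifiers are hypothetical).
    SPEECH_START = 0
    SPEECH_STOP = 1
    TEXT_OFFSET = 2


@dataclass
class SpeechEvent:
    # Event type plus one integer detail, as described in item 1.
    # Further fields (for example a time stamp) could be added later
    # without changing the callback signature.
    type: SpeechEventType
    detail: int = 0


class EventSource:
    """Hypothetical stand-in for a driver that reports speech events."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[SpeechEvent], None]] = []

    def add_listener(self, callback: Callable[[SpeechEvent], None]) -> None:
        self._listeners.append(callback)

    def emit(self, event: SpeechEvent) -> None:
        # Deliver the event to every registered callback, in order.
        for callback in self._listeners:
            callback(event)


received: List[SpeechEvent] = []
source = EventSource()
source.add_listener(received.append)

source.emit(SpeechEvent(SpeechEventType.SPEECH_START))
source.emit(SpeechEvent(SpeechEventType.TEXT_OFFSET, detail=12))  # offset 12 is illustrative
source.emit(SpeechEvent(SpeechEventType.SPEECH_STOP))
```

Because listeners receive a single struct-like object, adding a field later would not require changing every callback.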

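The voice queries in item 2 (all voices, default voices, user-defined voices, and voices matching a parameter) could all be expressed as one filter, as in the Python sketch below. The Voice fields, the voices_matching helper, and the attribute values assigned to each voice are assumptions made for illustration; they are not taken from the attached IDL.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Voice:
    # Hypothetical voice description; field names are assumptions.
    name: str
    gender: str
    language: str
    user_defined: bool = False


def voices_matching(voices: List[Voice], **criteria: object) -> List[Voice]:
    # Keep the voices whose attributes equal every requested value.
    # With no criteria, every voice matches.
    return [
        v for v in voices
        if all(getattr(v, key) == value for key, value in criteria.items())
    ]


# Attribute values below are made up for illustration.
voices = [
    Voice("Perfect Paul", gender="male", language="en"),
    Voice("Beautiful Betty", gender="female", language="en"),
    Voice("My Custom Voice", gender="female", language="en", user_defined=True),
]

all_voices = voices_matching(voices)
default_voices = voices_matching(voices, user_defined=False)
user_voices = voices_matching(voices, user_defined=True)
female_defaults = voices_matching(voices, gender="female", user_defined=False)
```

Whether a single parameterized query should replace the separate functions, or sit alongside them as in this draft, is the open question raised in item 2.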

I have not adopted Draghi's proposal to shift gnome-speech to a speaker-based approach, because I don't think it is as straightforward to implement as this proposal. Comments welcome, but I propose moving forward with this engine-centric proposal. If it turns out that Gnopernicus implements code on top of gnome-speech that we deem generally useful, we'll re-vamp gnome-speech 2.0 to take advantage of it.


Regards,

Marc

Attachment: SynthesisDriver.idl
Description: Binary data


