Hi All,

Thanks very much to everyone who provided feedback on my and Draghi's drafts of the gnome-speech IDL. Here is a new draft, which I hope incorporates the feedback you provided.
1. I've added event callbacks. Note that I have encapsulated speech events in a structure which currently contains an event type and a long integer detail. This currently covers speech start, stop, and text offset events. Do we need event time stamps? I thought encapsulating the event in a struct would be a good approach moving forward, so we can easily add more event info if we need it.

2. I've added a few other functions for getting the voices of an engine. You can now get all voices, only the default voices (Perfect Paul, Beautiful Betty, etc.), the voices which match a parameter, or only the user-defined voices. Getting voices based on a parameter is an initial attempt to provide the functionality which Paul insightfully suggested: selecting voices based on gender, age, language, etc. Does this seem like a reasonable approach?

3. More feedback from Paul resulted in the addition of pause and resume functions for pausing and resuming speech.

4. There is also now a function to delete voices.

5. To complement the driverInit function, there is now a driverShutdown function. Does this seem useful?

6. I have not yet addressed the issue of how audio should be handled, i.e., how one should set the audio format, output device, etc. Since we plan to support external synthesizers, is there a good general way to support this? Comments welcome.

7. Paul raises a good issue concerning text normalization. Should the say function perform any standard normalizations, for example, numeric processing? Or, in addition, should we provide a dictionary lookup?
8. Added a sayURL command to speak a large amount of prepared text, as per Paul's suggestion.

I have not adopted Draghi's proposal to shift gnome-speech to a speaker-based approach, because I don't think it is as straightforward to implement as this proposal. Comments welcome, but I propose moving forward with this engine-centric proposal. If it turns out that Gnopernicus implements code on top of gnome-speech that we deem generally useful, we'll revamp gnome-speech 2.0 to take advantage of it.
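
To make items 1 and 2 above more concrete, here is a rough Python sketch of the event struct and the voice queries. It is not the attached IDL: every name in it (SpeechEvent, ToyDriver, get_voices_matching, and so on) is hypothetical, and emitting one text offset event per word is an assumption made only to have something to show.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class SpeechEventType(Enum):
    """Hypothetical names for the three events listed in item 1."""
    SPEECH_START = 0
    SPEECH_STOP = 1
    TEXT_OFFSET = 2


@dataclass
class SpeechEvent:
    """An event type plus one integer detail, as described in item 1.

    Keeping this a struct means a field such as a time stamp could be
    added later without changing the callback signature.
    """
    type: SpeechEventType
    detail: int = 0  # e.g. a character offset for TEXT_OFFSET


@dataclass
class Voice:
    """Hypothetical voice record; params holds gender, age, language, etc."""
    name: str
    params: Dict[str, str] = field(default_factory=dict)
    user_defined: bool = False


class ToyDriver:
    """Stand-in for an engine; it produces no audio, only events."""

    def __init__(self, voices: List[Voice]) -> None:
        self._voices = list(voices)
        self._callbacks: List[Callable[[SpeechEvent], None]] = []

    def add_callback(self, callback: Callable[[SpeechEvent], None]) -> None:
        self._callbacks.append(callback)

    def _emit(self, event_type: SpeechEventType, detail: int = 0) -> None:
        event = SpeechEvent(event_type, detail)
        for callback in self._callbacks:
            callback(event)

    def say(self, text: str) -> None:
        """Emit start, one offset event per word (an assumption), then stop."""
        self._emit(SpeechEventType.SPEECH_START)
        for match in re.finditer(r"\S+", text):
            self._emit(SpeechEventType.TEXT_OFFSET, match.start())
        self._emit(SpeechEventType.SPEECH_STOP)

    # The four voice queries described in item 2.
    def get_all_voices(self) -> List[Voice]:
        return list(self._voices)

    def get_default_voices(self) -> List[Voice]:
        return [v for v in self._voices if not v.user_defined]

    def get_user_voices(self) -> List[Voice]:
        return [v for v in self._voices if v.user_defined]

    def get_voices_matching(self, param: str, value: str) -> List[Voice]:
        return [v for v in self._voices if v.params.get(param) == value]
```

A client would register a function with add_callback and then call say; matching on a single parameter/value pair is the simplest reading of "voices which match a parameter", and a real interface might well accept several parameters at once.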
Regards,
Marc
Attachment:
SynthesisDriver.idl
Description: Binary data