[g-a-devel]Gnome Speech Architecture Proposal - the IDL



Hi Marc, hi All,
 
here is a very crude draft of an IDL for my recent proposal of the Gnome-Speech architecture.
 
Draghi
 
---------------------------------------------------------------------------
 
#include <Bonobo_Unknown.idl>
 
module GnomeSpeech
{
    struct ParameterRange
    {
        string name;
        any minValue;
        any maxValue;
    };
 
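    struct Parameter
    {
        string name;
        any    value;
    };

    // Assumed definition: ParameterList is used below but was not
    // declared anywhere in this draft.
    typedef sequence <Parameter> ParameterList;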
 
    interface Speaker;    // forward declaration
 
    //-------
    // VOICE
    //-------
 
    interface Voice : Bonobo::Unknown
    {
        // EXCEPTIONS       
 
        exception ParameterNotSupported {};
        exception ParameterOutOfRange {};
        exception WrongValueType {};
 
        // ATTRIBUTES   
 
        readonly attribute string     Name;
        readonly attribute Speaker    OriginatingSpeaker;
 
        // FUNCTIONS
 
        // The callback type is still an open question (see the comments below).
        boolean      registerMarkerListener (in string userCookie, in unsigned short flags, in ??? callback);
 
        void         say (in string text);
        void         shutUp();
        boolean      isSpeaking();
 
        ParameterList  getSupportedParameters ();
 
        any getParameterValue (in string parameterName)
            raises (ParameterNotSupported);

        void setParameterValue (in string parameterName, in any value)
            raises (ParameterNotSupported, ParameterOutOfRange, WrongValueType);
                       
    };
 
    typedef sequence <Voice> VoiceList;
 
    //---------
    // SPEAKER
    //---------
 
    interface Speaker : Bonobo::Unknown
    {
        // EXCEPTIONS
 
        exception VoiceAlreadyExists {};
        exception VoiceNotFound {};
 
        // ATTRIBUTES
 
        readonly attribute string Name;
        readonly attribute string TTSEngineName;
        
        // FUNCTIONS   
 
        ParameterList    getSupportedParameters ();
        ParameterRange   getParameterRange (in string parameterName)
            raises (Voice::ParameterNotSupported);

        Voice            createVoice (in string voiceName)
            raises (VoiceAlreadyExists);
        void             removeVoice (in string voiceName)
            raises (VoiceNotFound);
 
     };
 
    typedef sequence <Speaker> SpeakerList;
 
    //----------------
    // SPEECH MANAGER
    //----------------
   
    interface GnomeSpeechManager : Bonobo::Unknown
    {
 
        // EXCEPTIONS
        exception SpeakerNotSupported {};
        exception VoiceNotAvailable {};
                       
        // ATTRIBUTES
 
        readonly attribute string    Version;
        readonly attribute long      SpeakerCount;
        readonly attribute long      VoiceCount;
       
        // FUNCTIONS
 
        SpeakerList         getSpeakers();
        VoiceList           getVoices();       
 
        Speaker             getSpeaker (in string speakerName)
            raises (SpeakerNotSupported);
        Voice               getVoice (in string voiceName)
            raises (VoiceNotAvailable);
 
 
        Voice               getCurrentVoice ();
        Voice               setCurrentVoice (in string voiceName)
             raises (VoiceNotAvailable);    // returns the previously selected voice
 
 
        // The callback type is still an open question (see the comments below).
        boolean              registerMarkerListener (in string userCookie, in unsigned short flags, in ??? callback);
 
        void                 say (in string text);
        void                 shutUp();
        boolean              isSpeaking();
       
    };
 
};
 
------------------------------------------------------
 
 
As I'm not familiar with Bonobo and CORBA, I'd like to add some comments and raise some issues.
 
As you can see, we have three objects: the SpeechManager, the Speaker, and the Voice.
 
The SpeechManager is the main entry point for the client. The client asks the SM for the list of available speakers (I'm not sure whether we need SpeakerCount, or whether we can get that from the SpeakerList). getSpeaker returns a Speaker object if one with that name is found. I don't know if Bonobo allows that, but I have assumed so.
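
The lookup described above can be modelled in a few lines. This is only an in-process Python sketch, not Bonobo or CORBA code: the method names are Python spellings of the draft's operations, and the speaker and engine names are hypothetical. It also shows that the count can be derived from the list, which is why SpeakerCount may be redundant.

```python
class SpeakerNotSupported(Exception):
    """Stands in for the SpeakerNotSupported exception of the draft."""


class Speaker:
    def __init__(self, name, tts_engine_name):
        self.name = name
        self.tts_engine_name = tts_engine_name


class SpeechManager:
    def __init__(self, speakers):
        # Index the speakers by their predefined names.
        self._speakers = {speaker.name: speaker for speaker in speakers}

    def get_speakers(self):
        return list(self._speakers.values())

    def get_speaker(self, speaker_name):
        try:
            return self._speakers[speaker_name]
        except KeyError:
            raise SpeakerNotSupported(speaker_name) from None


# Hypothetical speaker and engine names, for illustration only.
manager = SpeechManager([
    Speaker("speaker-a", "engine-x"),
    Speaker("speaker-b", "engine-y"),
])
speaker_count = len(manager.get_speakers())  # no separate SpeakerCount needed
```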
 
The Speaker object is there to let us query the available parameter ranges. The Speaker cannot speak ;-))). In order to speak, you need to ask the Speaker to create a Voice with createVoice. The client provides a name for the Voice (Speakers have predefined names!). createVoice returns a Voice object with some default parameter settings.
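
A rough sketch of the Speaker as a factory for Voices, again as a plain Python model rather than a Bonobo implementation. The constructor arguments (a table of ranges and a table of defaults) are assumptions made for the example; the draft does not say where a Speaker gets them from.

```python
class VoiceAlreadyExists(Exception):
    """Stands in for the VoiceAlreadyExists exception of the draft."""


class Voice:
    def __init__(self, name, originating_speaker, parameters):
        self.name = name
        self.originating_speaker = originating_speaker
        # Each voice gets its own copy of the speaker's defaults.
        self.parameters = dict(parameters)


class Speaker:
    def __init__(self, name, ranges, defaults):
        self.name = name
        self._ranges = dict(ranges)      # parameter name -> (min, max)
        self._defaults = dict(defaults)  # parameter name -> default value
        self._voices = {}

    def get_parameter_range(self, parameter_name):
        return self._ranges[parameter_name]

    def create_voice(self, voice_name):
        # Voice names are chosen by the client and must be unique per speaker.
        if voice_name in self._voices:
            raise VoiceAlreadyExists(voice_name)
        voice = Voice(voice_name, self, self._defaults)
        self._voices[voice_name] = voice
        return voice
```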
 
What can we do with a Voice? We can change the supported parameters within the allowed ranges, and ask it to say something, shut up, etc. We also need to be able to receive notifications for speech markers (at least EOS). For that I provided registerMarkerListener, but I have some doubts that this is the right way. Doesn't Bonobo support some sort of standard event generation mechanism, like COM's connection points? If yes, we should use that instead of my home-brewed registerMarkerListener.
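
To make the intended Voice behaviour concrete, here is a small Python model of the parameter checks and of a marker callback. Several things in it are assumptions and not part of the draft: the callback is taken to be a plain function receiving (cookie, marker, text), the flags argument is left out, a value has the wrong type when its type differs from that of the range bounds, and say fires the EOS marker immediately where a real engine would do so asynchronously.

```python
class ParameterNotSupported(Exception):
    pass


class ParameterOutOfRange(Exception):
    pass


class WrongValueType(Exception):
    pass


class Voice:
    def __init__(self, name, ranges):
        self.name = name
        self._ranges = dict(ranges)  # parameter name -> (min, max)
        # Assumption: every parameter starts at the lower bound of its range.
        self._values = {p: low for p, (low, _high) in self._ranges.items()}
        self._listeners = []

    def get_parameter_value(self, parameter_name):
        if parameter_name not in self._values:
            raise ParameterNotSupported(parameter_name)
        return self._values[parameter_name]

    def set_parameter_value(self, parameter_name, value):
        if parameter_name not in self._ranges:
            raise ParameterNotSupported(parameter_name)
        low, high = self._ranges[parameter_name]
        if type(value) is not type(low):
            raise WrongValueType(parameter_name)
        if not low <= value <= high:
            raise ParameterOutOfRange(parameter_name)
        self._values[parameter_name] = value

    def register_marker_listener(self, user_cookie, callback):
        self._listeners.append((user_cookie, callback))
        return True

    def say(self, text):
        # No audio here; only the end-of-speech notification is modelled.
        for user_cookie, callback in self._listeners:
            callback(user_cookie, "EOS", text)
```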
 
So the client keeps track of the Voice objects it created, and generates speech through them. Using GNOME Speech like this is very flexible, and it also solves a problem that might arise in the future, if we want to allow multiple voices to speak simultaneously.
 
But there is more.
 
I would also like to introduce the "current voice" approach, as it is more convenient for some clients. At the SpeechManager level I have provided a get/setCurrentVoice pair and duplicated the speech functions from the Voice object (say, shutUp, isSpeaking, etc). Typically the client will create the voices it wants once, at the beginning, and then just switch between them before saying something. We expect SRS in Gnopernicus to use this approach. Given the Voice objects, the implementation of the CurrentVoice concept is almost trivial.
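
As an illustration of how little is needed on top of the Voice objects, the following Python model keeps one reference to the current voice and forwards say to it. It is a sketch under assumptions: the Voice here only records what it was asked to say, and raising VoiceNotAvailable when no voice has been selected yet is a choice made for the example, since the draft does not cover that case.

```python
class VoiceNotAvailable(Exception):
    """Stands in for the VoiceNotAvailable exception of the draft."""


class Voice:
    def __init__(self, name):
        self.name = name
        self.spoken = []  # records the text instead of producing audio

    def say(self, text):
        self.spoken.append(text)


class SpeechManager:
    def __init__(self, voices):
        self._voices = {voice.name: voice for voice in voices}
        self._current = None

    def get_current_voice(self):
        return self._current

    def set_current_voice(self, voice_name):
        if voice_name not in self._voices:
            raise VoiceNotAvailable(voice_name)
        previous = self._current
        self._current = self._voices[voice_name]
        return previous  # the previously selected voice, as in the draft

    def say(self, text):
        # The manager-level call is a thin forwarder to the current voice.
        if self._current is None:
            raise VoiceNotAvailable("no current voice selected")
        self._current.say(text)
```

A client would create its voices once, then alternate calls to set_current_voice and say.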
 
I have also introduced getVoices at the SpeechManager level. The idea of making the voices global at this level comes from Peter's wish to allow Gnopernicus to interoperate with other self-voicing apps. With this, if GS supports multiple clients as a singleton, a Gnopernicus-aware self-voicing app can speak with a Gnopernicus-defined voice. In this respect, it might be useful to also notify the clients when Voices come and go.
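
The "Voices come and go" notification could look roughly like this in a Python model. None of these names exist in the draft: add_voice_listener, publish_voice and withdraw_voice are hypothetical, voices are represented by their names only, and the callback signature (event, voice_name) is an assumption.

```python
class SpeechManager:
    def __init__(self):
        self._voice_names = set()
        self._listeners = []

    def add_voice_listener(self, callback):
        # callback(event, voice_name), where event is "added" or "removed"
        self._listeners.append(callback)

    def get_voices(self):
        return sorted(self._voice_names)

    def publish_voice(self, voice_name):
        if voice_name not in self._voice_names:
            self._voice_names.add(voice_name)
            self._notify("added", voice_name)

    def withdraw_voice(self, voice_name):
        if voice_name in self._voice_names:
            self._voice_names.remove(voice_name)
            self._notify("removed", voice_name)

    def _notify(self, event, voice_name):
        # Every registered client hears about every change.
        for callback in self._listeners:
            callback(event, voice_name)
```

With this, a second client sharing the same manager learns about a voice defined by the first one without having to poll getVoices.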
 
Marc, please turn this pseudo-IDL into a real Bonobo IDL ;-), if we want to go on with this approach.
 
Any comments and suggestions are welcome.
 
Best regards,
Draghi
 

