[g-a-devel]Gnome Speech Architecture Proposal - the IDL

From: "Draghi Puterity" <mp baum de>
To: "Marc Mulcahy" <marc mulcahy sun com>
Cc: <gnome-accessibility-devel gnome org>, "Thomas Friehoff" <tf baum de>
Subject: [g-a-devel]Gnome Speech Architecture Proposal - the IDL
Date: Thu, 16 May 2002 12:54:41 +0200

Hi Marc, hi All,

Here is a very crude draft of an IDL for my recent proposal for the Gnome-Speech architecture.

Draghi

---------------------------------------------------------------------------

#include <Bonobo_Unknown.idl>

module GnomeSpeech
{
    struct ParameterRange
    {
        string name;
        any minValue;
        any maxValue;
    };

    struct Parameter
    {
        string name;
        any value;
    };

    // list type returned by getSupportedParameters()
    typedef sequence <Parameter> ParameterList;

    interface Speaker; // forward declaration

    //-------
    // VOICE
    //-------
    interface Voice : Bonobo::Unknown
    {
        // EXCEPTIONS
        exception ParameterNotSupported {};
        exception ParameterOutOfRange {};
        exception WrongValueType {};

        // ATTRIBUTES
        readonly attribute string Name;
        readonly attribute Speaker OriginatingSpeaker;

        // FUNCTIONS
        // the callback type is still open (hence the ???)
        boolean registerMarkerListener (in string userCookie, in unsigned short flags, in ??? callback);
        void say (in string text);
        void shutUp ();
        boolean isSpeaking ();
        ParameterList getSupportedParameters ();
        any getParameterValue (in string parameterName)
            raises (ParameterNotSupported);
        void setParameterValue (in string parameterName, in any value)
            raises (ParameterNotSupported, ParameterOutOfRange, WrongValueType);
    };

    typedef sequence <Voice> VoiceList;

    //---------
    // SPEAKER
    //---------
    interface Speaker : Bonobo::Unknown
    {
        // EXCEPTIONS
        exception VoiceAlreadyExists {};
        exception VoiceNotFound {};

        // ATTRIBUTES
        readonly attribute string Name;
        readonly attribute string TTSEngineName;

        // FUNCTIONS
        ParameterList getSupportedParameters ();
        ParameterRange getParameterRange (in string parameterName)
            raises (Voice::ParameterNotSupported);
        Voice createVoice (in string voiceName)
            raises (VoiceAlreadyExists);
        void removeVoice (in string voiceName)
            raises (VoiceNotFound);
    };

    typedef sequence <Speaker> SpeakerList;

    //----------------
    // SPEECH MANAGER
    //----------------
    interface GnomeSpeechManager : Bonobo::Unknown
    {
        // EXCEPTIONS
        exception SpeakerNotSupported {};
        exception VoiceNotAvailable {};

        // ATTRIBUTES
        readonly attribute string Version;
        readonly attribute long SpeakerCount;
        readonly attribute long VoiceCount;

        // FUNCTIONS
        SpeakerList getSpeakers ();
        VoiceList getVoices ();
        Speaker getSpeaker (in string speakerName)
            raises (SpeakerNotSupported);
        Voice getVoice (in string voiceName)
            raises (VoiceNotAvailable);
        Voice getCurrentVoice ();
        Voice setCurrentVoice (in string voiceName)
            raises (VoiceNotAvailable); // returns the previously selected voice
        // the callback type is still open (hence the ???)
        boolean registerMarkerListener (in string userCookie, in unsigned short flags, in ??? callback);
        void say (in string text);
        void shutUp ();
        boolean isSpeaking ();
    };
}; // module GnomeSpeech

------------------------------------------------------

As I'm not familiar with Bonobo and CORBA, I'd like to add some comments and raise some issues.

As you can see, we have three objects: the SpeechManager, the Speaker, and the Voice.

The SpeechManager is the main entry point for the client. The client asks the SM for a list of available speakers (I'm not sure whether we need SpeakerCount, or whether we can get that from the SpeakerList). getSpeaker returns a Speaker object if one with that name is found. I don't know whether Bonobo allows a method to return an object like this, but I have assumed so.

The Speaker object is there to let us query the available parameter ranges. The Speaker cannot speak ;-))). In order to speak, you need to ask the Speaker to create a Voice with createVoice. The client provides a name for the Voice (Speakers have predefined names!). createVoice returns a Voice object with some default parameter settings.
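To make the Speaker/Voice split concrete, here is a rough in-memory sketch in Python. It is not Bonobo code and not a proposed implementation; the snake_case method names, the `ranges` dictionary, and the choice of the range midpoint as the default parameter value are all assumptions made only for illustration.

```python
class ParameterNotSupported(Exception):
    pass


class ParameterOutOfRange(Exception):
    pass


class VoiceAlreadyExists(Exception):
    pass


class Voice:
    def __init__(self, name, speaker):
        self.name = name
        self.originating_speaker = speaker
        # Assumption: default value is the midpoint of each supported range.
        self._params = {p: (lo + hi) / 2 for p, (lo, hi) in speaker.ranges.items()}
        self.spoken = []  # stand-in for real synthesis output

    def get_parameter_value(self, name):
        if name not in self._params:
            raise ParameterNotSupported(name)
        return self._params[name]

    def set_parameter_value(self, name, value):
        if name not in self._params:
            raise ParameterNotSupported(name)
        lo, hi = self.originating_speaker.get_parameter_range(name)
        if not lo <= value <= hi:
            raise ParameterOutOfRange(name)
        self._params[name] = value

    def say(self, text):
        self.spoken.append(text)


class Speaker:
    """Knows the parameter ranges and creates Voices; it cannot speak itself."""

    def __init__(self, name, ranges):
        self.name = name
        self.ranges = dict(ranges)  # hypothetical: parameter name -> (min, max)
        self._voices = {}

    def get_parameter_range(self, name):
        if name not in self.ranges:
            raise ParameterNotSupported(name)
        return self.ranges[name]

    def create_voice(self, voice_name):
        if voice_name in self._voices:
            raise VoiceAlreadyExists(voice_name)
        voice = Voice(voice_name, self)
        self._voices[voice_name] = voice
        return voice


# The client names the voice; the speaker name is predefined.
speaker = Speaker("demo-speaker", {"rate": (50, 400)})
voice = speaker.create_voice("alerts")
voice.set_parameter_value("rate", 200)
voice.say("Hello")
```

The point of the sketch is only that range checking lives with the Speaker, while the per-voice state and the speech calls live with the Voice.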

What can we do with a Voice? We can change the supported parameters within the allowed ranges, and ask it to say something, shut up, etc. We also need to be able to receive notifications such as speech markers (at least EOS). For that I provided registerMarkerListener, but I have some doubts that this is the right way. Doesn't Bonobo support some sort of standard event generation mechanism, like COM's connection points? If so, we should use that instead of my home-brewed registerMarkerListener.
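For discussion, this is roughly the behaviour I have in mind for registerMarkerListener, sketched in plain Python rather than CORBA. The flag values `MARKER_EOS` and `MARKER_WORD`, the callback signature `(cookie, marker)`, and the idea of emitting one word marker per whitespace-separated word are hypothetical; whatever event mechanism Bonobo offers may look quite different.

```python
MARKER_EOS = 0x01   # hypothetical flag value for the EOS marker
MARKER_WORD = 0x02  # hypothetical flag value: a word boundary was reached


class Voice:
    def __init__(self):
        self._listeners = []

    def register_marker_listener(self, user_cookie, flags, callback):
        """Remember the callback; flags select which markers it wants."""
        if not callable(callback):
            return False
        self._listeners.append((user_cookie, flags, callback))
        return True

    def _emit(self, marker):
        # Deliver the marker only to listeners that asked for it.
        for cookie, flags, callback in self._listeners:
            if flags & marker:
                callback(cookie, marker)

    def say(self, text):
        # Stand-in for synthesis: one word marker per word, then EOS.
        for _ in text.split():
            self._emit(MARKER_WORD)
        self._emit(MARKER_EOS)


events = []
voice = Voice()
registered = voice.register_marker_listener(
    "screen-reader", MARKER_EOS, lambda cookie, marker: events.append((cookie, marker))
)
voice.say("hello world")
```

The user cookie is handed back unchanged with every notification, so a client with several voices can tell which registration a marker belongs to.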

So the client keeps track of the Voice objects it created and generates speech through them. Using GNOME Speech like this is very flexible, and it also solves a problem that might arise in the future if we want to allow multiple voices to speak simultaneously.

But there is more.

I would also like to introduce the "current voice" approach, as it is more convenient for some clients. At the SpeechManager level I have provided a get/setCurrentVoice pair and duplicated the speech functions from the Voice object (say, shutUp, isSpeaking, etc.). Typically the client will create the voices it wants once, at the beginning, and then just switch between them before saying something. We expect SRS in Gnopernicus to use this approach. Given the Voice objects, the implementation of the CurrentVoice concept is almost trivial.
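A small Python sketch of why the current-voice layer is cheap once Voice objects exist. Again this is only an illustration under assumptions: the class and method names are Python stand-ins for the IDL above, and raising VoiceNotAvailable from `say` when no voice has been selected is my guess, not something the draft specifies.

```python
class VoiceNotAvailable(Exception):
    pass


class Voice:
    def __init__(self, name):
        self.name = name
        self.spoken = []  # stand-in for real synthesis output

    def say(self, text):
        self.spoken.append(text)


class SpeechManager:
    def __init__(self, voices):
        self._voices = {v.name: v for v in voices}
        self._current = None

    def get_voice(self, voice_name):
        try:
            return self._voices[voice_name]
        except KeyError:
            raise VoiceNotAvailable(voice_name) from None

    def get_current_voice(self):
        return self._current

    def set_current_voice(self, voice_name):
        """Select a voice and return the previously selected one."""
        previous = self._current
        self._current = self.get_voice(voice_name)  # unchanged if this raises
        return previous

    def say(self, text):
        # The manager-level call simply delegates to the current voice.
        if self._current is None:
            raise VoiceNotAvailable("no current voice selected")
        self._current.say(text)


# Create the voices once, then just switch between them before speaking.
normal, alert = Voice("normal"), Voice("alert")
manager = SpeechManager([normal, alert])
manager.set_current_voice("normal")
manager.say("Reading text")
previous = manager.set_current_voice("alert")
manager.say("Battery low")
```

Everything the manager does here is a lookup plus a delegation, which is why I call the implementation almost trivial.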

I have also introduced getVoices at the SpeechManager level. The idea of making the voices global at this level comes from Peter's wish to allow Gnopernicus to interoperate with other self-voicing apps. With this, if GS supports multiple clients as a singleton, a Gnopernicus-aware self-voicing app can speak with a voice defined by Gnopernicus. In this respect, it might be useful to also notify clients when Voices come and go.

Marc, please turn this pseudo-IDL into a real Bonobo IDL ;-) if we want to go on with this approach.

Any comments and suggestions are welcome.

Best regards,

Draghi

Follow-Ups:
- Re: [g-a-devel]Gnome Speech Architecture Proposal - the IDL
  - From: Michael Meeks
