Orca Speech API



Hello Will and all,

As I promised at GUADEC, here are my brief comments on the current
speech API in Orca.  I hope you find them useful.  As we agreed, the
goal is to make the API as simple as possible.  Some features may be
implemented in the layer above (the 'speech' module in Orca), some in
the layers below (the different speech server implementations).

My comments on the existing SpeechServer instance methods:

getInfo(self):

  This method is often used as 'server.getInfo()[0]' or
  'server.getInfo()[1]', so it might be more practical
  to have two separate methods: 'name()' and 'id()'.
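
  A minimal sketch of what the split could look like; the constructor
  and attribute names here are made up for illustration, not existing
  Orca code:

```python
class SpeechServer:
    def __init__(self, name, id):
        # Hypothetical constructor, just to make the sketch runnable.
        self._name = name
        self._id = id

    def getInfo(self):
        # Current style: callers must remember what each index means.
        return [self._name, self._id]

    def name(self):
        # Proposed: a self-documenting accessor instead of getInfo()[0].
        return self._name

    def id(self):
        # Proposed: a self-documenting accessor instead of getInfo()[1].
        return self._id
```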

getVoiceFamilies(self):

  Ok.

queueText(...), queueTone(...), queueSilence(...):

  These methods are not used anywhere in Orca code, so
  they may be removed.

speakCharacter(self, character, acss=None):

  Ok.

isSpeaking(self):

  This is not used within Orca, except through the HTTP interface.
  It is an open question whether anything behind that interface
  uses it, but it would definitely be nice to drop this method
  altogether.

speak(self, text=None, acss=None, interrupt=True):

  Ok.

sayAll(self, utteranceIterator, progressCallback):

  This is the most complex point of the interface.  In fact, it
  does the same thing as SSML: you can change voices and their
  properties within the text, and you get progress notifications.
  I mention this because it might be practical to use SSML
  directly, since it avoids certain limitations.  For example,
  with SSML you can change voice properties within a sentence
  without breaking it into pieces (and breaking the lexical
  structure for the synthesizer).  Did you consider such problems,
  and are you satisfied with the current solution?
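
  To illustrate the point (a hedged sketch, not Orca code): with
  standard SSML a prosody change can sit inside one sentence, so the
  synthesizer still receives the sentence as a single lexical unit.
  The helper below simply builds such markup:

```python
def ssml_utterance(prefix, emphasized, suffix, pitch="+20%"):
    # <speak> is the SSML document root; <prosody> is the standard
    # SSML element for changing pitch/rate/volume mid-text.
    return ('<speak>%s<prosody pitch="%s">%s</prosody>%s</speak>'
            % (prefix, pitch, emphasized, suffix))

# One sentence, with a pitch change only on the word "Enter":
markup = ssml_utterance("Press ", "Enter", " to continue.")
```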

speakUtterances(self, list, acss=None, interrupt=True):

  This method seems redundant to me.  The same effect could be
  achieved by queuing the messages with the speak command (with
  'interrupt' set to False).  Or am I missing something?
  BTW, the 'acss' argument is never used within Orca.
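
  The equivalence I have in mind, sketched as a free function over a
  server with the existing speak() signature (only the first utterance
  may interrupt; the rest queue behind it):

```python
def speakUtterances(server, utterances, acss=None, interrupt=True):
    # Express speakUtterances() in terms of repeated speak() calls.
    for i, text in enumerate(utterances):
        # Interrupt current speech only for the first item, if at all.
        server.speak(text, acss=acss, interrupt=(interrupt and i == 0))
```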

increaseSpeechRate(self, step=5), decreaseSpeechRate(self, step=5),
increaseSpeechPitch(self, step=0.5), decreaseSpeechPitch(self,step=0.5):

  The argument 'step' is never used, so it might be omitted.  Moreover,
  it might be better to implement increasing and decreasing in the
  layer above and only set an absolute value at the speech API level.
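
  A sketch of that arrangement, assuming a hypothetical absolute
  setRate() method at the speech API level (the class and names are
  illustrations, not existing code):

```python
class RateControl:
    """Layer-above helper: keeps the current rate and the step size,
    and only ever sends an absolute value down to the server."""

    def __init__(self, server, rate=50, step=5):
        self._server = server
        self._rate = rate
        self._step = step

    def increase(self):
        # Clamp to an assumed 0-100 scale before pushing the value down.
        self._rate = min(100, self._rate + self._step)
        self._server.setRate(self._rate)  # hypothetical API call

    def decrease(self):
        self._rate = max(0, self._rate - self._step)
        self._server.setRate(self._rate)  # hypothetical API call
```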

stop():

  Ok.

shutdown():

  Ok.

reset():

  Ok.  Maybe the same effect could be achieved by shutting down and
  creating a new instance?

In addition, I suggest a new method 'speakKey(self, key)'.  Currently
key names are constructed by Orca and spoken using the 'speak' method,
but some backends (such as Speech Dispatcher) then lose the chance to
handle keys in a better way, such as playing a sound instead of the key
name, or caching the synthesized key name for a key identifier, etc.
The current Orca handling could be kept as a fallback for speech
servers which don't handle keys themselves.
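What I mean, as a rough sketch (the class names and attributes are
assumptions for illustration, not a proposed implementation): the base
class keeps today's behaviour as the fallback, and a key-aware backend
overrides it.

```python
class SpeechServer:
    """Fallback behaviour: construct the key name and speak it as
    ordinary text, as Orca does today."""

    def speak(self, text=None, acss=None, interrupt=True):
        self.lastSpoken = text  # stand-in for real synthesis

    def speakKey(self, key):
        self.speak(key, interrupt=True)

class SpeechDispatcherLikeServer(SpeechServer):
    """A backend that understands key events overrides speakKey(),
    e.g. to play a sound or reuse a cached synthesis for the key."""

    def speakKey(self, key):
        self.lastKeyEvent = key  # stand-in for a native key event
```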

Another concern I have is the handling of Unicode.  Currently, AFAIK,
Orca works with UTF-8 strings internally.  This is not a very Pythonic
approach; it would be better to use Python's unicode type internally
and only encode/decode to UTF-8 on output/input.  This would have many
practical advantages (especially when handling character offsets in
callback contexts).  I don't know what your plans are in this respect,
so I would be grateful if you could let me know.
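
The offset problem in a nutshell (shown here with the str/bytes split
of today's Python; in the Python 2 of the time the pair was
unicode/str, but the point is the same): in a UTF-8 byte string a
non-ASCII character occupies several bytes, so byte offsets and
character offsets disagree.

```python
text = "caf\u00e9 latte"        # 10 characters
encoded = text.encode("utf-8")  # 11 bytes: 'é' is two bytes in UTF-8

char_offset = text.index("latte")      # counted in characters: 5
byte_offset = encoded.index(b"latte")  # counted in bytes: 6
```

A callback reporting byte offset 6 into the UTF-8 data would point one
position past the character a caller working with character indices
expects.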

Well, this is all for now.  I'll be glad to discuss any of the problems
in more detail.

Kindest regards

Tomas Cerha

-- 
Brailcom, o.p.s. http://www.brailcom.org
Free(b)soft project http://www.freebsoft.org
Eurochance project http://eurochance.brailcom.org


