Re: Thoughts on speech



Willie Walker wrote:
> * While it may not be the pristine perfect solution, Speech Dispatcher 
> seems to fit nicely into the overall requirements we need as a larger 
> community.

Hello, I am not saying SD is pristine perfect, but are there any
concrete indications that it is NOT?

>    o Is the configuration simple enough and/or can it be made simpler?

The main point is that (unlike Gnome Speech) it IS configurable, and
that makes a number of different usage scenarios possible.  I don't
think, however, that its configuration is something to be done by end
users unless they want to experiment.  Speech Dispatcher should be
configured by the package maintainer for the OS/distribution.  All the
settings which are interesting for a typical end user, such as voice,
rate, etc., are controlled through Orca, so the user doesn't deal with
Speech Dispatcher configuration directly.

>    o What additional work needs to be done to integrate it more tightly 
> with Orca?

There are two ways to look at this problem:
  1) What needs to be done in Speech Dispatcher and its Orca backend to
     support all the functionality needed in the current Orca design
  2) What changes in Orca design can be done to make use of the
     capabilities offered by Speech Dispatcher

For 1), there are two concrete things to do:
  a) Solve the verbalization issues described at
     http://bugzilla.gnome.org/show_bug.cgi?id=440114
  b) Solve word boundary callbacks (see below).

For 2) I won't go into much detail now, but the advantages would more
often be in cleaner code than in end-user visible features.  That is
simply because the SD API uses a higher level of abstraction than
Gnome Speech.  There is a well-defined system of interaction between
different kinds of messages, and SD automatically makes sure that
certain messages don't get interrupted by others, etc.  Orca currently
relies just on SPEAK and STOP and handles all the interaction itself
(for example by storing the last time key echo was performed in a
global variable and testing this variable in places where keyboard
echo interruption is not desirable).  Another advantage is that there
is no need for ANY driver-specific code.  The interface is completely
transparent from this point of view and all driver-specific code is
located in SD output modules.
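To make the message-interaction point concrete, here is roughly what
the client-side traffic looks like in SSIP (Speech Dispatcher's
protocol).  The command and priority names follow the SSIP
documentation; server responses are omitted, and the exact wording
should be checked against the SSIP manual:

```
SET self PRIORITY important
SPEAK
Battery level is critical.
.
SET self PRIORITY progress
SPEAK
Loading document, 50%
.
```

The client only declares a priority for each message; SD itself
enforces the interaction rules (a "progress" message never interrupts
an "important" one), so none of the global-variable bookkeeping Orca
does today would be needed.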

Ok, I promised to tell more about the callbacks.  SD currently does
not support automatic callbacks on word boundaries; instead, it
supports callbacks through index marks in SSML.  So the text would
need to be marked in Orca (in its SD backend) by inserting index marks
at word boundaries.  When I was implementing this in the Orca backend,
I discovered that Orca currently doesn't handle these callbacks, so I
left this functionality out.  If Orca decides to use these word
boundary callbacks, this will need to be added to the SD backend.
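A sketch of what such marking in the SD backend could look like (this
is purely illustrative -- the function name and the "word:N" mark
naming scheme are my invention, and the whitespace regexp is only a
guess at word boundaries, as discussed below):

```python
import re
from xml.sax.saxutils import escape

def mark_words(text):
    """Wrap plain text in SSML, inserting an index mark before each word.

    Word boundaries are guessed with a whitespace-based regexp; this is
    a simplification, not a lexically correct segmentation.
    """
    pieces = []
    for i, match in enumerate(re.finditer(r'\S+', text)):
        pieces.append('<mark name="word:%d"/>%s' % (i, escape(match.group())))
    return '<speak>%s</speak>' % ' '.join(pieces)

print(mark_words("Hello Orca world"))
```

SD would then report each mark name back to the client as the
synthesizer reaches it, which is what gives Orca its word boundary
callback.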

I have, however, one concern about that -- the correct detection of
word boundaries is a language-dependent operation, and thus the only
place where it can really be done correctly is the speech synthesizer,
because only the synthesizer knows the lexical structure of the text.
On the other hand, if the intended use is only positioning the cursor,
it might be a reasonable simplification to merely "guess" the
boundaries using a regular expression, without the intention of being
absolutely lexically correct.  This needs some more consideration and
Will's feedback.  Adding automatic word boundary callbacks to Speech
Dispatcher (without the need for SSML and index marks) is also a
viable option.
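For the cursor-positioning case, the "guess" could be as simple as
this (again only a sketch; the function name is mine, and the regexp
must match the one used when numbering the index marks):

```python
import re

def cursor_offset_for_mark(text, mark_index):
    """Map a word-boundary index mark back to a character offset.

    Assumes marks were numbered over the same whitespace-based word
    guess used when the SSML was generated; purely illustrative.
    """
    starts = [m.start() for m in re.finditer(r'\S+', text)]
    return starts[mark_index]

print(cursor_offset_for_mark("move the cursor here", 2))  # -> 9
```

When the synthesizer reports reaching mark "word:2", Orca would place
the cursor at that offset.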

I believe it is also important to consider other possible advantages of
using SSML within Orca.  This may have a major impact on the future of
Orca speech capabilities and the API.

Best regards,

Tomas
