Re: [orca-list] speech dispatcher (Re: punctuation not spoken properly when reviewing by character in firefox)



Hello Will,

you are touching on a very important point.  I'm cc'ing the Gnome
Accessibility devel list, since this is very closely related to the
recent thread named "Thoughts on speech" there.  Please, everyone
reading this on the Orca list, let's continue the discussion on
gnome-accessibility-devel to avoid cross posting.

My remarks below:

Willie Walker wrote:
> The other is determining where text preprocessing such as lexical
> analysis and word/grapheme expansion/substitution should be done.  The
> decision we made when working on Orca was to not depend upon the speech
> synthesis engine to provide these features, because all speech synthesis
> engines do them differently (if at all).  Early on, my phone was already
> ringing because the experience was different between DECtalk users and
> FreeTTS users.  We started putting these features in the TTS layer for
> Orca.  The resulting model was to rely upon the speech engine to
> basically say what we told it to say, except for perhaps doing things
> such as abbreviation expansion and guessing proper pronunciations for
> homographs.

Yes, I very much understand the motivation for this approach.  We also
had this in the original design of Speech Dispatcher, but we later
abandoned it.  This issue was re-raised just about a week ago on the
Speech Dispatcher mailing list.  See
http://lists.freebsoft.org/pipermail/speechd/2008q1/001106.html for a
brief but very informative discussion.

Anyway, I believe it is a "nice to have" feature in Orca, but it should
be possible to turn it off.  Some synthesizers can do a better job
themselves, but it is a good fallback for dumb synthesizers, or for
people who prefer consistency.  Please note that the Orca-controlled
text expansion may not work for everyone.  While it works pretty well
for English, it may be a pain for a non-English speaker when the
translations are not available -- in this case even a dumb synthesizer
may give much better results.
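To make the kind of client-side expansion we are discussing concrete, here is a minimal Python sketch; the table and function names are purely illustrative, not Orca's actual code:

```python
import re

# Purely illustrative abbreviation table -- not Orca's actual data.
ABBREVIATIONS = {"Dr.": "Doctor", "etc.": "et cetera"}

def expand_abbreviations(text):
    """Expand known abbreviations so every engine speaks them the same way."""
    for abbr, expansion in ABBREVIATIONS.items():
        text = text.replace(abbr, expansion)
    return text

def mark_capitals(text):
    """Prefix capitalized words with 'capital', mimicking what some engines
    (reportedly eSpeak, in some configurations) do on their own."""
    return re.sub(r'\b([A-Z][a-z]*)', r'capital \1', text)

print(expand_abbreviations("Dr. Smith arrived."))  # Doctor Smith arrived.
print(mark_capitals("hello World"))                # hello capital World
```

Doing this in the client gives the same spoken result on every engine, which is exactly the consistency argument above -- and exactly why it should be possible to switch it off when the engine does it better.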

> Since that time, we now have experimental speech dispatcher support in
> Orca.  Among all the other stuff it does, speech dispatcher provides
> additional text preprocessing features that overlap with Orca's, and the
> two sometimes compete.  As a result, there is still work to do to
> identify when/where Orca's text preprocessing should be used and
> when/where speech dispatcher's should be used.  And, as is suggested,
> when to let the speech synthesis engine itself provide this
> functionality.  This is a very complex problem, both for the user and
> for the code.  For example, it becomes very difficult to identify where
> things are occurring: you mention that "Espeak will say capital before
> each such word".  This is not something I was aware of in my experiences
> with eSpeak, and my first guess was that speech dispatcher was actually
> doing this.

No, Speech Dispatcher does no text processing itself.  From this point
of view it really is just a dispatcher.  See the above-mentioned thread
on the Speech Dispatcher mailing list for more details.  All text
expansion is currently done in the synthesizer.

Well, with Orca and the Speech Dispatcher backend, there are still
places where the expansion is done by Orca before the text gets to the
speech API, so we have no other choice than to speak what Orca sends
us.  This is one of the most important problems of the current Speech
Dispatcher backend.  Some refactoring is required to solve this,
although I don't think it is a lot of work.  The greater part of it is
covered by the following bugzilla entry:
http://bugzilla.gnome.org/show_bug.cgi?id=440114
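For those unfamiliar with the internals: the dispatching happens over the SSIP protocol, where the client frames text and hands it over untouched.  A rough sketch of that framing (simplified from the real protocol, for illustration only):

```python
def ssip_speak_block(text):
    """Frame text for a (simplified) SSIP SPEAK command: the client sends
    SPEAK, then the text, then a line holding a single dot as terminator.
    A text line that itself starts with a dot is escaped by doubling it,
    so the dispatcher never has to interpret the text's content."""
    lines = [("." + ln if ln.startswith(".") else ln)
             for ln in text.split("\n")]
    return "SPEAK\r\n" + "\r\n".join(lines) + "\r\n.\r\n"

print(repr(ssip_speak_block("hello")))  # 'SPEAK\r\nhello\r\n.\r\n'
```

The escaping is the only transformation involved -- the text reaches the synthesizer exactly as the client produced it, which is the sense in which Speech Dispatcher "really is just a dispatcher".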

> Moving forward, I would like to see a common speech service for the
> whole system.  This speech service should be available to many
> applications, on both text and graphical consoles.  I would like to
> see this system provide features for handling capital letters,
> punctuation, pronunciation dictionaries, multilingual text, ACSS
> definitions to group speaking attributes such as voice/pitch/rate,
> etc.

Some of it is already provided by Speech Dispatcher.  The other features
are part of the specification of the TTS API.  See
http://www.freebsoft.org/tts-api.  The TTS API is being implemented as
part of the new Speech Dispatcher architecture.
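As a rough illustration of what an ACSS definition amounts to -- a named bundle of speaking attributes the client can select as a unit -- consider this hypothetical sketch (not Orca's actual acss module, and the attribute scales are assumptions):

```python
from dataclasses import dataclass, asdict

@dataclass
class ACSS:
    """Hypothetical aural-CSS-style bundle of speaking attributes."""
    family: str = "default"   # voice family name
    rate: int = 50            # assumed 0..100, engine-relative
    pitch: int = 50           # assumed 0..100
    volume: int = 100         # assumed 0..100

# e.g. a distinct voice the screen reader could select for capital letters
uppercase_voice = ACSS(pitch=70)
print(asdict(uppercase_voice))
```

Grouping the attributes like this means "speak this in the uppercase voice" is one request to the speech service, rather than four separate parameter changes that can get out of sync.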

> From my perspective, it would be great if much of this support could
> be done somewhere besides Orca, because it would simplify Orca.
> Speech dispatcher is definitely something that seems to be close to
> providing this, and community members such as Kenny Hitt, David
> Csercsics, and Tomas Cerha have been working to make it more stable
> and appropriate.

Yes, and I believe my work on the Orca speech API refactoring is
important regardless of the choice of speech system.  The bug which
started this discussion is a good example of the things which must be
changed in Orca to provide a good level of separation between the
speech-related functionality and the screen reading functionality.  If
such a level of separation is present, it will be far easier to switch
to any other speech system in the future, not just Speech Dispatcher.
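The separation argued for here can be pictured as a thin backend interface that the screen-reading code programs against; a minimal sketch (the interface and method names are hypothetical, not Orca's actual classes):

```python
from abc import ABC, abstractmethod

class SpeechBackend(ABC):
    """What the screen reader sees; GNOME Speech, Speech Dispatcher, or
    any future speech system would each sit behind one implementation."""

    @abstractmethod
    def speak(self, text, acss=None):
        """Queue text for speaking, optionally with voice attributes."""

    @abstractmethod
    def stop(self):
        """Interrupt any speech in progress."""

class RecordingBackend(SpeechBackend):
    """Test double that just records what would have been spoken."""
    def __init__(self):
        self.spoken = []
    def speak(self, text, acss=None):
        self.spoken.append(text)
    def stop(self):
        self.spoken.clear()

backend = RecordingBackend()
backend.speak("Hello, world")
print(backend.spoken)  # ['Hello, world']
```

With this kind of boundary in place, switching speech systems means writing one new backend class, and none of the screen-reading logic needs to change.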

Finally, one note about the current state of the Speech Dispatcher
backend in Orca.  I've already said that I no longer consider it
experimental.  There may still be problems similar to the one discussed
above, but these are not problems of the backend itself.  Most of these
problems don't show up when Gnome Speech is used, but they do with
Speech Dispatcher.  From this point of view it can be said that the
support for Speech Dispatcher is still experimental to some extent, but
it must be understood that the problem lies neither in Speech
Dispatcher nor in the Orca Speech Dispatcher backend itself.  I'm not
blaming anyone for these problems; it was me who introduced them, after
all.  I just want to avoid confusion.

Best regards

Tomas


