Re: [g-a-devel] [orca-list] speech dispatcher (Re: punctuation not spoken properly when reviewing by character in firefox)



Hello Will,

You are touching on a very important point.  I'm cc'ing the Gnome
Accessibility devel list, since this is very closely related to the
recent thread named "Thoughts on speech" there.  To everyone reading
this on the Orca list: please let's continue the discussion on
gnome-accessibility-devel to avoid cross-posting.

My remarks below:

Willie Walker wrote:
> The other is determining where text preprocessing such as lexical
> analysis and word/grapheme expansion/substitution should be done.  The
> decision we made when working on Orca was to not depend upon the speech
> synthesis engine to provide these features because all speech synthesis
> engines do them differently (if at all).  Early on, my phone was already
> ringing because the experience was different between DECtalk users and
> FreeTTS users.  We started putting these features in the TTS layer for
> Orca.  The resulting model was to rely upon the speech engine to
> basically say what we told it to say, except for perhaps doing things
> such as abbreviation expansion and guessing proper pronunciations for
> homographs.

Yes, I very much understand the motivation for this approach.  We also
had it in the original design of Speech Dispatcher, but we later
abandoned it.  This issue was re-raised just about a week ago on the
Speech Dispatcher mailing list.  See
http://lists.freebsoft.org/pipermail/speechd/2008q1/001106.html for a
brief but very informative discussion.

Anyway, I believe it is a "nice to have" feature in Orca, but it should
be possible to turn it off.  Some synths can do a better job themselves,
but it is a good fallback for dumb synths or for people who prefer
consistency.  Please note that the Orca-controlled text expansion may
not work for everyone.  While it works pretty well for English, it may
be a pain for a non-English speaker when the translations are not
available -- in that case even a dumb synthesizer may give much better
results.
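To illustrate what I mean by client-side expansion with an off switch, here is a minimal sketch.  The dictionary entries and rules are purely illustrative (they are not Orca's actual code); the point is that the whole pass can be bypassed so the raw text reaches a capable synthesizer untouched:

```python
# Illustrative sketch of client-side text expansion: a pronunciation
# dictionary plus "capital" announcement, with a toggle so a capable
# synthesizer can be left to do the job itself.
PRONUNCIATION = {
    "etc.": "et cetera",
    "e.g.": "for example",
}

def expand(text, enabled=True):
    """Expand abbreviations and mark capitalized words, unless disabled."""
    if not enabled:
        return text  # pass the raw text through to the synthesizer
    words = []
    for word in text.split():
        word = PRONUNCIATION.get(word.lower(), word)
        if word[:1].isupper():
            word = "capital " + word
        words.append(word)
    return " ".join(words)
```

With `enabled=False` the text is handed over verbatim, which is exactly what a non-English user with no translations available would want.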

> Since that time, we now have experimental speech dispatcher support in
> Orca.  Among all the other stuff it does, speech dispatcher provides
> additional text preprocessing features that overlap with Orca's, and the
> two sometimes compete.  As a result, there is still work to do to
> identify when/where Orca's text preprocessing should be used and
> when/where speech dispatcher's should be used.  And, as is suggested,
> when to let the speech synthesis engine itself provide this
> functionality.  This is a very complex problem, both for the user and
> for the code.  For example, it starts becoming very difficult to
> identify where things are occurring.  For example, you mention that
> "Espeak will say capital before each such word".  This is not something
> I was aware of in my experiences with eSpeak, and my first guess was
> that speech dispatcher was actually doing this.

No, Speech Dispatcher does no text processing itself.  From this point
of view it really is just a dispatcher.  See the above-mentioned thread
on the Speech Dispatcher mailing list for more details.  All text
expansion is currently done in the synthesizer.
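The "pure dispatcher" model can be sketched like this (an illustrative mock, not Speech Dispatcher's real code): the server applies no processing of its own and simply routes the client's text to the selected output module, and the engine behind that module is where any expansion happens:

```python
# Illustrative mock of the dispatcher model: text passes through the
# dispatcher unchanged and reaches the selected synthesizer module.
class EspeakModule:
    name = "espeak"

    def synthesize(self, text):
        # A real module would hand `text` to the engine; the engine
        # itself decides how to expand punctuation, capitals, etc.
        return f"[{self.name}] {text}"

class Dispatcher:
    def __init__(self):
        self.modules = {}
        self.current = None

    def register(self, module):
        self.modules[module.name] = module
        self.current = self.current or module.name

    def speak(self, text):
        # No preprocessing here: the text is forwarded verbatim.
        return self.modules[self.current].synthesize(text)
```

This is why the "capital" announcements in the bug that started this thread had to come from eSpeak itself, not from Speech Dispatcher.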

Well, with Orca and the Speech Dispatcher backend, there are still
places where the expansion is done by Orca before the text gets to the
speech API, so we have no choice other than to speak what Orca sends us.
This is one of the most important problems of the current Speech
Dispatcher backend.  Some refactoring is required to solve this,
although I don't think it is a lot of work.  The greater part of it is
covered by the following bugzilla entry:
http://bugzilla.gnome.org/show_bug.cgi?id=440114

> Moving forward, I would like to see a common speech service for the
> whole system.  This speech service should be available for many
> applications, both via the text and graphical consoles.  I would like to
> see this system provide features for handling capital letters,
> punctuation, pronunciation dictionaries, multilingual text, ACSS
> definitions to group speaking attributes such as voice/pitch/rate/etc,
> etc..

Some of it is already provided by Speech Dispatcher.  The other features
are part of the specification of the TTS API.  See
http://www.freebsoft.org/tts-api.  The TTS API is being implemented as
part of the new Speech Dispatcher architecture.

> From my perspective, it would be great if much of this support
> could be done somewhere besides Orca because it would simplify Orca. 
> Speech dispatcher is definitely something that seems to be close to
> providing this, and community members such as Kenny Hitt, David
> Csercsics, and Tomas Cerha have been working to make it more stable and
> appropriate.

Yes, and I believe my work on the Orca speech API refactoring is
important regardless of the choice of speech system.  The bug which
started this discussion is a good example of the things which must be
changed in Orca to provide a good level of separation between the
speech-related functionality and the screen-reading functionality.  If
such separation is present, it will be far easier to switch to any other
speech system in the future, not just Speech Dispatcher.
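The kind of separation I have in mind can be sketched as a narrow backend interface (method names here are hypothetical, not Orca's actual API): the screen reader talks only to this interface, so switching speech systems means writing a new backend, not touching the screen-reading logic:

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of a speech backend interface.  The screen reader
# depends only on this contract; each speech system provides its own
# implementation behind it.
class SpeechBackend(ABC):
    @abstractmethod
    def speak(self, text): ...

    @abstractmethod
    def speak_character(self, char): ...

    @abstractmethod
    def stop(self): ...

class LoggingBackend(SpeechBackend):
    """A trivial backend for testing: records what would be spoken."""
    def __init__(self):
        self.log = []

    def speak(self, text):
        self.log.append(text)

    def speak_character(self, char):
        self.log.append(f"char:{char}")

    def stop(self):
        self.log.append("stop")
```

A Gnome Speech backend and a Speech Dispatcher backend would then be interchangeable behind the same interface, which is the whole point of the refactoring.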

Finally, one note about the current state of the Speech Dispatcher
backend in Orca.  I've already said that I no longer consider it
experimental.  There may still be problems similar to the one discussed
above, but these are not problems of the backend itself.  Most of these
problems don't show up when Gnome Speech is used, but they do with
Speech Dispatcher.  From this point of view it can be said that the
support for Speech Dispatcher is still experimental to some extent, but
it must be understood that the problem is neither in Speech Dispatcher
nor in the Orca Speech Dispatcher backend itself.  I'm not blaming
anyone for these problems; it was me who introduced them, after all.  I
just want to avoid confusion.

Best regards

Tomas

