Re: [orca-list] Punctuation, capital letters, exchange of characters and strings, generally error in the design of Orca

Thanks Milan!

One of the questions I have right now is about the ability for a client to programmatically configure various things in SpeechDispatcher, such as pronunciations for words. Looking at the existing API, I'm not sure I see a way to do this. Nor am I sure whether this is something a SpeechDispatcher user needs to do engine by engine, or whether there is a pronunciation dictionary that SpeechDispatcher provides for all output modules to use.
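To make the question concrete, here is a rough sketch of the kind of client-side pronunciation dictionary I have in mind. All of the names here are invented for illustration; I'm not claiming SpeechDispatcher offers anything like this today.

```python
import re

class PronunciationDictionary:
    """Hypothetical sketch: map written forms to spoken forms before
    text is handed off to the speech service.  Nothing like this is
    confirmed to exist in SpeechDispatcher's current API."""

    def __init__(self):
        self._entries = {}

    def add(self, word, spoken):
        self._entries[word] = spoken

    def apply(self, text):
        # Replace whole words only, leaving all other text untouched.
        if not self._entries:
            return text
        pattern = re.compile(
            r"\b(" + "|".join(re.escape(w) for w in self._entries) + r")\b")
        return pattern.sub(lambda m: self._entries[m.group(1)], text)

# Usage: the client would substitute just before speaking.
d = PronunciationDictionary()
d.add("GNOME", "guh-nome")
print(d.apply("Welcome to GNOME"))  # -> Welcome to guh-nome
```

The open question is whether such a substitution belongs in the client, in SpeechDispatcher itself, or per engine in each output module.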

In addition, I'm still unclear about the policy in place for providing support for features that are not supported by the engine itself. My confusion is the result of these snippets from one of Jan's messages:

    WW> OK - so, no additional processing of the text is done. It is
    WW> just dispatched.
    JB> Right.

I interpreted the answer of "Right" to mean that no additional processing of the text is done - the text is just sent directly to the engine. Meaning no verbalized punctuation, no capitalization, nothing, unless the engine supports it.

    WW> Here's where I get confused. 'Emulates' implies to me that some
    WW> processing is done in SD if the engine doesn't support it.
    JB> No.

I wasn't sure how to interpret "No", but my interpretation was that emulation was NOT done, and this seems to match my interpretation of "Right" above. But, maybe "No" meant something like "No, speech dispatcher itself doesn't do emulation, but that can be done at a lower layer in the speech dispatcher internals." If that's the case, from the client's point of view, it's still speech dispatcher, and the client can now depend upon speech dispatcher to do the emulation.

    WW> So, doesn't this mean that SD is not just a dispatcher, but it
    WW> also contains code to provide support for features missing in a
    WW> speech engine?
    JB> In current SD implementation  the emulation can be in its output
    JB> module.  The new python implementation is split into two parts:
    JB> 1. Message dispatching
    JB> 2. TTS API Provider

To me, this says emulation IS done, or at least it is possible. So, I'm confused.

Let me try to rephrase this question: from Orca's point of view, if text is handed off to speech dispatcher via speechd, will we be guaranteed that the appropriate emulation will be provided for features that are not supported by a speech engine? For example, if an audio cue is desired for capital letters, will the Orca user be guaranteed that something in Speech Dispatcher will play an audio icon for capitalization if the engine doesn't support this directly? Or, if verbalized punctuation is not supported by the engine, will the Orca user be guaranteed that something in Speech Dispatcher will emulate the support if the engine does not support this directly?
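To illustrate what I mean by "emulation", here is a toy sketch of the two features in question, done purely in text. The function names and the spoken forms are my own inventions for the example, not SpeechDispatcher behavior; a real audio icon for capitals would of course need the audio layer rather than a spoken prefix.

```python
# Toy emulation of verbalized punctuation and capital-letter cues,
# the kind of processing that would have to happen somewhere between
# the client and the engine if the engine itself can't do it.
PUNCTUATION_NAMES = {
    ".": "period", ",": "comma", "?": "question mark",
    "!": "exclamation point", ";": "semicolon",
}

def verbalize_punctuation(text):
    """Replace punctuation characters with their spoken names."""
    out = []
    for ch in text:
        if ch in PUNCTUATION_NAMES:
            out.append(" " + PUNCTUATION_NAMES[ch] + " ")
        else:
            out.append(ch)
    return "".join(out)

def mark_capitals(text, cue="cap "):
    """Prefix each capital letter with a spoken cue (a stand-in for
    playing an audio icon)."""
    return "".join(cue + ch if ch.isupper() else ch for ch in text)

print(verbalize_punctuation("Hello, world."))  # -> Hello comma  world period
print(mark_capitals("Hello World"))            # -> cap Hello cap World
```

The question is whether Orca can rely on some layer of Speech Dispatcher to do this when the engine can't, or whether the client must do it itself.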


Milan Zamazal wrote:
"WW" == Willie Walker <William Walker Sun COM> writes:

    WW> As I dig into it more, I believe the idea is that the core
    WW> server of SpeechDispatcher is (and will remain?) a C-based
    WW> service that talks to TTS engines.  It listens on sockets and
    WW> communicates with external applications like Orca via Brailcom's
    WW> SSIP protocol.

Current Speech Dispatcher does so.  The new implementation will be
implemented completely in Python and will allow, in addition to SSIP,
other forms of communication.  Most notably it will allow direct Python
calls from clients such as Orca.
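For readers less familiar with SSIP, a minimal sketch of its wire format may help: commands are CRLF-terminated lines, and the message text after a SPEAK command is sent as a data block ended by a lone "." line, with leading dots doubled (as in SMTP). This sketch only builds the bytes; a real client would write them to Speech Dispatcher's socket and read the numbered replies, and it omits session setup commands entirely.

```python
def ssip_speak_block(text):
    """Build the byte sequence a client would send to speak `text`
    over SSIP: the SPEAK command, the dot-escaped data lines, and the
    terminating "." line."""
    lines = []
    for line in text.split("\n"):
        if line.startswith("."):
            line = "." + line  # dot-stuffing, as in SMTP
        lines.append(line)
    data = "\r\n".join(lines)
    return ("SPEAK\r\n" + data + "\r\n.\r\n").encode("utf-8")

print(ssip_speak_block("Hello\n.hidden line"))
# -> b'SPEAK\r\nHello\r\n..hidden line\r\n.\r\n'
```

The language bindings exist precisely so clients don't have to assemble these sequences by hand.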

    WW> On the client side, there will be a number of language bindings
    WW> that provide convenience mechanisms for handling SSIP.

The new implementation supports SSIP, so current language bindings can
be used without change.  New features can be made accessible by
extending SSIP (and its bindings).

    WW> So, the Python bindings in question are really for helping
    WW> Python-based clients, such as Orca, to talk to the
    WW> SpeechDispatcher service via SSIP.

No.  As the new Speech Dispatcher is basically a Python library, Orca
can call it directly.

    WW>   As my knowledge of SpeechDispatcher grows, so do my questions
    WW> about what is complete and what is planned.  Like the AT-SPI,
    WW> which is composed of the AT-SPI IDL, ATK, GAIL, a bridge for
    WW> ATK, a bridge for Java, C bindings for the client side, etc.,
    WW> there are a lot of components to SpeechDispatcher (TTSAPI, SSIP,
    WW> language bindings, TTS engine drivers, config files, etc.) that
    WW> make the learning curve a little steep.  Couple a lack of
    WW> knowledge about the complete SpeechDispatcher picture with the
    WW> possibility that things will be changing in the future, and it
    WW> gets a little daunting.

The first stable version of the new Speech Dispatcher is planned to be
fully feature- and SSIP-compatible with the current Speech Dispatcher.
From the point of view of applications, no changes will be *necessary*.
From the point of view of the user, it is likely that the dotconf
configuration will be replaced by a Python based configuration.
As for the internal architecture there are about six major components:
Message dispatching mechanism (queueing, priorities, etc.), client
interfaces (SSIP etc.), output modules (speech, braille, etc.), common
TTS API, TTS drivers, configuration handling.  With the exception of
configuration handling, at least something basic has already been
written for all of the parts.  But not everything is complete, fully
working, and fully documented.  Actually some things are still quite
incomplete and a lot of work remains.  But the overall architecture has
already been mostly defined, in the sense of existing source code.
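As an illustration only, the split between the common TTS API and the TTS drivers might look something like the following. The class and method names are invented for this sketch; the real, in-progress interfaces may look quite different.

```python
import abc

class TTSDriver(abc.ABC):
    """Hypothetical minimal interface a TTS engine driver would expose
    to the common TTS API layer."""

    @abc.abstractmethod
    def say(self, text):
        """Send text to the engine."""

    @abc.abstractmethod
    def supports(self, feature):
        """Report whether the engine natively supports a feature,
        e.g. 'punctuation' or 'cap-icon'."""

class EchoDriver(TTSDriver):
    """Toy driver that 'speaks' by recording the text it is given."""

    def __init__(self):
        self.spoken = []

    def say(self, text):
        self.spoken.append(text)

    def supports(self, feature):
        return False  # a bare engine: emulation would be needed elsewhere

driver = EchoDriver()
driver.say("hello")
print(driver.spoken, driver.supports("punctuation"))
# -> ['hello'] False
```

With such a split, people interested in supporting a new engine would only need to implement the driver interface.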

I don't think there is anything there that brave men should be scared
of :-).  Organizing the source into separate Python libraries should
make the parts easier to study and use.  And one needn't study all the
parts of the software: client writers care only about interfaces,
people interested in supporting TTS engines can focus just on writing
TTS drivers, and users of alternative output devices will write new
output modules.  It should all be easier than it is today, I hope.

    WW> I still need to do a more thorough examination of the Python
    WW> speechd bindings to see how well they map to the overall
    WW> requirements.

I don't consider it that important.  If the core Speech Dispatcher
Python libraries are flexible enough (and this is one of the design
goals), you can define any bindings you like on top of them.


Milan Zamazal
