[g-a-devel]GNOME Speech IDL
- From: Willie Walker <william walker sun com>
- To: Bill Haneman <bill haneman sun com>
- Cc: Michael Meeks <michael ximian com>, Marc Mulcahy <marc mulcahy sun com>, accessibility mailing list <gnome-accessibility-devel gnome org>, Rich Burridge <Rich Burridge sun com>, Paul Lamere <Paul Lamere sun com>, peter korn sun com, Marney Beard <marney beard sun com>
- Subject: [g-a-devel]GNOME Speech IDL
- Date: Thu, 10 Oct 2002 08:56:29 -0400
Hi All:
My feeling is that this is now approaching the slippery slope
of creating a more complete speech API. That is, my original
understanding was that this was meant to be an API solely for
screen readers. It now appears to be turning into an API for
speech for the entire desktop. This is much farther reaching
and needs to be approached very carefully.
Given the expanding role, I think it would be prudent to
investigate existing APIs such as SAPI or JSAPI. These
have been created with input from many speech engine vendors
and application developers and are relatively stable. I think
a GNOME speech API modeled after an existing API would yield
some very fruitful results and would also reduce the learning
curve for developers already familiar with speech.
I'm somewhat partial to JSAPI over SAPI since the general input
I get from the speech community is that JSAPI is a more complete
and less ambiguous specification than SAPI. When it comes to
providing consistent behavior across speech engines, the less
ambiguous the spec is, the better. You can find the JSAPI specs
at:
http://java.sun.com/products/java-media/speech/
There's also an IETF effort, SpeechSC, that's looking to address
distributed speech:
http://www.ietf.org/html.charters/speechsc-charter.html
If you have any questions, please send them on to us at
speech-core east sun com -- the Sun speech team is not on the
gnome accessibility or gnome speech development lists.
Willie Walker
Principal Investigator
Speech Group
Sun Microsystems Laboratories
Bill Haneman wrote:
>
> On Thu, 2002-10-10 at 09:43, Michael Meeks wrote:
> > Hi Marc,
> ...
> > In which case - I'd very strongly recommend making the 'language' a
> > 'string' which is reasonably extensible - instead of an enum which
> > (being flat) just isn't.
>
> I agree, and I think Paul from the FreeTTS team pointed this out also.
> There are standardized "locale" strings, etc. used for textual output
> already; we should probably use them or another ISO-standard locale
> string format if available.
>
> Paul had some other comments which I think have merit though I don't
> agree with his proposed solutions in some cases.
>
> Also Paul mentioned in a meeting that he felt the voice services should
> have some "global" configurability. This is somewhat at odds with the
> possibility that speech servers might be shared, remote, etc.
>
> I think we might treat the "default" settings for drivers and
> their voices (i.e. those parameters used by gnome-speech if the client
> does not explicitly set them) as something independent of the gnome-speech
> IDL, perhaps controllable via gconf at the gnome-speech level. That way a
> user could go into a "Speech" control panel on the GNOME desktop and set
> things like default driver, preferred (default) voices, default speech
> rate, etc. This would address his point without adding weird "global"
> APIs to what is otherwise a client-centric service API.
>
> -Bill
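
To make the string-versus-enum point in the quoted thread concrete, here is a minimal sketch of how a client might match a requested locale string against the voices a driver reports. It is only an illustration under stated assumptions: the function names split_locale and pick_voice, and the idea that a driver exposes its voices as a list of locale strings, are hypothetical and are not part of the gnome-speech IDL.

```python
from typing import Optional, Sequence, Tuple


def split_locale(tag: str) -> Tuple[str, Optional[str]]:
    """Split 'en_US' or 'en-us' into ('en', 'US'); region is None if absent."""
    # Drop any encoding or modifier suffix such as '.UTF-8' or '@euro'.
    base = tag.split(".", 1)[0].split("@", 1)[0]
    parts = base.replace("-", "_").split("_")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return language, region


def pick_voice(requested: str, available: Sequence[str]) -> Optional[str]:
    """Return the entry of `available` that best matches `requested`.

    Prefers an exact language+region match, then falls back to the first
    voice with the same language. Returns None when nothing matches.
    """
    want_lang, want_region = split_locale(requested)
    fallback = None
    for candidate in available:
        lang, region = split_locale(candidate)
        if lang != want_lang:
            continue
        if region == want_region:
            return candidate  # exact match wins immediately
        if fallback is None:
            fallback = candidate  # remember the first same-language voice
    return fallback


# Hypothetical list of voices a driver might report.
voices = ["fr_FR", "en_US", "en_GB"]
print(pick_voice("en_GB", voices))
print(pick_voice("en_AU", voices))
```

A flat enum cannot express this kind of fallback without listing every language and region combination in advance, whereas a string carries the hierarchy with it and new locales need no interface change.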