[g-a-devel]Re: GNOME Speech IDL



Hi all,

I've got to totally agree with Will here. A lot of developers out there
are very familiar with JSAPI, and several vendors have already done
implementations. If the GNOME Speech API were based on JSAPI, these
developers would be very comfortable doing a GNOME Speech equivalent,
perhaps even sharing a lot of common code. JSAPI is already fully
documented, and it already covers speech recognition as well as TTS.

I wonder whether there is a Java-to-IDL compiler out there. If there were,
it would make generating the various IDL files a fairly simple task.
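
To give a feel for what such a tool would do, here is a minimal sketch that
uses reflection to print IDL-style text for a Java interface. Everything in
it is an assumption for illustration: SimpleSynthesizer is a made-up,
much-simplified interface (not the real JSAPI Synthesizer), the type table
is deliberately tiny, and the output is not the gnome-speech IDL.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;

class JavaToIdlSketch {

    // Hypothetical, much-simplified interface; NOT the real JSAPI Synthesizer.
    interface SimpleSynthesizer {
        void speak(String text);
        boolean isSpeaking();
        void setRate(double wordsPerMinute);
    }

    // Tiny illustrative type table; a real Java-to-IDL mapping covers far more cases.
    private static final Map<Class<?>, String> IDL_TYPES = Map.of(
        void.class, "void",
        boolean.class, "boolean",
        int.class, "long",
        double.class, "double",
        String.class, "string");

    static String idlType(Class<?> javaType) {
        String idl = IDL_TYPES.get(javaType);
        if (idl == null) {
            throw new IllegalArgumentException("No IDL mapping for " + javaType.getName());
        }
        return idl;
    }

    // Emit one IDL-style operation per declared method, sorted by name for stable output.
    static String toIdl(Class<?> iface) {
        StringBuilder out = new StringBuilder("interface " + iface.getSimpleName() + " {\n");
        Method[] methods = iface.getDeclaredMethods();
        Arrays.sort(methods, Comparator.comparing(Method::getName));
        for (Method m : methods) {
            out.append("    ").append(idlType(m.getReturnType()))
               .append(' ').append(m.getName()).append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    out.append(", ");
                }
                // Parameter names are not reliably available via reflection, so use argN.
                out.append("in ").append(idlType(params[i])).append(" arg").append(i);
            }
            out.append(");\n");
        }
        return out.append("};\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(toIdl(SimpleSynthesizer.class));
    }
}
```

A real mapping would also have to deal with exceptions, listeners,
overloaded methods and object references, so this only shows the general
shape of the task.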


------Forwarded Message------
>From william walker sun com  Thu Oct 10 09:02:07 2002
From: Willie Walker <william walker sun com>
Subject: GNOME Speech IDL
Sender: William Walker sun com
To: Bill Haneman <bill haneman sun com>
Cc: Michael Meeks <michael ximian com>, Marc Mulcahy <marc mulcahy sun com>,
 accessibility mailing list <gnome-accessibility-devel gnome org>,
 Rich Burridge <Rich Burridge Sun COM>, Paul Lamere <Paul Lamere sun com>,
 peter korn sun com, Marney Beard <marney beard sun com>
Reply-to: William Walker sun com

Hi All:

My feeling is that this is now approaching the slippery slope 
of creating a more complete speech API.  That is, my original 
understanding was that this was meant to be an API solely for 
screen readers.  It now appears to be turning into a speech API 
for the entire desktop.  This is much more far-reaching
and needs to be approached very carefully.

Given the expanding role, I think it would be prudent to 
investigate existing APIs such as SAPI or JSAPI.  These 
have been created with input from many speech engine vendors 
and application developers and are relatively stable.  I think 
a GNOME speech API modeled after an existing API would yield 
some very fruitful results and would also reduce the learning 
curve for developers already familiar with speech.

I'm somewhat partial to JSAPI over SAPI since the general input
I get from the speech community is that JSAPI is a more complete 
and less ambiguous specification than SAPI.  When it comes to
providing consistent behavior across speech engines, the less
ambiguous the spec is, the better.  You can find the JSAPI specs 
at:

    http://java.sun.com/products/java-media/speech/

There's also an IETF effort, SpeechSC, that's looking to address 
distributed speech:

    http://www.ietf.org/html.charters/speechsc-charter.html

If you have any questions, please send them on to us at
speech-core east sun com -- the Sun speech team is not on the 
gnome accessibility or gnome speech development lists.

Willie Walker
Principal Investigator
Speech Group
Sun Microsystems Laboratories

Bill Haneman wrote:
> 
> On Thu, 2002-10-10 at 09:43, Michael Meeks wrote:
> > Hi Marc,
> ...
> >       In which case - I'd very strongly recommend making the 'language' a
> > 'string' which is reasonably extensible - instead of an enum which
> > (being flat) just isn't.
> 
> I agree, and I think Paul from the FreeTTS team pointed this out also.
> There are standardized "locale" strings, etc. used for textual output
> already; we should probably use them or another ISO-standard locale
> string format if available.
> 
> Paul had some other comments which I think have merit though I don't
> agree with his proposed solutions in some cases.
> 
> Also Paul mentioned in a meeting that he felt the voice services should
> have some "global" configurability.  This is somewhat at odds with the
> possibility that speech servers might be shared, remote, etc.
> 
> I think we might consider making the "default" settings for drivers and
> their voices (i.e. those parameters used by gnome-speech if the client
> does not explicitly set them) something independent of the gnome-speech
> IDL, perhaps controllable via gconf at the gnome-speech level.  That way a
> user could go into a "Speech" control panel on the GNOME desktop and set
> things like default driver, preferred (default) voices, default speech
> rate, etc.  This would address his point without adding weird "global"
> APIs to what is otherwise a client-centric service API.
> 
> -Bill
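
To make the string-versus-enum point above concrete, here is a minimal
sketch of voice selection keyed on a locale string. The Voice class, the
voice names and the pick() helper are hypothetical. It assumes BCP 47 style
tags such as "en-US" as understood by java.util.Locale; POSIX-style names
such as "en_US" would need converting first.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

class VoiceLanguageSketch {

    // Hypothetical voice description; the language is a plain locale string, not an enum.
    static final class Voice {
        final String name;
        final String language;

        Voice(String name, String language) {
            this.name = name;
            this.language = language;
        }
    }

    // Prefer an exact locale match, then fall back to any voice for the same language.
    static Optional<Voice> pick(List<Voice> voices, String requested) {
        Locale want = Locale.forLanguageTag(requested);
        if (want.getLanguage().isEmpty()) {
            return Optional.empty(); // unusable tag: no language subtag to match on
        }
        Optional<Voice> exact = voices.stream()
            .filter(v -> Locale.forLanguageTag(v.language).equals(want))
            .findFirst();
        if (exact.isPresent()) {
            return exact;
        }
        return voices.stream()
            .filter(v -> Locale.forLanguageTag(v.language).getLanguage()
                               .equals(want.getLanguage()))
            .findFirst();
    }

    public static void main(String[] args) {
        List<Voice> voices = List.of(
            new Voice("voice-a", "en-US"),
            new Voice("voice-b", "de-DE"));
        // No en-GB voice exists, so the en-US voice is used as a same-language fallback.
        System.out.println(pick(voices, "en-GB").map(v -> v.name).orElse("none"));
        // No French voice at all.
        System.out.println(pick(voices, "fr-FR").map(v -> v.name).orElse("none"));
    }
}
```

With a string, a driver can advertise a new language or regional variant
without anyone touching the IDL, which is the extensibility a flat enum
lacks.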

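Bill's idea of desktop-wide defaults kept outside the IDL could look
roughly like the sketch below: a value set explicitly by the client wins,
otherwise the desktop default applies. The class name and the keys
("driver", "voice", "rate") are hypothetical, and the plain map merely
stands in for whatever a control panel would store in gconf.

```java
import java.util.HashMap;
import java.util.Map;

class SpeechDefaultsSketch {

    // Stand-in for desktop-wide defaults (for example, values kept in gconf).
    private final Map<String, String> desktopDefaults;

    // Settings this particular client has set explicitly.
    private final Map<String, String> clientSettings = new HashMap<>();

    SpeechDefaultsSketch(Map<String, String> desktopDefaults) {
        this.desktopDefaults = new HashMap<>(desktopDefaults);
    }

    // Record an explicit per-client override.
    void set(String key, String value) {
        clientSettings.put(key, value);
    }

    // Client override if present, otherwise the desktop default (null if neither exists).
    String get(String key) {
        return clientSettings.getOrDefault(key, desktopDefaults.get(key));
    }

    public static void main(String[] args) {
        SpeechDefaultsSketch speech = new SpeechDefaultsSketch(
            Map.of("driver", "example-driver", "rate", "180"));
        speech.set("rate", "220");                 // this client wants faster speech
        System.out.println(speech.get("driver"));  // falls back to the desktop default
        System.out.println(speech.get("rate"));    // uses the client's own setting
    }
}
```

The service API stays client-centric: the fallback lives behind get(), so
no "global" calls have to be added to the IDL itself.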


