Re: [g-a-devel] speech recognition



Hi Willie,

Thanks for the info. 

I've been thinking about speech recognition engines and where they
should be integrated. I now think that the window manager is not the
place for it; it should be at a 'lower' level.

I'm currently hacking on a little daemon which uses the sphinx2
recognition engine to convert speech to text after which it sends this
text to the keyboard driver (using the uinput device driver). This
means that I'll be able to use my voice to 'type' every keyboard
character. (My current implementation already does this for a limited
set of characters.)
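
As a rough sketch of the translation step only (not the daemon's actual
code), the following shows how a recognized word could be turned into
the key press/release records that a uinput device accepts. The names
KEYCODES, key_events and write_events are hypothetical; the key codes
are the ones from linux/input-event-codes.h, and the record layout
assumes the usual 64-bit Linux struct input_event. Opening /dev/uinput
and registering the virtual keyboard with ioctl calls is left out.

```python
import struct
import time

# Event types and codes as defined in linux/input-event-codes.h.
EV_SYN, EV_KEY = 0x00, 0x01
SYN_REPORT = 0

# Hypothetical vocabulary: recognized word -> Linux key code.
KEYCODES = {
    "escape": 1,   # KEY_ESC
    "enter": 28,   # KEY_ENTER
    "space": 57,   # KEY_SPACE
    "a": 30,       # KEY_A
    "i": 23,       # KEY_I
    "q": 16,       # KEY_Q
    "w": 17,       # KEY_W
}

# struct input_event: timeval (two longs), u16 type, u16 code, s32 value.
# Assumption: native layout on 64-bit Linux.
EVENT_FORMAT = "llHHi"


def key_events(word):
    """Return press and release events for a recognized word.

    Unknown words produce an empty list, so nothing is typed.
    """
    code = KEYCODES.get(word)
    if code is None:
        return []
    return [
        (EV_KEY, code, 1),         # key down
        (EV_SYN, SYN_REPORT, 0),   # flush the press
        (EV_KEY, code, 0),         # key up
        (EV_SYN, SYN_REPORT, 0),   # flush the release
    ]


def write_events(device, events):
    """Pack each event as an input_event record and write it to device.

    device is any binary file-like object; in a real daemon it would be
    an already configured /dev/uinput handle.
    """
    for ev_type, code, value in events:
        now = time.time()
        sec = int(now)
        usec = int((now - sec) * 1_000_000)
        device.write(struct.pack(EVENT_FORMAT, sec, usec, ev_type, code, value))
```

Saying "escape" would then produce one press and one release of the
Escape key, which is what vi needs to leave insert mode.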

Because it's on the keyboard level, the X server configuration does
not need to be changed (it just listens to the keyboard device driver)
and I can 'type' in every application (I will use it for instance to
switch modes in vi and bash). This implementation is window manager
independent and you can 'type' in any terminal (so you don't even need
X).

The GNOME accessibility framework can be used to make the interfaces
of GNOME applications more suitable for this kind of input handling
(for instance showing options on screen like GOK does and numbering
them for easy recognition, numbering links in the browser or lines in
an editor, word completion, etc.)
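
To illustrate the numbering idea, here is a small sketch that assigns a
spoken number to each on-screen option and resolves a recognized word
back to the option. It is independent of any accessibility API; the
names NUMBER_WORDS, number_options and select are hypothetical.

```python
# Hypothetical vocabulary of number words the recognizer is trained on.
NUMBER_WORDS = ["one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine"]


def number_options(labels):
    """Map number words to option labels, in display order.

    Options beyond the available number words are left unnumbered.
    """
    return {NUMBER_WORDS[i]: label
            for i, label in enumerate(labels[:len(NUMBER_WORDS)])}


def select(options, spoken):
    """Return the option for a recognized word, or None if no match."""
    return options.get(spoken.strip().lower())
```

For example, with the options "Open", "Save" and "Quit" on screen,
saying "two" would select "Save".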

What do you think about this approach? 

Kind Regards,

Peter 







On 8/8/05, Willie Walker <William Walker sun com> wrote:
> Hi Peter:
> 
> > I was wondering about the current status of speech recognition in the
> > gnome-speech library. I've browsed through the sources and cvs but I
> > only found out that an API version 1.0 is in the making which includes
> > speech input processing. Where do I find the current API version and
> > what is the speech input status?
> 
> The current API version is 0.3.7, which was released earlier this year.
> We are currently developing assistive technologies (e.g., Gnopernicus
> and Orca) against this API as a means to better validate its viability.
> 
> At one point, there was an effort to model the gnome-speech API after
> the Java Speech API (JSAPI), but that has since gone dormant.  It may
> be worthwhile to investigate this some more as it might address some
> of the issues and limitations (e.g., no speech input) of the current
> 0.3.7 API.
> 
> Hope this helps,
> 
> Will
> 
>


