Re: Making gnome speak



On Tue, 24 Aug 1999, Ricardo Fernández Pascual wrote:

>     Think of the possibilities: any gnome program would be able to speak
> just by calling some gnome_say(gchar *text) function. Balsa (or whatever
> gnome mail client) could say "you have an email" at the cost of a single
> function call, and then start reading it aloud...

There is a program in CVS called gspeech that does just that.  It's not
very polished, but it does use festival to speak buttons and whatnot.

Both the cool thing about it and the problem with it is that it runs as a
GTK plugin.  That means all you have to do to enable speech output in an
app is add --gtk-module=gspeech to the program's command line.  That's
it.  Where that breaks down is that every application has different
requirements when it comes to speaking to the user, and GTK carries no
information about those requirements.

I'm the author of a package that covers the other side of the coin, voice
recognition.  It's based on IBM's ViaVoice, and thus is not 'free' as a
whole (though my code is GPL'd).  It can also run as a plugin, sorta.  The
cool part is that it supports construction of vocabularies based on
Glade's XML interface definitions, and thus is almost as easy to use as a
simple plugin.
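
To give a rough idea of what building a vocabulary from an interface
definition can look like, here is a minimal sketch in Python.  It is not
gvoice's code.  The XML sample is a simplified, made-up layout that only
loosely resembles a Glade file, and build_vocabulary() is a hypothetical
helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified interface description (an assumption for this
# sketch); real Glade files carry far more detail per widget.
SAMPLE = """
<GTK-Interface>
  <widget>
    <class>GtkWindow</class>
    <name>main_window</name>
    <widget>
      <class>GtkButton</class>
      <name>open_button</name>
      <label>Open File</label>
    </widget>
    <widget>
      <class>GtkButton</class>
      <name>quit_button</name>
      <label>Quit</label>
    </widget>
  </widget>
</GTK-Interface>
"""

def build_vocabulary(xml_text):
    """Map each spoken phrase (the lower-cased label) to a widget name."""
    root = ET.fromstring(xml_text)
    vocabulary = {}
    for widget in root.iter("widget"):
        # findtext() only looks at direct children, so a container does
        # not pick up the labels of the widgets nested inside it.
        label = widget.findtext("label")
        name = widget.findtext("name")
        if label and name:
            vocabulary[label.strip().lower()] = name.strip()
    return vocabulary

if __name__ == "__main__":
    for phrase, target in build_vocabulary(SAMPLE).items():
        print(f"say '{phrase}' to activate {target}")
```

Widgets without a label (the window here) simply contribute nothing, which
is exactly the gap that hand-written markup would have to fill.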

However, all this is ad hoc, and not very well designed from a usability
standpoint.  I read most of a paper written by a guy who took PowerPoint
and rewrote it (yes, REWROTE) to be sane for voice-control.  It simply is
not possible in most cases to just slap voice-control or text-to-speech on
top of an existing application.  You're going to have to A) design the
application from the ground up to make sense with voice-control, and B)
mark it up with appropriate information so voice-control can be done.

I would propose that Gtk+ be extended to make it easier to tie in the bits
of information that are needed to properly describe a GUI in terms of
voice control and output, and probably extend Glade and libglade to set up
all this information.
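
As a sketch of the kind of per-widget information I mean, here is a small
Python model.  Everything in it is hypothetical: VoiceInfo, Widget and
command_table() are names invented for illustration, and nothing like
them exists in Gtk+, Glade or libglade today.

```python
from dataclasses import dataclass

@dataclass
class VoiceInfo:
    """Hypothetical per-widget speech metadata (not a Gtk+ or Glade type)."""
    commands: list       # phrases a recognizer should accept for this widget
    spoken_label: str    # what a synthesizer should say to describe it

@dataclass
class Widget:
    """Stand-in for a GUI widget; only the fields this sketch needs."""
    name: str
    voice: VoiceInfo = None   # None means the widget has no voice markup

def command_table(widgets):
    """Collect phrase -> widget name for every widget with voice markup."""
    table = {}
    for w in widgets:
        if w.voice is None:
            continue
        for phrase in w.voice.commands:
            table[phrase.lower()] = w.name
    return table

widgets = [
    Widget("save_button",
           VoiceInfo(["Save", "Save file"], "Save the current document")),
    Widget("separator1"),   # purely visual, so it carries no voice markup
]

if __name__ == "__main__":
    print(command_table(widgets))
```

The point of the design is that the application author, not the toolkit,
decides which widgets are addressable by voice and how they are described.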

I'm no expert in this, and I have no time to lead such an effort, but I
would gladly participate in it, to the extent of discussion and rewriting
gvoice as necessary to make it do what's needed.  I know there are plenty
of people out there who'd be using this kind of technology daily, but I
don't suppose that very many of these are programmers with time to lead
such a project.  Likely someone with a vested interest (say, a relation
with vision problems) is going to have to show up, if not to lead, then to
at least provide input.

I tried to contact the author of the paper I mentioned, but didn't get a
response.  I would still like to find other experts in the field who can
get on the list and help design this stuff.

TTYL,
    Omega

         Erik Walthinsen <omega@cse.ogi.edu> - Staff Programmer @ OGI
        Quasar project - http://www.cse.ogi.edu/DISC/projects/quasar/
   Video4Linux Two drivers and stuff - http://www.cse.ogi.edu/~omega/v4l2/
        __
       /  \             SEUL: Simple End-User Linux - http://www.seul.org/
      |    | M E G A           Helping Linux become THE choice
      _\  /_                          for the home or office user


