Re: [Bayonne-desktop] Re: GNOME "Telephony Application Programming Services" Proposed



[clipping gnome-hackers and gnome-announce from the reply list]

On Wed, 2003-02-05 at 18:31, David Sugar wrote:
> Bill,
> 
> there were actually a couple of reasons and additional things brought up here 
> which may have some relevance to gnome-speech and accessibility efforts.  
> While it has not been announced yet, we are also starting a separate GNU 
> package, GNU Alexandria, to develop e-government services for the blind.  
> However, our interest in speech and accessibility, as it relates to GNU 
> Bayonne, is in supporting voice-enabled devices and hosted services, rather 
> than in directly solving desktop user accessibility issues.

I think gnome-speech needn't be desktop-specific (as you also imply,
below).  Right now (version "0.2") it's fairly limited, but we have
drafts of what we propose for "1.0" in circulation.  I think we ought to
go ahead and post that to a wider audience soon, so you can have a
look.  It looks a lot like JSAPI, BTW, only de-java-cised. :-)  I
understand that JSAPI has a certain degree of acceptance and familiarity
in the field already so it's a reasonable model for pushing out to folks
in the speech community (so I'm told).

> 
> Issues in supporting effective speech recognition and generation in Bayonne 
> are ones that require support for many concurrent sessions rather than one 
> desktop user.  However, there is no reason why we should not make use of the 
> same underlying speech or ASR engines, APIs, etc.  In particular, in Bayonne, 
> we are already able to extend even limited concurrent speech generation 
> resources through extensive use of smart caching, which reduces the need to 
> actively synthesize voice for each and every instance of a caller.  

Cool; I guess gnome-speech would see that as an implementation issue. 
Gnome-speech is intended as a "service" API so clients need not all come
from one desktop, so a similar scenario should be compatible with the
gnome-speech APIs.
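
To make the caching idea concrete, here is a minimal sketch of how repeated
prompts could be synthesized once and then served from memory to later
callers.  It is not Bayonne's or gnome-speech's actual code: PromptCache,
fake_synthesize and the voice name used below are hypothetical names chosen
for illustration, and a real engine call would replace the stand-in function.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Cache synthesized audio by (voice, text) so each prompt is built once.

    Hypothetical helper for illustration only; not part of any real API.
    """

    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        self._synthesize = synthesize      # the (expensive) engine call
        self._store: Dict[str, bytes] = {}
        self.misses = 0                    # how often the engine was invoked

    @staticmethod
    def _key(voice: str, text: str) -> str:
        # Hash voice and text together; the NUL byte separates the two fields.
        return hashlib.sha256(f"{voice}\0{text}".encode("utf-8")).hexdigest()

    def get(self, voice: str, text: str) -> bytes:
        key = self._key(voice, text)
        audio = self._store.get(key)
        if audio is None:
            # First request for this prompt: synthesize and remember it.
            self.misses += 1
            audio = self._synthesize(voice, text)
            self._store[key] = audio
        return audio


def fake_synthesize(voice: str, text: str) -> bytes:
    """Stand-in for a real TTS engine; returns tagged bytes instead of audio."""
    return f"<{voice}:{text}>".encode("utf-8")


if __name__ == "__main__":
    cache = PromptCache(fake_synthesize)
    for _ in range(3):                     # three callers hear the same prompt
        cache.get("us1", "Welcome to the office.")
    print("engine invocations:", cache.misses)
```

With a cache like this in front of it, a synthesizer that can only handle a
few requests at once still serves many simultaneous callers, as long as most
prompts repeat.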

> To date, I am only aware of several TTS systems that potentially may be freely 
> licensed: mbrola, which I have never used,

mbrola actually has a pretty restrictive license, at least from the POV
of some potential users (i.e. can't be sold without prior permission,
non-commercial, non-military, etc.)  


>  festival, which has license 
> confusion and conflict right from the start (it has multiple licenses, and 
> parts of it, such as the "Edinburgh speech tools" that it uses, are under a 
> non-commercial license,

AFAIK you can get a working speech tools stack with no non-commercial
restrictions; some of the "UK-English" data is non-commercial but the US
voice stuff isn't.

>  and many voices are under proprietary licenses), and flite, which 
> is under a BSD-like license, but has very limited voices.  

There's also FreeTTS, which is flite-based and fairly new; it's at
version 1.1.1 now.  FreeTTS requires a Java VM, which is a disadvantage
for some, but it means that it's completely non-platform-specific. 
Reports are that once the VM is loaded, etc., it's considerably more
performant than Festival or Flite.

> Other things, like 
> rsynth, are probably not all that usable in a practical sense though I 
> suppose individual users can be "trained" to understand its generated speech 
> :).  What are the gnome speech people working with?

Since gnome-speech is foremost a service API, we can support multiple
backends, even non-free ones (the drivers themselves are LGPL).  At the
moment we have working drivers for Eloquence/ViaVoice, Festival, and 
FreeTTS, and I believe drivers exist for DECtalk and serial DEC synths
as well, though they aren't in CVS yet.
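
As a rough illustration of the "service API with multiple backends" idea,
the sketch below shows one common interface with interchangeable drivers
behind it.  It does not reproduce the real gnome-speech interfaces:
SynthesisDriver, EchoDriver and SpeechService are hypothetical names, and
the drivers here return strings rather than producing audio.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SynthesisDriver(ABC):
    """Hypothetical driver interface; one subclass per speech engine."""

    @abstractmethod
    def say(self, text: str) -> str:
        """Speak `text`; this sketch returns a description instead of audio."""


class EchoDriver(SynthesisDriver):
    """Stand-in backend used only for illustration."""

    def __init__(self, engine: str) -> None:
        self.engine = engine

    def say(self, text: str) -> str:
        return f"[{self.engine}] {text}"


class SpeechService:
    """Front end that clients talk to, independent of the engine in use."""

    def __init__(self) -> None:
        self._drivers: Dict[str, SynthesisDriver] = {}

    def register(self, name: str, driver: SynthesisDriver) -> None:
        self._drivers[name] = driver

    def available(self) -> List[str]:
        return sorted(self._drivers)

    def say(self, name: str, text: str) -> str:
        if name not in self._drivers:
            raise LookupError(f"no driver registered as {name!r}")
        return self._drivers[name].say(text)


if __name__ == "__main__":
    service = SpeechService()
    service.register("festival", EchoDriver("festival"))
    service.register("freetts", EchoDriver("freetts"))
    print(service.available())
    print(service.say("freetts", "hello"))
```

Clients depend only on the front end, so a free and a non-free engine can
sit side by side and be swapped without changing client code.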

Given the current lack of free ASR engines we'd probably be developing a
driver for Sphinx when we get to the voice-input side of things.

Anyhow I am pretty sure you can get a Festival stack that's OK
license-wise; you just end up excluding certain bits (like mbrola
support and some voices).  FreeTTS is definitely free, and can use Flite
voices, but as you say voices for Flite and FreeTTS are pretty scarce at
the moment.  There's interest (and progress, I think) in better voice
creation tools for FreeTTS (and by extension, Flite); you might check
out the FreeTTS pages on SourceForge:
http://freetts.sourceforge.net/docs/index.php

Whatever you do, I hope we can coordinate efforts so that we
interoperate/share effort wherever our projects naturally overlap.

best regards,

Bill

> On Wednesday 05 February 2003 13:00, Bill Haneman wrote:
> > Hi David, etc:
> >
> > If you folks are at all looking at telephony-related text-to-speech and
> > speech-recognition issues in relation to your project, I urge you to
> > contact the "gnome-speech" team.  gnome-speech is mostly discussed on
> > gnome-accessibility-list gnome org, but I believe there is a
> > gnome-speech-specific mailing list as well.
> >
> > gnome-speech at the moment is focussed on speech output, but future
> > plans call for voice-activation/speech recognition APIs as well.
> > Gnome-speech defines a common API for speech service delivery regardless
> > of the back-end speech-synthesis or voice-recognition engines, and will
> > handle markup as well (though it currently doesn't implement markup
> > support).
> >
> > regards
> >
> > Bill Haneman
> > GNOME Accessibility Project
> >
> > On Tue, 2003-02-04 at 21:02, David Sugar wrote:
> > > This is very different from that.  GnomeMeeting is a "softphone client",
> > > if you will, where you use a desktop PC as an IP telephone.  TAPS is
> > > interested in supporting the 99% of existing office environments where a
> > > telephone is still a separate physical instrument on a typical office
> > > worker's desk and phone service is typically conducted through a (formerly
> > > proprietary) office PBX which can now be displaced by free phone systems
> > > such as GNU Bayonne.  This is not to say that TAPS has no role in IP
> > > telephony, for many of the same integration issues can also exist with
> > > stand-alone IP phones and softswitches that are kept in isolation from
> > > other system services and where the same need to see and control incoming
> > > calls or dial contact phone numbers automatically still exists.
> > >
> > > On Tuesday 04 February 2003 15:25, Damien Sandras wrote:
> > > > What would be the main difference between that system, and a system
> > > > using the existing GnomeMeeting and phones managed by GnomeMeeting
> > > > itself thanks to additional hardware? GnomeMeeting is already able to
> > > > interact with PC-To-Phone gateways and to interact with normal phone
> > > > devices.
> > > >
> > > > Is your goal very different?
> > > >
> > > > Thanks,
> > > >
> > > > > On Tue 04/02/2003 at 21:08, David Sugar wrote:
> > > > > Have you ever wondered why you have phone numbers collected in your
> > > > > Evolution address book, and yet must manually re-enter numbers on
> > > > > your office telephone when you wish to dial someone?  Would you like
> > > > > > to be able to receive pop-up notification of who's calling you with
> > > > > > the ability to route incoming calls you do not wish to answer under
> > > > > > mouse control? To mark specific calls as billable to clients as they
> > > > > > occur, to track their time for automated invoicing?
> > > > >
> > > > > We hope to be able to finally address these and other issues for both
> > > > > > GNOME desktop users and business application developers in a new free
> > > > > software package being proposed to be known as GNOME TAPS.  This
> > > > > being a project proposal, I am establishing a project outline and a
> > > > > set of goals. With that, this announcement constitutes a call for
> > > > > help and request for input from other GNOME developers in
> > > > > establishing a functional implementation.
> > > > >
> > > > > GNOME TAPS will offer a C callable library to easily integrate
> > > > > telephony functions into existing GNOME applications.  It will use a
> > > > > common TCP based backend protocol to communicate directly with office
> > > > > telephone equipment and services, and ideally for use with free
> > > > > software based telephone systems such as GNU Bayonne.  It will
> > > > > include an applet to pop-up and support handling of incoming calls. 
> > > > > It will include a gnome control-center plugin for setting of
> > > > > telephony options.
> > > > >
> > > > > The first goal will be to support and demonstrate a C callable
> > > > > library and implement a backend service on GNU Bayonne to support the
> > > > > functionality of dialing numbers that may be clicked on or otherwise
> > > > > delivered under application control.  This would be followed by
> > > > > development of a GNOME desktop applet to respond to incoming calls,
> > > > > and finally work on a complete control center plugin to configure and
> > > > > manage GNOME TAPS.
> > > > >
> > > > > Initial discussion and design planning for GNOME TAPS will be carried
> > > > > out on a new mailing list, bayonne-desktop gnu org   This list is
> > > > > open to the public and may be subscribed by sending email to
> > > > > bayonne-desktop-request gnu org   Comments may also be sent to me
> > > > > directly, to sugar gnu org 
> > > > >
> > > > > _______________________________________________
> > > > > gnome-announce-list mailing list
> > > > > gnome-announce-list gnome org
> > > > > http://mail.gnome.org/mailman/listinfo/gnome-announce-list
> > >
> > > _______________________________________________
> > > gnome-hackers mailing list
> > > gnome-hackers gnome org
> > > http://mail.gnome.org/mailman/listinfo/gnome-hackers
-- 
Bill Haneman <bill haneman sun com>



