Re: [Bayonne-desktop] Re: GNOME "Telephony Application Programming Services" Proposed



In GNU Bayonne, all we assume the tts system can do is accept text to 
synthesize along with properties (pitch, voice, etc.) and an output filename 
to write the synthesized audio to.  In that sense, Bayonne doesn't care about 
the "backend" either, or even about the API involved, so long as the API 
permits that much to be accomplished.  However, JSAPI is a strong API model 
for this, and we could certainly make use of a common C/C++ GNOME library for 
this purpose, or try to exercise the limited functionality required by 
Bayonne through such an API.
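
To make that minimal contract concrete, here is a rough sketch in Python. It is only an illustration of "text plus properties in, audio file out"; the names VoiceProperties, TtsBackend, SilenceBackend and speak_to_file are hypothetical and come from neither Bayonne, gnome-speech nor JSAPI, and the stand-in backend writes silence rather than calling a real engine.

```python
import os
import tempfile
import wave
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class VoiceProperties:
    # Hypothetical property set; the message names only "pitch, voice, etc."
    voice: str = "default"
    pitch: float = 1.0


class TtsBackend(Protocol):
    # The whole contract: text and properties in, an audio file out.
    def synthesize(self, text: str, props: VoiceProperties, out_path: str) -> None: ...


class SilenceBackend:
    """Stand-in engine that writes silence; a real driver would call an engine."""

    def synthesize(self, text: str, props: VoiceProperties, out_path: str) -> None:
        frames = 800 * max(1, len(text.split()))  # 0.1 s of 8 kHz audio per word
        with wave.open(out_path, "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)     # 16-bit samples
            wav.setframerate(8000)  # telephony-style sample rate
            wav.writeframes(b"\x00\x00" * frames)


def speak_to_file(backend: TtsBackend, text: str,
                  props: VoiceProperties, out_path: str) -> str:
    # The caller never learns which engine or API produced the file.
    backend.synthesize(text, props, out_path)
    return out_path


if __name__ == "__main__":
    out = os.path.join(tempfile.mkdtemp(), "prompt.wav")
    speak_to_file(SilenceBackend(), "hello world", VoiceProperties(pitch=1.2), out)
    print("wrote", out)
```

Any engine that can satisfy the single synthesize() call could be swapped in without the caller changing, which is the sense in which the backend and API do not matter here.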

We also looked at the same issues with regard to ASR, and also settled on 
trying to work with Sphinx.  I currently have someone experimenting with it 
for a project for the blind, and we were originally looking to integrate it 
directly into Bayonne.  However, it would be worth everyone's while to look 
at whether a common API can be developed that Bayonne simply uses, as you 
suggest for tts.

David

On Wednesday 05 February 2003 14:31, Bill Haneman wrote:
> [clipping gnome-hackers and gnome-announce from the reply list]
>
> On Wed, 2003-02-05 at 18:31, David Sugar wrote:
> > Bill,
> >
> > there were actually a couple of reasons and additional things brought up
> > here which may have some relevance to gnome-speech and accessibility
> > efforts. While it has not been announced yet, we are also starting a
> > separate GNU package, GNU Alexandria, to develop e-government services
> > for the blind. However, our interest in speech and accessibility, as it
> > relates to GNU Bayonne, is in supporting voice enabled devices
> > and hosted services, rather than in directly solving desktop user
> > accessibility issues.
>
> I think gnome-speech needn't be desktop-specific (as you also imply,
> below).  Right now (version "0.2") it's fairly limited, but we have
> drafts of what we propose for "1.0" in circulation.  I think we ought to
> go ahead and post that to a wider audience soon, so you can have a
> look.  It looks a lot like JSAPI, BTW, only de-java-cised. :-)  I
> understand that JSAPI has a certain degree of acceptance and familiarity
> in the field already so it's a reasonable model for pushing out to folks
> in the speech community (so I'm told).
>
> > Issues in supporting effective speech recognition and generation in
> > Bayonne are ones that require support for many concurrent sessions
> > simultaneously rather than one desktop user.  However, there is no reason
> > why we should not make use of the same underlying speech or asr engines,
> > api's, etc.  In particular, in Bayonne, we are already able to extend
> > even limited concurrent speech generation resources through extensive use
> > of smart caching which restricts the need for actively synthesizing voice
> > for each and every instance of a caller.
>
> Cool; I guess gnome-speech would see that as an implementation issue.
> Gnome-speech is intended as a "service" API so clients need not all come
> from one desktop, so a similar scenario should be compatible with the
> gnome-speech APIs.
>
> > To date, I am only aware of several tts systems that potentially may be
> > freely licensed: mbrola, which I have never used,
>
> mbrola actually has a pretty restrictive license, at least from the POV
> of some potential users (i.e. can't be sold without prior permission,
> non-commercial, non-military, etc.)
>
> >  festival, which has license
> > confusion and conflict right from the start (it has multiple licenses, and
> > parts of it, such as the "Edinburgh speech tools" that it uses, are under a
> > non-comm license,
>
> AFAIK you can get a working speech tools stack with no comm
> restrictions; some of the "UK-English" data is noncomm but the US voice
> stuff isn't.
>
> >  and many voices are proprietary licenses), and flite, which
> > is under a BSD-like license, but has very limited voices.
>
> There's also FreeTTS, which is flite-based and fairly new; it's at
> version 1.1.1 now.  FreeTTS requires a Java VM, which is a disadvantage
> for some, but it means that it's completely non-platform-specific.
> Reports are that once the VM is loaded, etc., it's considerably more
> performant than Festival or Flite.
>
> > Other things, like
> > rsynth, are probably not all that usable in a practical sense though I
> > suppose individual users can be "trained" to understand its generated
> > speech :).  What are the gnome speech people working with?
>
> Since gnome-speech is foremost a service API, we can support multiple
> backends, even non-free ones (the drivers themselves are LGPL).  At the
> moment we have working drivers for Eloquence/ViaVoice, Festival, and
> FreeTTS, and I believe drivers exist for DECtalk and serial DEC synths
> as well, though they aren't in CVS yet.
>
> Given the current lack of free ASR engines we'd probably be developing a
> driver for Sphinx when we get to the voice-input side of things.
>
> Anyhow I am pretty sure you can get a festival stack that's OK
> license-wise, you just end up excluding certain bits (like mbrola
> support and some voices).  FreeTTS is definitely free, and can use Flite
> voices, but as you say voices for Flite and FreeTTS are pretty scarce at
> the moment.  There's interest (and progress, I think) in better voice
> creation tools for FreeTTS (and by extension, Flite); you might check
> out the FreeTTS pages on SourceForge:
> http://freetts.sourceforge.net/docs/index.php
>
> Whatever you do, I hope we can coordinate efforts so that we
> interoperate/share effort wherever our projects naturally overlap.
>
> best regards,
>
> Bill
>
> > On Wednesday 05 February 2003 13:00, Bill Haneman wrote:
> > > Hi David, etc:
> > >
> > > If you folks are at all looking at telephony-related text-to-speech and
> > > speech-recognition issues in relation to your project, I urge you to
> > > contact the "gnome-speech" team.  gnome-speech is mostly discussed on
> > > gnome-accessibility-list gnome org, but I believe there is a
> > > gnome-speech-specific mailing list as well.
> > >
> > > gnome-speech at the moment is focussed on speech output, but future
> > > plans call for voice-activation/speech recognition APIs as well.
> > > Gnome-speech defines a common API for speech service delivery
> > > regardless of the back-end speech-synthesis or voice-recognition
> > > engines, and will handle markup as well (though it currently doesn't
> > > implement markup support).
> > >
> > > regards
> > >
> > > Bill Haneman
> > > GNOME Accessibility Project
> > >
> > > On Tue, 2003-02-04 at 21:02, David Sugar wrote:
> > > > This is very different from that.  GnomeMeeting is a "softphone
> > > > > client", if you will, where you use a desktop pc as an IP telephone. 
> > > > > TAPS is interested in supporting the 99% of existing office environments
> > > > > where a telephone is still a separate physical instrument on a typical
> > > > > office worker's desk and phone service is typically conducted through
> > > > > a (formerly proprietary) office pbx which can now be displaced by free
> > > > > phone systems such as GNU Bayonne.  This is not to say that TAPS has
> > > > no role in IP telephony, for many of the same integration issues also
> > > > can exist with stand-alone IP phones and softswitches that are kept
> > > > > in isolation from other system services and where the same need to
> > > > > see and control incoming calls or dial contact phone numbers
> > > > > automatically still exists.
> > > >
> > > > On Tuesday 04 February 2003 15:25, Damien Sandras wrote:
> > > > > What would be the main difference between that system, and a system
> > > > > using the existing GnomeMeeting and phones managed by GnomeMeeting
> > > > > itself thanks to additional hardware? GnomeMeeting is already able
> > > > > to interact with PC-To-Phone gateways and to interact with normal
> > > > > phone devices.
> > > > >
> > > > > Is your goal very different?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > > On Tue, 04/02/2003 at 21:08, David Sugar wrote:
> > > > > > Have you ever wondered why you have phone numbers collected in
> > > > > > your Evolution address book, and yet must manually re-enter
> > > > > > numbers on your office telephone when you wish to dial someone? 
> > > > > > Would you like to be able to receive pop-up notification on who's
> > > > > > calling you with the ability to route incoming calls you do not
> > > > > > wish to answer under mouse control? To mark specific calls as
> > > > > > billable clients as they occur to track their time for automated
> > > > > > invoicing?
> > > > > >
> > > > > > We hope to be able to finally address these and other issues for
> > > > > > > both GNOME desktop users and business application developers in a
> > > > > > new free software package being proposed to be known as GNOME
> > > > > > TAPS.  This being a project proposal, I am establishing a project
> > > > > > outline and a set of goals. With that, this announcement
> > > > > > constitutes a call for help and request for input from other
> > > > > > GNOME developers in establishing a functional implementation.
> > > > > >
> > > > > > GNOME TAPS will offer a C callable library to easily integrate
> > > > > > telephony functions into existing GNOME applications.  It will
> > > > > > use a common TCP based backend protocol to communicate directly
> > > > > > with office telephone equipment and services, and ideally for use
> > > > > > with free software based telephone systems such as GNU Bayonne. 
> > > > > > It will include an applet to pop-up and support handling of
> > > > > > incoming calls. It will include a gnome control-center plugin for
> > > > > > setting of telephony options.
> > > > > >
> > > > > > The first goal will be to support and demonstrate a C callable
> > > > > > library and implement a backend service on GNU Bayonne to support
> > > > > > the functionality of dialing numbers that may be clicked on or
> > > > > > otherwise delivered under application control.  This would be
> > > > > > followed by development of a GNOME desktop applet to respond to
> > > > > > incoming calls, and finally work on a complete control center
> > > > > > plugin to configure and manage GNOME TAPS.
> > > > > >
> > > > > > Initial discussion and design planning for GNOME TAPS will be
> > > > > > carried out on a new mailing list, bayonne-desktop gnu org   This
> > > > > > list is open to the public and may be subscribed by sending email
> > > > > > to bayonne-desktop-request gnu org   Comments may also be sent to
> > > > > > me directly, to sugar gnu org 
> > > > > >
> > > > > > _______________________________________________
> > > > > > gnome-announce-list mailing list
> > > > > > gnome-announce-list gnome org
> > > > > > http://mail.gnome.org/mailman/listinfo/gnome-announce-list
> > > >
> > > > _______________________________________________
> > > > gnome-hackers mailing list
> > > > gnome-hackers gnome org
> > > > http://mail.gnome.org/mailman/listinfo/gnome-hackers



