Re: Festival vs. F-lite disk space requirements

From: Willie Walker <William Walker Sun COM>
To: Gnome Accessibility <gnome-accessibility-list gnome org>
Cc: Bill Haneman <Bill Haneman Sun COM>
Subject: Re: Festival vs. F-lite disk space requirements
Date: Fri, 24 Feb 2006 09:24:08 -0500

Hi:

If one looks at the Venn diagram of Festival/Flite/FreeTTS, the common logic is quite similar, but Festival's circle is much, much larger than the other two. The common logic from this Venn diagram basically consists of a process of preprocessing input text, selecting units to play, combining them, and then playing them. Unfortunately, the data files are typically the largest hunk of data, and each engine uses its own format (Festival == Binary/ASCII EST files, Flite == C code, FreeTTS == Binary/ASCII of its own ilk). In retrospect, if we had understood the EST file format and Festival signal processing code better, FreeTTS probably could have used Festival data directly instead of duplicating the RELP-based approach in Flite.
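
To make the shared pipeline concrete, here is a toy sketch of the preprocess / select units / combine steps described above. It is only an illustration: the unit table, the letter-pair "units", and all function names are made up for this example and do not correspond to the actual code or data formats of Festival, Flite, or FreeTTS. The final "play" step is left out, since it would just hand the combined samples to an audio device.

```python
import re

# Hypothetical unit inventory: unit name -> placeholder "samples".
# A real engine would hold processed voice recordings here.
UNIT_DB = {
    "h-e": [1, 2],
    "e-l": [3],
    "l-o": [4, 5],
}

def preprocess(text):
    """Normalize input text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def select_units(word):
    """Pair adjacent letters (a crude stand-in for phonemes) and keep
    only the pairs that exist in the unit inventory."""
    names = [f"{a}-{b}" for a, b in zip(word, word[1:])]
    return [name for name in names if name in UNIT_DB]

def combine(unit_names):
    """Concatenate the samples of the selected units."""
    samples = []
    for name in unit_names:
        samples.extend(UNIT_DB[name])
    return samples

def synthesize(text):
    """Run the preprocess -> select -> combine steps for a whole string."""
    samples = []
    for word in preprocess(text):
        samples.extend(combine(select_units(word)))
    return samples
```

The point of the sketch is that the logic is small, while the table standing in for `UNIT_DB` is where the disk space goes in a real engine.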

Festival has a core set of logic (Scheme interpreter, Scheme files, signal processing code, etc.) to deal with voice data, and one can download voice data separately for it. I think the bulk of the data consists mostly of pronunciation lexicons and usually some processed form of the actual voice recordings (i.e., the "group" file). To get a small set for perhaps en_US only, one could take a look at using the kal_diphone voice data and attempt to discover only the bare stuff needed to make it work. You'd probably also want to keep the MBROLA support in there as it tends to be small and can interface with MBROLA (a separate download) to get you better sounding voices. In any case, I *think* you're looking at about 22Meg total for a minimal kal_diphone-based en_US festival install, though I think you might be able to get 4Meg or so smaller if it's possible to prune some what-I-think-might-be-redundant lexicon data. These numbers are based upon my install of festival on my FC4 machine, and may actually be extra large because I've been goofing with the ARCTIC and HTS voice support. As an aside, there's some odd interaction between the festival server and gnome-speech on Ubuntu, which causes the CPU to throttle to 100%. I took a quick poke at this at one time, and it looks like there's something bogus happening on the socket/pipe communication between the two. My current commitments and pressures haven't afforded me the time to really dig into and solve the problem, though. :-(
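
For anyone who wants to check figures like these against their own install, a minimal sketch of a per-directory size tally follows. The function names are made up for this example, and the path in the usage note is an assumption about where a distribution puts Festival's data; adjust it to whatever your package actually installs.

```python
import os

def dir_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total

def size_report(paths):
    """Return {path: size in bytes} for each path that is a directory."""
    return {p: dir_size_bytes(p) for p in paths if os.path.isdir(p)}
```

Usage would be something like `size_report(["/usr/share/festival"])` (an assumed location), dividing the results by 1024 * 1024 to compare against the estimates above. Running it on individual subdirectories, such as the lexicon and voice directories, shows where pruning would pay off.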

Flite (a C-based engine) is based on data imported from Festival. Last I knew, its voice data files are compiled in as source code and you get what you get. I'm not sure there is opportunity for pruning, though you could get rid of the unit selection voice that's good for only speaking the time. But...it still tends to be what it is: a small, fast, runtime engine. :-) It's been a long time since I looked at the code, so I don't remember sizing information, but I think it is the smallest of the bunch. There's also no direct gnome-speech support for it, other than indirectly through the recently added speech-dispatcher driver for gnome-speech (thanks Hynek!). Given resources, one probably could write a gnome-speech driver for flite and bypass this indirection.

FreeTTS (a Java-based engine) is based on logic from Festival and Flite, though it really is mostly a Flite clone in Java. Like Festival, it consists of core logic that can operate on voice data. To get a small set, you could ship only what's needed for the kevin voices, but you'd probably also want to keep the MBROLA support because it has similar benefits as what you get with Festival. I think the total would be about 6.5Meg or so, but then you will also need the Java virtual machine.


Hope this helps, and please let me know if you have any more questions,

Will

I think Flite uses the same file format. (Will, please correct me if
I'm wrong.) It also requires a Java JRE; are you planning to include
Java in the live CD?


BTW, in the past, Java was required for OpenOffice.org accessibility,
but that's not true of the latest version.

Bill

On Fri, 2006-02-24 at 12:25, Henrik Nilsen Omma wrote:

Hello,

We are working on packaging screen reader support for the Ubuntu Live
CD, but have gotten ourselves a little confused regarding file sizes ...


Being a Live CD we are quite limited on disk space. We were thinking
that we should use the smaller F-lite, rather than the full Festival,
assuming it had smaller speech files. However, because gnome-speech
doesn't have direct support for F-lite we also needed to include speech
dispatcher (and gnome-speech from CVS), so it begins to grow.

Can anyone shed some light on the relative space requirements of
Festival vs. F-lite? Does Festival include all its supported languages
by default or are they packaged separately (as packaged in Debian)? We
would be happy to settle for English-only support this time around.

Thank you. Any advice will be greatly appreciated.

- Henrik


_______________________________________________
gnome-accessibility-list mailing list
gnome-accessibility-list gnome org
http://mail.gnome.org/mailman/listinfo/gnome-accessibility-list


