[orca-list] eSpeak Voice Editing: Tips and Qs

From: Veli-Pekka Tätilä <vtatila gmail com>
To: orca-list gnome org
Subject: [orca-list] eSpeak Voice Editing: Tips and Qs
Date: Tue, 22 Jul 2008 08:01:51 +0300

Hi list,
I wonder if eSpeak stuff is on topic; I hope so. Yesterday I discovered
quite an easy method of editing eSpeak voices, almost as instant as if I
had a GUI with an audition command. Launch gedit after

sudo su

in the terminal, go to

/usr/share/espeak-data/voices/en/

or whatever dir your eSpeak voices live in. Next, select a voice you rarely
use with Orca, and edit that voice with the help of the eSpeak voice page:

http://espeak.sourceforge.net/voices.html

When you want to test it do

alt+f, s, enter, orca+q, alt+f2, orca, enter. This would be quicker if
Orca had a hotkey for relaunching, but alas I haven't seen a graphical
hotkey binder applet for Gnome. The idea is that you merely save your
changes, quit and relaunch Orca, and then get an instant preview of
the saved voice.
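
A saved voice could probably also be auditioned straight from the
command line, without restarting Orca. Below is a minimal Python sketch
of that idea. It assumes the espeak binary is on the PATH and that the
edited voice can be selected by name with -v; the voice name you pass in
and the helper names build_audition_command and audition are only
illustrative. This bypasses Orca, so it previews the voice itself and
not Orca's own rate or pitch settings.

```python
import shutil
import subprocess


def build_audition_command(voice, text, speed=None):
    """Return the argv list that speaks `text` with the given eSpeak voice."""
    cmd = ["espeak", "-v", voice]
    if speed is not None:
        # -s sets the speaking rate in words per minute.
        cmd += ["-s", str(speed)]
    cmd.append(text)
    return cmd


def audition(voice, text="Do I sound more mellow?", speed=None):
    """Speak a test phrase; return False if espeak is not installed."""
    if shutil.which("espeak") is None:
        return False
    subprocess.run(build_audition_command(voice, text, speed), check=True)
    return True
```

For example, audition("en-us", speed=250) would speak the test phrase
quickly with whatever is currently saved under that voice name.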

And now to my question:
I like US English and breathy formant voices as far as speech synthesis
goes. The sort of voice you might get

in Orpheus 1.x with voicing 50, pitch 75 and speaker table 3,
in ViaVoice with breathiness 50 and a small headsize,
or in DECtalk with a low-pitched Frank voice or a tweak to Paul like
this:
 [:np][:dv ri 0 sm 70] Do I sound more mellow?

You can hear voice samples of all of these on my speech synths review
page at:

http://vtatila.kapsi.fi/reviews_of_speech_synths.html

Is eSpeak up to the challenge and if so could anyone share their tweaked
voices? One would think eSpeak could do this using additive synthesis
out of sines, so just smartly cut the highs and leave the fundamental as
is. I normally use the US English voice, as I find it much easier to
understand at fast rates than UK English in most synths, as a non-native
speaker.
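
To make the "cut the highs and leave the fundamental" idea concrete,
here is a small Python sketch of additive synthesis with a high-frequency
roll-off. It is not eSpeak's actual code, only an illustration: the
function names, the 12 dB per octave slope and the cutoff are all
assumptions picked for the example.

```python
import math


def harmonic_gains(f0, n_harmonics, cutoff_hz, slope_db_per_octave=12.0):
    """Linear gain per harmonic: unity up to the cutoff, rolled off above it."""
    gains = []
    for k in range(1, n_harmonics + 1):
        freq = k * f0
        if freq <= cutoff_hz:
            gain = 1.0  # fundamental and low harmonics are left as they are
        else:
            octaves_above = math.log2(freq / cutoff_hz)
            gain = 10 ** (-slope_db_per_octave * octaves_above / 20.0)
        gains.append(gain)
    return gains


def synthesize(f0, gains, sample_rate=22050, duration=0.01):
    """Sum sine harmonics of f0, each scaled by its gain; returns float samples."""
    n_samples = int(sample_rate * duration)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        value = sum(
            gain * math.sin(2 * math.pi * (k + 1) * f0 * t)
            for k, gain in enumerate(gains)
        )
        samples.append(value / len(gains))  # crude normalisation
    return samples
```

With f0 = 100 Hz and a 1000 Hz cutoff, the first ten harmonics keep
their full level and the harmonic at 2000 Hz ends up 12 dB down, which
is the kind of darker, more mellow spectrum described above.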

Yet most of my eSpeak tweaks didn't sound too good. Playing with voicing
sounds like it adds lowpass filtered noise. The tone parameter is like a
4-band semi parametric EQ, which isn't what I'm after, either. The
formants seem powerful, sounding a bit like boosting or cutting the
amplitudes of a band pass filter of a vocoder, say in a virtual analog
synth, but I have no idea how to make good use of them. Most of them
make narrow, unpleasant-sounding changes. Maybe cutting stuff around 300
Hz to remove muddy frequencies, and boosting highs around 1 or 2 kHz
slightly, would give a more EQ'd radio voice, especially if you played
with the delay effect, echo, to simulate a reverb with a short decay. As
you might have noticed, I'm a lot into synths, analog and virtual analog,
as well as audio.
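
The EQ idea above could be tried out by generating a test voice file.
The Python sketch below only formats the text of such a file; it assumes
the voice file attributes name, language, pitch, tone (pairs of
frequency and amplitude) and echo (delay and amplitude) as described on
the eSpeak voice page. The helper name render_voice and all the numbers
are untested guesses, not a known-good voice.

```python
def render_voice(name, language, pitch=None, tone=(), echo=None):
    """Return the text of an eSpeak voice file with the given attributes."""
    lines = [f"name {name}", f"language {language}"]
    if pitch is not None:
        base, pitch_range = pitch
        lines.append(f"pitch {base} {pitch_range}")
    if tone:
        if len(tone) > 4:
            raise ValueError("this sketch allows at most 4 tone points")
        pairs = " ".join(f"{freq} {amp}" for freq, amp in tone)
        lines.append(f"tone {pairs}")
    if echo is not None:
        delay_ms, amplitude = echo
        lines.append(f"echo {delay_ms} {amplitude}")
    return "\n".join(lines) + "\n"


# Hypothetical starting point: dip around 300 Hz, lift around 1.5 kHz,
# plus a short, quiet echo as a stand-in for a small reverb.
mellow = render_voice(
    "mellow-test",
    "en-us",
    pitch=(75, 110),
    tone=[(300, 120), (1500, 200)],
    echo=(40, 20),
)
```

Writing the resulting text to a file in the voices directory, as root,
gives something to audition and adjust by ear.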

Any help appreciated.
