Re: [orca-list] speech quality

Melissa <melissa_orca mbfw net> wrote:
I was wondering if the Orca project has considered using larger
stored speech units for the speech synthesizer in order to improve
speech quality.

Some synthesizers (eSpeak being one of them) don't use recorded speech samples
at all, if I understand correctly. They compute speech waveforms from
parameters, somewhat like DECtalk and other synthesizers of the 1980s. This
method is known as formant synthesis, and that's as much as I know about it.

On the other hand, Festival, for example, combines recorded speech segments to
generate an utterance. SVOX Pico (now usable with Speech-Dispatcher and hence
with Orca) uses hidden Markov models, i.e., statistical models based
ultimately on recorded samples. Unfortunately, we don't have access to the
tools needed to generate such models for SVOX Pico. I don't know what techniques
OpenMary uses. There are voices for Festival that work with hidden Markov
models as well; these seem to be the new and evolving technology in speech
synthesis research at the moment.
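
To make the "combines recorded speech segments" idea a bit more concrete, here
is a toy sketch in Python. It is only my illustration of the general principle
and not Festival's actual code: the function name crossfade_concat, the overlap
parameter, and the tiny made-up "units" are all assumptions for the example.
Real systems pick units from a large recorded database and smooth the joins
far more carefully.

```python
def crossfade_concat(units, overlap):
    """Join waveform units (lists of float samples) into one utterance.

    At each joint, the last `overlap` samples of the output so far are
    blended with the first `overlap` samples of the next unit using a
    linear crossfade, which softens the click an abrupt join would cause.
    """
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit, rising toward 1
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


# Hypothetical stored units; a real voice would hold recorded speech here.
unit_a = [1.0, 1.0, 1.0, 1.0]
unit_b = [0.0, 0.0, 0.0, 0.0]
utterance = crossfade_concat([unit_a, unit_b], overlap=2)
```

The larger the stored units, the fewer joints there are to smooth, which is
one reason larger units can sound more natural.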

