Orca Introduction, comments and a question or two

Hi all,

I've just started playing with Orca on a shiny new Edgy install and I must say that speech engine aside (more below) it's been a surprisingly pleasant experience. Given the relatively young status of Orca I think it's in great shape and shows tremendous potential. Hats off to all involved in the effort.

My road to getting Orca was a slightly convoluted one, though much of that can be categorised as self-inflicted wounds. After years of relying on console access to Linux, via either a DOS terminal session or Speakup directly, I've wanted to experiment with the Gnome tools for some time. Due to other activities and the general home IT set-up, my strong preference was to do so under VMware Workstation with my main Windows box as the host.

When I first tried this several months ago I discovered that most software speech engines turn into random noise engines under VMware. I strongly suspect this is due to some channel timing issues but can't prove that. I never got a software speech engine working well enough under VMware to be usable, so I gave up and only returned to the topic this weekend. I know that some other people were hitting their heads against the same wall, so I can report that the Swift voices from Cepstral now work 'out of the box' under VMware; previously they had to be set to mono mode, which couldn't easily be done from a higher-level speech layer.

So my set-up now is using a Cepstral Swift voice via Festival, using the "festivalify-cepstral-voice" Perl script provided by Cepstral. Hack Festival and you can make the Swift voice the default and this provides a working TTS engine under VMware. Hopefully that info will be of use to someone else.

That leads nicely into the first of my three questions: I find the performance of Orca under this set-up to be pretty sluggish. I suspect the problem isn't really Orca itself, more a combination of a beta OS, the use of Festival as an intermediary layer and the VMware overhead. Having played with some C test apps on the VM, I suspect the Festival machinery, as the native Swift code seems more responsive when called directly than when driven via Festival. Has anyone got experience with using Festival in this way, or would anyone care to point the finger elsewhere?
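For anyone wanting to put rough numbers on that comparison, here is a minimal sketch of a timing harness. It is not what I actually ran (my tests were small C apps); the helper name time_command is made up for illustration, and the festival and swift command lines in the comments are assumptions about how the two engines would be invoked on your system, so adjust them to suit.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Return the median wall-clock seconds taken to run cmd to completion.

    cmd is an argument list as accepted by subprocess.run. The command's
    output is discarded; a non-zero exit status raises CalledProcessError.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # Harmless stand-in so the script runs anywhere. Replace it with the
    # two commands to compare. Assumed invocations, check your install:
    #   ["festival", "--tts", "sample.txt"]   speech routed via Festival
    #   ["swift", "-f", "sample.txt"]         Cepstral's engine directly
    baseline = [sys.executable, "-c", "pass"]
    print("median seconds: %.3f" % time_command(baseline, runs=3))
```

This measures the whole run, including process start-up and the time taken to speak the text, so it only gives a coarse picture. It says nothing about the latency Orca sees when talking to an engine that is already running.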

My next point may have me stirring up a hornet's nest but I'll take the risk. There seems to be an embarrassment of riches when it comes to intermediary TTS layers. There's Gnome-speech, there's Speech Dispatcher and, as I've found, even good old Festival can act in this role. I know that all these tools evolved from different backgrounds and had distinct initial needs they were trying to meet, but the situation now seems somewhat duplicative to me. I may be betraying my background in distributed systems, where abstraction layers get slapped atop anything that moves, but is this situation as confused as I find it? If so, is there any sort of convergence planned? Alternatively, am I missing the point somewhere?

My final question is less controversial. I find that setting the voice rate in the Orca control panel has no effect on the actual spoken speed. Insert and left/right arrows tell me the rate is being increased or decreased, but I find the change either has no effect or actually does the reverse. I suspect this is a consequence of my funky Festival/Swift set-up, as I don't see any such bugs on Bugzilla and I guess this would be an obvious one if it were universal.

Many thanks for any input and I look forward to exploring Orca more over the next few days.


Garry Turkington
garry turkington gmail com
