Re: [orca-list] SSML (Re: more than one blank space are not recognized in thunderbird)



Hello,
A lot of good points there. I think I need to explain a bit more about some of what I meant.

Should handling UTF-8, specific characters, etc. for the synths be Orca's responsibility? I said no. I know there is very little we can do with certain synths (e.g. ViaVoice), but my thinking was that when the synth itself cannot be made to work properly, the workarounds belong in gnome-speech or speech-dispatcher. As I understand it, gnome-speech is going to be dropped in GNOME 3.0 and speech-dispatcher will become the preferred choice. So maybe this sort of fix (e.g. getting ViaVoice to speak things like no-break spaces) should go in the speech-dispatcher ViaVoice driver, as that is the part which deals specifically with ViaVoice, and the problem may not exist, or may show itself differently, for other synths.

Regarding SSML, I am not sure whether it handles the punctuation-level type of thing. I was thinking that SSML could be used to insert pauses where needed (I think one case was mentioned earlier in this thread, something to do with mnemonics). SSML also lets you specify how something should be pronounced: e.g. the W3C page I gave a link to has an example of using the <sub> element to expand "W3C" for speech while keeping the original text, should you also want a text version.
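To make the two SSML ideas above concrete, here is a minimal Python sketch that builds a small SSML fragment with a <break> (a pause) and a <sub> (a spoken alias). The function name build_ssml, the sample sentence, and the 200ms pause are my own illustrative assumptions, not anything Orca or speech-dispatcher actually sends, and the fragment omits the namespace and xml:lang attributes a fully conforming document would carry.

```python
import xml.etree.ElementTree as ET


def build_ssml():
    """Build an illustrative SSML fragment (hypothetical content)."""
    speak = ET.Element("speak", {"version": "1.0"})
    speak.text = "Press Alt"

    # <break> asks the synth for a pause, e.g. before a mnemonic letter.
    pause = ET.SubElement(speak, "break", {"time": "200ms"})
    pause.tail = "F to open the File menu. Published by the "

    # <sub> keeps the written text but supplies what should be spoken.
    sub = ET.SubElement(speak, "sub", {"alias": "World Wide Web Consortium"})
    sub.text = "W3C"
    sub.tail = "."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml())
```

Whether a given synth honours either element is a separate question; this only shows what the markup would look like.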

I don't know how many synths support SSML (eSpeak does, some of the ViaVoice synths do), but where SSML isn't supported, it's the job of speech-dispatcher to handle this (that's my view of what speech-dispatcher should do, if it doesn't already).
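As a rough illustration of what "handling it" for a synth without SSML support could mean, the sketch below flattens SSML to plain text: <sub> is replaced by its alias, <break> becomes a space, and all other markup is dropped. This is only a guess at one possible fallback, written in Python for brevity; ssml_to_plain_text is a hypothetical helper, not speech-dispatcher code, and it assumes un-namespaced element names.

```python
import xml.etree.ElementTree as ET


def _collect(elem, parts):
    """Append the speakable text of elem (and its children) to parts."""
    if elem.tag == "sub" and elem.get("alias"):
        # Speak the alias instead of the written text.
        parts.append(elem.get("alias"))
        return
    if elem.tag == "break":
        # No way to express a timed pause in plain text; use a space.
        parts.append(" ")
    if elem.text:
        parts.append(elem.text)
    for child in elem:
        _collect(child, parts)
        if child.tail:
            parts.append(child.tail)


def ssml_to_plain_text(ssml):
    """Flatten an SSML string to plain text (assumes no XML namespace)."""
    parts = []
    _collect(ET.fromstring(ssml), parts)
    # Collapse runs of whitespace left behind by removed markup.
    return " ".join("".join(parts).split())
```

The obvious cost of such a fallback is that pauses and prosody hints are lost, which is part of why the behaviour would still differ between synths.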

So I wasn't trying to say SSML is the answer to all the problems, only that it may help with some of the issues. I also wouldn't expect this to happen soon. I know there's plenty going on (e.g. 2.28 is coming up, so fixes should be made ready for that release, and at-spi on D-Bus should be happening some time soon, as it needs to be ready for GNOME 3.0, etc.). It's something to consider for the longer term.

Michael Whapples
On 14/08/09 00:37, Willie Walker wrote:
Hi Michael:

> * When reading by characters, I certainly would prefer no breaking space.
> * When Orca is reading it as part of a flow of text (e.g. reading the whole message), I would prefer Orca to treat it as a space, to give a better sense of the sentence.

I think we can do this and it is what I'd lean towards as well. My hands are absolutely full and overflowing with stuff at the moment, but I'll try to get a patch in place.
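The behaviour being agreed on here could look roughly like the following Python sketch: name the no-break space when reviewing character by character, and treat it as an ordinary space when speaking running text. The function text_for_speech and its by_character flag are hypothetical names for illustration only; this is not the actual Orca patch.

```python
NO_BREAK_SPACE = "\u00a0"


def text_for_speech(text, by_character):
    """Return what should be handed to the synth for the given text.

    by_character=True  -> text is a single character under review
    by_character=False -> text is part of a flow (line, sentence, message)
    """
    if by_character:
        if text == NO_BREAK_SPACE:
            return "no break space"
        if text == " ":
            return "space"
        return text
    # In running text, a no-break space should sound like any other space.
    return text.replace(NO_BREAK_SPACE, " ")
```

For example, text_for_speech("a\u00a0b", by_character=False) returns "a b", so the synth never sees the raw no-break space.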

> Although I have said the above, I am very much of the opinion that this problem lies with the synths and/or the layers giving access to the synths (e.g. gnome-speech and speech-dispatcher). As shown by the variety of different reports on output, I feel it's too specific for something like Orca to be attempting to resolve.

I agree with this for the most part. However, we're in a situation where fixing synths, with the exception of perhaps eSpeak and festival, can be like asking the government to patch the pothole in the road in front of your house.

Indeed, it probably can be done and it probably should be done, but it can also be a lot easier to grab the handlebars on your bike and go around the hole until you get "Proposition Pothole" on the ballot, get it to pass in the next town vote, get the aldermen to meet with the budget committee, get the budget committee to hire a design firm, and then wait for the union shop supervisor, the 3 sub-supervisors, 15 guys in orange vests, and 2 traffic cops to show up with shovels, barriers, and dump trucks full of way too much blacktop to tear up the entire road, and then come back a few months later only to say they've overspent their budget, can't finish the work, and the road is now in worse shape than it was before you tried to do the right thing.

> Also, would Speech Synthesis Markup Language (SSML) http://www.w3.org/TR/speech-synthesis/ help here?

Markup is just that: markup. SSML provides a convenient way to structure the output and associate attributes (pitch, rate, etc.) for a portion of text. Other than that, there's no magic that it provides to turn a goose into a swan.

SSML may make it easier to specify when verbalized punctuation should be used. I haven't looked at the spec in a long time, so I don't even remember whether it supports levels of verbalized punctuation. For some reason, I recall that portion of the spec being somewhat ambiguous, so I suspect it's not the silver bullet for verbalization. However, markup is not a cure for issues in the synthesizer. SSML:

* won't solve the problem of the synth barfing on no-break space characters - if you tell the synth to speak something, it should speak it.

* won't solve the problem of the synth behaving differently according to whether utterances are sent to the engine as one complete string or separate strings - one synth may do prosody at the paragraph level, for example, whereas another may do prosody at the sentence or phrase level.

* won't solve the problem of the synth behaving differently if the string ends in a '.' or not - it's really up to the synth designer to decide whether they want to treat an utterance that is not terminated with a punctuation mark as a complete sentence or not.

* etc.

So, we still need to grab the handlebars and steer around the hole. :-(

However, I do agree that we should try to push as much stuff as we can as far down in the stack as we can and this is where we need to work with Luke Yelavich to try to reach a practical middle ground between fixing it in Orca, fixing it in speech dispatcher, or fixing it in the synth. There are many things to consider -- verbalized punctuation, verbalized caps, ACSS styles for caps, pronunciation, ACSS styles in general, etc.

Will

Michael Whapples
On 13/08/09 20:38, Willie Walker wrote:
I opened http://bugzilla.gnome.org/show_bug.cgi?id=591734 and http://bugzilla.gnome.org/show_bug.cgi?id=591724 to track the two main problems associated with this.

http://bugzilla.gnome.org/show_bug.cgi?id=591724 tracks the punctuation being spoken twice

http://bugzilla.gnome.org/show_bug.cgi?id=591724 tracks the no break space problem

They might be related in that a raw no-break space UTF-8 character is being sent to the synthesis engine, which might be interpreting it and speaking it as something else (e.g., a punctuation character). For now, I'm treating them as separate problems.
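For reference, this short Python sketch shows what a "raw no-break space UTF-8 character" looks like on the wire: U+00A0 encodes to the two bytes 0xC2 0xA0, whereas an ordinary space is the single byte 0x20. An engine that does not decode UTF-8 correctly would see two unexpected bytes rather than whitespace. The helper name describe is just for illustration.

```python
import unicodedata

NO_BREAK_SPACE = "\u00a0"


def describe(ch):
    """Return the Unicode name and UTF-8 encoding of a single character."""
    return unicodedata.name(ch), ch.encode("utf-8")


if __name__ == "__main__":
    print(describe(NO_BREAK_SPACE))  # ('NO-BREAK SPACE', b'\xc2\xa0')
    print(describe(" "))             # ('SPACE', b' ')
```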

In any case, I can reproduce the no-break space problem and can resolve it. I cannot reproduce the dual punctuation problem.

My question for the list is whether Orca should speak "no break space" or "space" when it encounters a no-break space? In many applications, they appear identically on the screen. I believe OOo, however, shows them as gray boxes (though I also believe it exposes them via the AT-SPI as plain old space characters).

So - what do you want?  "no break space", "space", something else?

Will

jose vilmar estacio de souza wrote:
Running with eSpeak I got:

1. The phrase is read correctly when reading the whole document or reviewing by line.

2. If a word ends with a "." followed by two blanks, the "." is read together with the word.

3. If I remove a blank after the "." and read the line again, the "." is not read.

4. Reviewing char by char, the first blank is not announced.

[]S José Vilmar Estácio de Souza
http://www.informal.com.br
Msn:vilmar informal com br Skype:jvilmar
Twitter: http://www.twitter.com/jvesouza
Phone: +55 21-2555-2650 Cel: +55 21-8868-0859


On 08/13/2009 03:22 PM, Willie Walker wrote:
> I am running Ibm via voice as my primary synthesizer.

IBM ViaVoice has been somewhat problematic. Can you try eSpeak and see if you experience the same problem?

Will

jose vilmar estacio de souza wrote:
Hi Will,
It doesn't matter whether I am composing messages in plain text format or HTML format.

Also the same problem happens if I am reading a message.

Another observation is that if I am reading the entire message or reviewing by line, a word with more than 1 blank after it is spelled instead of read.
The following text:
" test test "
is read as:
" t e s t t e s t "

I am running Ibm via voice as my primary synthesizer.
Thanks.

[]S José Vilmar Estácio de Souza
http://www.informal.com.br
Msn:vilmar informal com br Skype:jvilmar
Twitter: http://www.twitter.com/jvesouza
Phone: +55 21-2555-2650 Cel: +55 21-8868-0859


On 08/13/2009 02:44 PM, Willie Walker wrote:
Hi All:

When reproducing this problem, does it matter whether you are composing messages in plain text format or HTML format?

Will

Michael Whapples wrote:
Hello,
Yes, I find that too. However, I didn't think it was a bug; I thought it was related to the characters Thunderbird uses when there are multiple spaces. In fact, I have just confirmed what Thunderbird uses by looking at the source of your message, and I find the first two spaces in your example are inserted as non-breaking spaces (HTML entity &nbsp;). Now, should Orca announce those? If so, file a bug/request for it.
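A quick way to check this for any piece of message source is sketched below in Python: decode the HTML entities and list where the no-break spaces end up. The sample string is my assumption of what the source for "This" followed by three blanks might look like, based on the description above (two non-breaking spaces, then a normal one); nbsp_positions is a hypothetical helper.

```python
import html

NO_BREAK_SPACE = "\u00a0"


def nbsp_positions(markup_text):
    """Decode HTML entities and return the indexes of no-break spaces."""
    decoded = html.unescape(markup_text)
    return [i for i, ch in enumerate(decoded) if ch == NO_BREAK_SPACE]


if __name__ == "__main__":
    # Assumed source: two &nbsp; entities followed by one plain space.
    print(nbsp_positions("This&nbsp;&nbsp; is"))  # [4, 5]
```

That pattern would be consistent with the first two "spaces" being silent and only the third being announced.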

Michael Whapples
Paul Hunt wrote:
Confirmed.

In fact, I didn't even need to create a new message. I got that problem when reviewing your message below. The first two spaces in between the words are not spoken (nothing is); only the third space is spoken.

I have confirmed that this problem occurs when composing too.

Paul



On 13/08/09 11:15, jose vilmar estacio de souza wrote:
Hi all,
Probably this is related to thunderbird but I decided to report so that someone can confirm.

To reproduce try the following:

1. Create a new message using thunderbird.

2. In the body of the message type the following text:
"This is a test"
Without the quotations.
You must type 3 (three) blanks between each word.

3. Try to review the text using the left and right keys.

Note that some spaces are not announced.
Thanks.





_______________________________________________
Orca-list mailing list
Orca-list gnome org
http://mail.gnome.org/mailman/listinfo/orca-list
Visit http://live.gnome.org/Orca for more information on Orca.
The manual is at http://library.gnome.org/users/gnome-access-guide/nightly/ats-2.html
The FAQ is at http://live.gnome.org/Orca/FrequentlyAskedQuestions
Netiquette Guidelines are at http://live.gnome.org/Orca/FrequentlyAskedQuestions/NetiquetteGuidelines




