Re: VEDICS Speech Assistant
- From: Nischal Rao <rao nischal gmail com>
- To: bharat joshi <bharatjoshi1 gmail com>
- Cc: gnome-accessibility-list gnome org
- Subject: Re: VEDICS Speech Assistant
- Date: Fri, 21 May 2010 17:43:18 +0530
Makedict has a set of rules that determine what the pronunciation should be. If it can successfully make a pronunciation out of a given word, you can pronounce the word directly; otherwise it generates the pronunciation of the word's spelling.
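The fallback described above can be sketched roughly as follows. This is only an illustration, not Makedict's actual code: `pronounce`, `toy_rules` and the one-entry lookup table are hypothetical stand-ins for Makedict's real pronunciation rules, and the two example words are the ones mentioned in this thread.

```python
from typing import Callable, Optional


def toy_rules(word: str) -> Optional[str]:
    """Hypothetical stand-in for Makedict's rules: return a
    pronunciation if one can be built, otherwise None."""
    known = {"dsl": "disel"}  # example taken from this thread
    return known.get(word.lower())


def pronounce(word: str, rules: Callable[[str], Optional[str]]) -> str:
    """Use the rule-based pronunciation when the rules produce one;
    otherwise fall back to spelling the word letter by letter."""
    result = rules(word)
    if result is not None:
        return result
    # Fallback: the word is spoken as its spelling, e.g. "h t t p".
    return " ".join(word.lower())
```

With these toy rules, `pronounce("DSL", toy_rules)` gives "disel" and `pronounce("http", toy_rules)` gives "h t t p".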
We have kept the desktop interaction part separate from the speech recognition part. Communication between the two happens through sockets: the speech recognizer sends the text to the desktop part. So if you want to use any other speech recognizer, all you need to do is create a socket connection with the desktop part and send the recognized text.
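As a rough sketch of what a third-party recognizer would have to do, assuming a TCP socket: the host, the port number 5000, the newline-terminated UTF-8 framing and the function name below are all assumptions made for illustration, since the actual address and message format used by the desktop part are not stated in this mail.

```python
import socket


def send_recognized_text(text: str, host: str = "127.0.0.1", port: int = 5000) -> None:
    """Open a connection to the desktop part and send one recognized phrase.

    Host, port and the trailing newline are assumptions for this sketch,
    not the documented VEDICS protocol.
    """
    with socket.create_connection((host, port)) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
```

A recognizer would call something like `send_recognized_text("run text editor")` each time it produces a result.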
The desktop part writes the set of words currently visible on the screen to a file. That file can be used by any speech recognizer to create its grammar and dictionary files.
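For example, a recognizer that accepts JSGF grammars could turn that word file into a grammar along these lines. This is a sketch under assumptions: the file is taken to hold one word or phrase per line, and the function name, grammar name and rule name are made up for illustration.

```python
from typing import Iterable


def build_jsgf(phrases: Iterable[str], name: str = "screen") -> str:
    """Build a minimal JSGF grammar whose single public rule accepts
    any one of the given phrases. Blank lines and duplicates are dropped."""
    cleaned = [p.strip() for p in phrases]
    unique = list(dict.fromkeys(p for p in cleaned if p))
    if not unique:
        raise ValueError("no words to build a grammar from")
    alternatives = " | ".join(unique)
    return (
        "#JSGF V1.0;\n"
        f"grammar {name};\n"
        f"public <command> = {alternatives};\n"
    )
```

The lines of the word file can be passed in directly, for instance `build_jsgf(open(path, encoding="utf-8"))`, where `path` is wherever the desktop part writes its file.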
2010/5/21 bharat joshi <bharatjoshi1 gmail com>
To answer your first question: Vedics is context-based. It generates the word set based on what is accessible on the front end.
To answer your second question: Vedics uses MAKEDICT (https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/logios/), which generates pronunciations for the words given to it. These words are passed to MAKEDICT by the C code of VEDICS, which uses AT-SPI to find the accessible elements. It recognizes English words perfectly, and it also recognizes words of other languages. We have tried it with words from Hindi and Kannada (Indian languages), written in English letters. Check out the video "Vedics screencast, Termination commands". In that video we create files with odd names like "q6pw" and "pcw", which cannot be part of any language, yet Vedics generated pronunciations for them and recognized them.
There are some cases to be considered here.
"http" is pronounced as "h t t p" itself, whereas
"DSL" is pronounced as "disel". This is because Makedict was able to make a proper pronunciation for it.
"cdrom" would be "c drom".
"Nesanga Nenena" is from the Telugu language, but Vedics recognized it perfectly.
Users will have to do some trial and error with words from other languages.
Currently we are still in the development phase.
We still have some features to add to Vedics, like spelling out words.
We have not yet written the installation program, which is why you could not find one in the repo. It will be a week or two before we finish our exams and write the installation program.
And one more important feature to be added to the list is
8. No training required. We have tested Vedics with different users. The 8 videos were made by 3 different people. A person can install it, get familiar with the commands and start using it directly, without any voice training. Vedics recognizes any voice.
2010/5/21 José Félix Ontañón <felixonta gmail com>
On 21 May 2010 at 11:01, bharat joshi <bharatjoshi1 gmail com>
Yes, we know about GVC, and we have tried it as well.
You could say VEDICS is a superset of GVC.
Some of the key features of VEDICS are
- Accuracy is much better as we use SPHINX-4.
- File system navigation -> Navigating files and folders is very easy.
- Recognizes anything -> Vedics is dynamic, in the sense that it generates words and their pronunciations dynamically. For example, if we take a simple command like "run text editor", the front end changes as the editor opens. Vedics generates a new list of words from the front end and produces its pronunciation and grammar files. This lets Vedics recognize any word. It can even recognize junk words like "hsjft".
- We can pause and resume VEDICS by voice using the "stop listening" and "start listening" commands. In GVC, people had to use the mouse to do it. You can also quit Vedics by voice.
- Works perfectly on Ubuntu 9.10 and 10.04.
- Can access any element, including checkboxes, radio buttons, links, lists, etc.
- Popup menus like the one that opens on right click are also accessible.
I'm fascinated by the power of feature 3: it can generate pronunciations and grammar on the fly, based on context?
I suppose you mean that Vedics can recognize any word, but only in English, right? What about other languages?
Do we need a text/voice corpus to feed and train it for other languages?
And about feature 5, do you have any precompiled binaries or even Debian packages for testing? I can't find anything on SourceForge other than the svn repo.
Nischal E Rao
blogs.sun.com/nischal
Join RVCE OSUM at http://osum.sun.com/group/rvceosum