Re: VEDICS Speech Assistant

From: Nischal Rao <rao nischal gmail com>
To: bharat joshi <bharatjoshi1 gmail com>
Cc: gnome-accessibility-list gnome org
Subject: Re: VEDICS Speech Assistant
Date: Fri, 21 May 2010 17:43:18 +0530

Makedict has a set of rules that determine what the pronunciation should be. If it can successfully make a pronunciation out of a given word, you can pronounce the word directly; otherwise it generates the pronunciation of the word's spelling.
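
To make the fallback concrete, here is a toy sketch in Python. It is not Makedict's rule set, which is not described in this thread. The "has a vowel" test is an assumed stand-in for "the rules produced a pronunciation", and the function names are hypothetical. The real rules clearly differ in places: the thread below reports DSL being pronounced as a word, which this sketch would spell out.

```python
# Toy illustration only: these are NOT Makedict's actual rules.
VOWELS = set("aeiou")


def looks_pronounceable(word):
    """Assumed stand-in for 'the rules found a pronunciation':
    the word contains only letters and at least one vowel."""
    return word.isalpha() and any(ch in VOWELS for ch in word.lower())


def spoken_form(word):
    """Return the word itself if it can be spoken directly,
    otherwise its spelling, one character at a time."""
    if looks_pronounceable(word):
        return word.lower()
    # Fallback: the user says the characters one by one.
    return " ".join(word.lower())
```

With this sketch, spoken_form("editor") gives "editor", while spoken_form("http") gives "h t t p".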

@helge
We have kept the desktop interaction part separate from the speech recognition part. Communication between the two happens through sockets: the speech recognizer sends the recognized text to the desktop part. So if you want to use any other speech recognizer, all you need to do is create a socket connection with the desktop part and send the recognized text.
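
A minimal sketch of what such a recognizer-side client could look like, in Python. The host, the port number, and the newline-terminated UTF-8 framing are assumptions made for illustration; the actual address and wire format used by the desktop part are not stated in this thread, so check the source before relying on any of them.

```python
import socket


def send_recognized_text(text, host="127.0.0.1", port=5000):
    """Send one recognized phrase to the desktop part over TCP.

    host, port and the trailing newline are hypothetical defaults,
    not values taken from VEDICS.
    """
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
```

A recognizer would call send_recognized_text("run text editor") each time it produces a result.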

The desktop part writes the set of words currently visible on the screen to a file. That file can be used by any speech recognizer to create its grammar and dictionary files.
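
As a rough sketch of the grammar half of that step, the Python below turns a word list into a JSGF grammar, the grammar format Sphinx-4 accepts. It assumes the word file holds one word per line, which is a guess about the file layout. The function names and the grammar name are hypothetical, and the dictionary (pronunciation) file is not covered here.

```python
def read_words(path):
    """Read the word file, assuming one word per line (an assumption)."""
    with open(path, encoding="utf-8") as fh:
        return fh.read().splitlines()


def build_jsgf(words, name="screen_words"):
    """Build a JSGF grammar whose public rule matches any one listed word."""
    # Normalise, drop blanks and duplicates, and sort for stable output.
    unique = sorted({w.strip().lower() for w in words if w.strip()})
    if not unique:
        raise ValueError("no words to build a grammar from")
    alternatives = " | ".join(unique)
    return f"#JSGF V1.0;\ngrammar {name};\npublic <word> = {alternatives};\n"
```

Words containing characters that are special in JSGF would need quoting or filtering first; this sketch does not handle that.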

2010/5/21 bharat joshi <bharatjoshi1 gmail com>

Hi,

To answer your first question: Vedics is context based. It generates the word set based on what is accessible on the front end.
To answer your second question: Vedics uses MAKEDICT (https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/logios/), which generates the pronunciation of the words given to it. These words are passed to MAKEDICT by the C code of VEDICS, which uses AT-SPI to find the accessible elements. It recognizes English words perfectly, and it also recognizes words of other languages. We have tried it with words from Hindi and Kannada (Indian languages), written in English script. Check out the video "Vedics screencast, Termination commands". In that video we create files with odd names like "q6pw" and "pcw", which cannot be part of any language, yet Vedics generated pronunciations for them and recognized them.

There are some cases to be considered here.
Abbreviations:
    "http" is pronounced as "h t t p" itself, whereas
    "DSL" is pronounced as "disel", because Makedict was able to make a proper pronunciation for it.
    "cdrom" would be "c drom".

"Nesanga Nenena" is from the Telugu language, but Vedics recognized it perfectly.

Users will need to do some trial and error with words from other languages.

We are still in the development phase.
We still have some features to add to Vedics, such as spelling out words.
We have not yet written the installation program, which is why you could not find one in the repo. It will take a week or two, once we finish our exams, to write it.

One more important feature to add to the list:
8. No training required. We have tested Vedics with different users; the 8 videos were made by 3 different people. A person can install it, get familiar with the commands, and start using it directly, without any voice training. Vedics is speaker independent.

2010/5/21 José Félix Ontañón <felixonta gmail com>

On 21 May 2010 at 11:01, bharat joshi <bharatjoshi1 gmail com> wrote:

Hi,

Yes, we know about GVC (gnome-voice-control), and we have tried it too.
You could say VEDICS is a superset of GVC.
Some of the key features of VEDICS are:

1. Accuracy is much better, as we use SPHINX-4.
2. File system navigation: navigating files and folders is very easy.
3. Recognizes anything: Vedics is dynamic, in the sense that it generates words and their pronunciations dynamically. For example, if we take a simple command like "run text editor", the front end changes as the editor opens. Vedics generates a new list of words from the front end and produces their pronunciation and grammar files. This lets Vedics recognize any word. It can even recognize junk words like "hsjft".
4. We can pause and resume VEDICS by voice using the "stop listening" and "start listening" commands. In GVC, people had to use the mouse to do it. You can also quit Vedics by voice.
5. Works perfectly on Ubuntu 9.10 and 10.04.
6. Can access any element, including checkboxes, radio buttons, links, lists, etc.
7. Popup menus, like the one that opens on right click, are also accessible.

I'm fascinated by the power of feature 3: it can generate pronunciations and grammar on the fly, based on context?
I suppose you mean that Vedics can recognize any word, but only in English, right? What about other languages?
Do we need a text/voice corpus to feed and train it for other languages?

And about feature 5, do you have precompiled binaries or even Debian packages for testing? I can't find anything on SourceForge other than the svn repo.

Cheers!

2010/5/21 José Félix Ontañón <felixonta gmail com>

2010/5/21 Nischal Rao <rao nischal gmail com>

Hi,

Some of my friends and I have created speech assistant software for Linux called VEDICS (Voice Enabled Desktop Interaction and Control System). Using this software, the user can access any element found on the screen through speech. The user can also navigate the filesystem through speech.

We have created some demo screencasts of the software:

1. Accessing the gnome panel and application.
http://www.youtube.com/watch?v=WrVaJXtv0WU

2. Changing the theme and background.
http://www.youtube.com/watch?v=zRgX94qGj3g

3. Navigating directories and playing songs:
http://www.youtube.com/watch?v=kVQwAoeIavk

4. Running a slide show:
http://www.youtube.com/watch?v=JtzA8TFwvuI

5. Running default applications and window operations:
http://www.youtube.com/watch?v=iCEANbu8p50

6. Stopping and starting vedics:
http://www.youtube.com/watch?v=TLFtdrlt3lM

7. Creating and deleting files:
http://www.youtube.com/watch?v=_3CFAl22h2o

8. Navigating links:
http://www.youtube.com/watch?v=AufBaaJazKU

Currently the software doesn't support the dictation facility. However, we are planning to add this feature in the future.
The best part of this software is that it is speaker independent, no training is required and it can recognize words not present in the English dictionary.

You can find the source code at: http://sourceforge.net/projects/vedics/

Hi Nischal,

Congrats! The screencasts are amazing and, as I can see on SourceForge, it relies on AT-SPI for discovering the elements that can be commanded, doesn't it?

I suppose you know about gnome-voice-control; both projects even share Sphinx for speech recognition. So, how do you think Vedics differs from gnome-voice-control or improves on it?

Cheers!

--
http://fontanon.org

_______________________________________________
gnome-accessibility-list mailing list
gnome-accessibility-list gnome org
http://mail.gnome.org/mailman/listinfo/gnome-accessibility-list

--
Regards,
Bharat Joshi

--
regards,
Nischal E Rao
blogs.sun.com/nischal

Join RVCE OSUM at http://osum.sun.com/group/rvceosum
