Re: New developments on Caribou
- From: Matteo Vescovi <matteo vescovi yahoo co uk>
- To: Francesco Fumanti <francesco fumanti gmx net>
- Cc: marmuta <marmvta googlemail com>, gnome-accessibility-list gnome org
- Subject: Re: New developments on Caribou
- Date: Fri, 07 May 2010 11:44:16 +0100
Francesco Fumanti wrote:
> There is work ongoing to create a word prediction service over D-Bus
> for the onscreen keyboard onboard. (onboard is the default onscreen
> keyboard shipping with Ubuntu.)
> At some point, there was also talk of sharing it with Caribou. It uses
> n-gram language modeling. If you want to have a look at it, you can
> find it in the word completion branch of onboard:
> https://code.launchpad.net/onboard

I had a look at the onboard word-completion branch, great stuff!
I think there is scope to join forces between presage and onboard.
presage is architected to merge predictions generated by a set of
predictors. Each predictor uses a different language model/predictive
algorithm to generate predictions.
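
To make the merging idea concrete, here is a minimal sketch in Python. It is not presage's actual API or combination policy: the function name `merge_predictions`, the per-predictor weights and the "sum the weighted probabilities, then renormalise" rule are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_predictions(predictions, weights=None, top_k=5):
    """Combine (token, probability) lists coming from several predictors.

    predictions: dict mapping a predictor name to a list of (token, prob).
    weights: optional dict mapping a predictor name to a weight (default 1.0).
    The combination rule (weighted sum, then renormalise) is an assumption,
    not necessarily what presage does.
    """
    weights = weights or {}
    combined = defaultdict(float)
    for name, suggestions in predictions.items():
        w = weights.get(name, 1.0)
        for token, prob in suggestions:
            # A token suggested by several predictors accumulates score.
            combined[token] += w * prob

    total = sum(combined.values())
    if total == 0:
        return []

    # Highest score first; ties are broken alphabetically.
    ranked = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(token, score / total) for token, score in ranked[:top_k]]


# Hypothetical output of two predictors for the prefix "hel".
example = {
    "dictionary": [("hello", 0.5), ("help", 0.5)],
    "recency": [("help", 0.8), ("helmet", 0.2)],
}
merged = merge_predictions(example)
```

With this rule, a token proposed by more than one predictor ("help" above) is ranked ahead of tokens proposed by only one.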
Currently presage provides the following predictors:
- ARPA predictor: uses statistical language modelling data in the ARPA
  N-gram format
- generalized smoothed n-gram statistical predictor: works with n-grams
  of arbitrary cardinality
- recency predictor: based on the recency promotion principle
- dictionary predictor: generates a prediction by returning tokens that
  are a completion of the current prefix, in alphabetical order
- abbreviation expansion predictor: maps the current prefix to a token
  and returns the token in a prediction with a 1.0 probability
- dejavu predictor: learns and then later reproduces previously seen
  text sequences
A bit more information on how these predictors work is available here:
http://presage.sourceforge.net/?q=node/15
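
As a rough illustration of what a smoothed n-gram predictor of arbitrary order might compute, here is a sketch that linearly interpolates relative frequencies of increasing order. The function names, the `deltas` weights and the choice of linear interpolation are assumptions for this example; the page above describes what presage really does.

```python
from collections import Counter


def count_ngrams(tokens, max_order):
    """Count every n-gram of order 1..max_order in a list of tokens."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def interpolated_probability(word, context, counts, total_tokens, deltas):
    """Estimate P(word | context) by linear interpolation.

    deltas[k] is the weight of the (k+1)-gram estimate, so len(deltas)
    sets the highest order used. The weights should sum to 1.
    """
    prob = 0.0
    for k, delta in enumerate(deltas):
        if k > len(context):
            continue  # not enough history for this order
        history = tuple(context[-k:]) if k else ()
        numerator = counts[history + (word,)]
        denominator = counts[history] if k else total_tokens
        if denominator:
            prob += delta * numerator / denominator
    return prob


tokens = "the cat sat on the mat the cat ran".split()
counts = count_ngrams(tokens, max_order=2)
# Weight 0.3 on the unigram estimate and 0.7 on the bigram estimate.
p_cat = interpolated_probability("cat", ["the"], counts, len(tokens), (0.3, 0.7))
p_mat = interpolated_probability("mat", ["the"], counts, len(tokens), (0.3, 0.7))
```

Passing a longer `deltas` tuple (and a matching `max_order`) extends the same code to trigrams and beyond.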
It sounds like the language model and predictive algorithm used in the
onboard word-prediction branch are ideal candidates for integration
into presage as a new presage predictor class.
presage could then be the engine used to power the D-Bus prediction
service, offering the predictive capabilities of the onboard language
model/predictor, plus all the predictors currently provided by presage
(all of which can be turned on/off and configured to suit individual needs).
The presage core library itself has minimal dependencies: it pretty much
only needs a C++ runtime and sqlite, which is used as the backing store
for n-gram based language models (this ensures fast access, a minimal
memory footprint and no delay while loading the language model into memory).
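
To show why a sqlite backing store avoids loading the whole model up front, here is a small sketch using Python's standard `sqlite3` module. The `bigram` table, its columns and the helper names are hypothetical and are not presage's real schema; the point is only that each prediction is an indexed query rather than a scan of an in-memory model.

```python
import sqlite3


def build_store(path, tokens):
    """Store bigram counts in a sqlite database (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bigram ("
        "prev TEXT NOT NULL, word TEXT NOT NULL, "
        "count INTEGER NOT NULL, PRIMARY KEY (prev, word))"
    )
    for prev, word in zip(tokens, tokens[1:]):
        # Create the row on first sight, then bump its counter.
        conn.execute(
            "INSERT OR IGNORE INTO bigram (prev, word, count) VALUES (?, ?, 0)",
            (prev, word),
        )
        conn.execute(
            "UPDATE bigram SET count = count + 1 WHERE prev = ? AND word = ?",
            (prev, word),
        )
    conn.commit()
    return conn


def complete(conn, prev, prefix, limit=5):
    """Return (word, count) rows that follow `prev` and start with `prefix`.

    LIKE treats % and _ in the prefix as wildcards and is case-insensitive
    for ASCII by default; a real implementation would need to handle that.
    """
    rows = conn.execute(
        "SELECT word, count FROM bigram "
        "WHERE prev = ? AND word LIKE ? "
        "ORDER BY count DESC, word ASC LIMIT ?",
        (prev, prefix + "%", limit),
    )
    return rows.fetchall()


# An in-memory database keeps the example self-contained; a file path
# would give the persistent store described above.
conn = build_store(":memory:", "the cat sat on the mat the cat ran".split())
after_the = complete(conn, "the", "")
after_the_m = complete(conn, "the", "m")
```

Only the rows matching the current context and prefix are read, so startup cost does not grow with the size of the model.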

> For details about the word prediction service, please contact marmuta,
> who did nearly all the work on the word prediction service.

I'll follow up with marmuta to discuss the feasibility of making this
happen and work out the technical details, in case there is consensus to
go ahead with this.
Cheers,
- Matteo