Re: New developments on Caribou



Francesco Fumanti wrote:
There is work ongoing to create a word prediction service over D-Bus for the onscreen keyboard onboard. (onboard is the default onscreen keyboard shipping with Ubuntu.) At some point, there was also talk of sharing it with Caribou. It uses n-gram language modelling. If you want to have a look at it, you can find it in the word completion branch of onboard:
https://code.launchpad.net/onboard

I had a look at the onboard word-completion branch, great stuff!

I think there is scope to join forces between presage and onboard.

presage is architected to merge predictions generated by a set of predictors. Each predictor uses a different language model/predictive algorithm to generate predictions.
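
To make that concrete, here is a small illustrative sketch of the idea in C++. The names below are made up for the example and are not the actual presage API; think of it as the shape of the design rather than the implementation:

    // Simplified illustration of the presage approach: each predictor
    // generates candidate suggestions with a probability, and the engine
    // merges them into a single ranked prediction. Names are hypothetical.
    #include <algorithm>
    #include <map>
    #include <memory>
    #include <string>
    #include <vector>

    struct Suggestion {
        std::string token;
        double probability;
    };

    // Hypothetical predictor interface: every language model/algorithm
    // implements predict() against the text entered so far.
    class Predictor {
    public:
        virtual ~Predictor() {}
        virtual std::vector<Suggestion> predict(const std::string& context) = 0;
    };

    // Merge suggestions from all active predictors, keeping the best
    // probability seen for each token and returning them ranked.
    std::vector<Suggestion> merge(
            const std::vector<std::unique_ptr<Predictor>>& predictors,
            const std::string& context) {
        std::map<std::string, double> best;
        for (const auto& p : predictors) {
            for (const Suggestion& s : p->predict(context)) {
                if (s.probability > best[s.token])
                    best[s.token] = s.probability;
            }
        }
        std::vector<Suggestion> result;
        for (const auto& kv : best)
            result.push_back({kv.first, kv.second});
        std::sort(result.begin(), result.end(),
                  [](const Suggestion& a, const Suggestion& b) {
                      return a.probability > b.probability;
                  });
        return result;
    }

The point is that any language model able to produce (token, probability) pairs for the current context can be plugged in as another predictor.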

Currently presage provides the following predictors:
ARPA predictor: uses statistical language modelling data in the ARPA N-gram format
generalized smoothed n-gram statistical predictor: can work with n-grams of arbitrary cardinality
recency predictor: based on the recency promotion principle
dictionary predictor: generates a prediction by returning tokens that are completions of the current prefix, in alphabetical order (a toy sketch of this idea follows below)
abbreviation expansion predictor: maps the current prefix to a token and returns that token in a prediction with a 1.0 probability
dejavu predictor: learns and then later reproduces previously seen text sequences
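
As a rough illustration of the dictionary predictor idea mentioned above, here is a toy prefix-completion function; the sorted word list, the search strategy and the result limit are illustrative only, not the actual presage code:

    // Toy version of the dictionary predictor concept: return the words
    // from an alphabetically sorted word list that start with the
    // current prefix, in alphabetical order.
    #include <algorithm>
    #include <string>
    #include <vector>

    std::vector<std::string> complete_prefix(
            const std::vector<std::string>& sorted_words,  // must be sorted
            const std::string& prefix,
            std::size_t max_results) {
        std::vector<std::string> completions;
        // Binary search for the first word >= prefix, then collect the
        // following words while they still start with the prefix.
        auto it = std::lower_bound(sorted_words.begin(), sorted_words.end(),
                                   prefix);
        for (; it != sorted_words.end() && completions.size() < max_results;
             ++it) {
            if (it->compare(0, prefix.size(), prefix) != 0)
                break;
            completions.push_back(*it);
        }
        return completions;
    }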

A bit more information on how these predictors work is available here: http://presage.sourceforge.net/?q=node/15


It sounds like the language model and predictive algorithm used in the onboard word-prediction branch would be an ideal candidate for integration into presage as a new predictor class.

presage could then be the engine powering the D-Bus prediction service, offering the predictive capabilities of the onboard language model/predictor, plus all the predictors currently provided by presage (all of which can be turned on/off and configured to suit individual needs).
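
To give an idea of what using such a service could look like, here is a client-side sketch using GDBus. The bus name, object path, interface and "Predict" method are placeholders I made up for the example; the real names would be whatever the onboard/presage D-Bus service ends up exposing:

    // Hypothetical client of a D-Bus word prediction service.
    // Build against gio-2.0, e.g.:
    //   g++ client.cpp $(pkg-config --cflags --libs gio-2.0)
    #include <gio/gio.h>
    #include <stdio.h>

    int main() {
        GError* error = NULL;
        GDBusConnection* bus = g_bus_get_sync(G_BUS_TYPE_SESSION, NULL, &error);
        if (!bus) {
            fprintf(stderr, "bus error: %s\n", error->message);
            g_error_free(error);
            return 1;
        }

        // Hypothetical method: Predict(context: s) -> (suggestions: as)
        GVariant* reply = g_dbus_connection_call_sync(
            bus,
            "org.example.WordPrediction",   /* placeholder bus name  */
            "/org/example/WordPrediction",  /* placeholder object    */
            "org.example.WordPrediction",   /* placeholder interface */
            "Predict",                      /* placeholder method    */
            g_variant_new("(s)", "the quick brown f"),
            G_VARIANT_TYPE("(as)"),
            G_DBUS_CALL_FLAGS_NONE, -1, NULL, &error);
        if (!reply) {
            fprintf(stderr, "call error: %s\n", error->message);
            g_error_free(error);
            return 1;
        }

        gchar** suggestions = NULL;
        g_variant_get(reply, "(^as)", &suggestions);
        for (gchar** s = suggestions; s && *s; ++s)
            printf("%s\n", *s);

        g_strfreev(suggestions);
        g_variant_unref(reply);
        g_object_unref(bus);
        return 0;
    }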


The presage core library itself has minimal dependencies: it pretty much only needs a C++ runtime and sqlite, which is used as the backing store for n-gram based language models (this ensures fast access, a minimal memory footprint and no delays while loading the language model into memory).
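
For illustration, this is roughly the kind of query an sqlite-backed n-gram store performs; the table layout below is a simplified example, not presage's actual schema:

    // Sketch of an sqlite-backed bigram lookup: the counts stay in the
    // database and only the rows matching the current context are read.
    // Assumes an illustrative table: bigram(word_1 TEXT, word TEXT,
    // count INTEGER).
    #include <sqlite3.h>
    #include <string>
    #include <vector>

    // Return up to 'limit' candidate words following 'prev', most
    // frequent first.
    std::vector<std::string> bigram_candidates(sqlite3* db,
                                               const std::string& prev,
                                               int limit) {
        std::vector<std::string> words;
        const char* sql =
            "SELECT word FROM bigram WHERE word_1 = ? "
            "ORDER BY count DESC LIMIT ?";
        sqlite3_stmt* stmt = NULL;
        if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK)
            return words;
        sqlite3_bind_text(stmt, 1, prev.c_str(), -1, SQLITE_TRANSIENT);
        sqlite3_bind_int(stmt, 2, limit);
        while (sqlite3_step(stmt) == SQLITE_ROW)
            words.push_back(reinterpret_cast<const char*>(
                sqlite3_column_text(stmt, 0)));
        sqlite3_finalize(stmt);
        return words;
    }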


For details about the word prediction service, please contact marmuta, who did nearly all the work on it.

I'll follow up with marmuta to discuss the feasibility of making this happen and work out the technical details, in case there is consensus to go ahead with this.


Cheers,
- Matteo


