Cluehunting: Expanding The Future



The following is cut-and-pasted from my web site.  I am submitting this for
peer review to the group before publicly advocating it.  Have at it, and
PLEASE PLEASE PLEASE read this all the way through *BEFORE* responding--many
of the later items address holes that the earlier ones open.

Note:  THIS IS MUCH MUCH MUCH easier to read online at
http://www.best.com/~effugas.  Future revisions will be posted there.

=======


WHERE CLUEHUNTING COMES FROM

Cluehunting is an advanced Expansion Agent, defined as a system that allows
the computer to search possible “expansions” throughout given contexts given
a “clue” by the user.  Clues are defined as segments of data(type
irrelevant) that the computer would be able to utilize to predict the final
contents of the user’s intention.  Expansions are the presumed intentions of
the user.  Finally, contexts are the “search space” that is being
scanned--the file system context, the launcher context, or even a
thesaurus/spell check context are all valid options.

It would be completely unfair to describe cluehunting as a totally original
concept--it stands, if you will, on the shoulders of giants.  Tab-completion
is the oldie, and as far as I know originated with the Unix shell tcsh,
though it’s also a hidden option in the NT command shell.  This technology
is quite file-system specific:  Enter as much as you know about a path,
starting with the root, and tab complete will expand what you type to fit.
For example--enter /usr/home/eff and hit tab, and you will be given the
first entry in /usr/home/ that begins with “eff”.  Some limited regular
expressions are allowed--for example, if I’m in the directory
/usr/home/effugas and type unzip *.zip, I will be able to tab through each
zip file in my home directory.  Very slick.

Tab Completion is nice, but it has its flaws.  First of all, tab has become
the de facto standard for “advance to next field” in GUIs, and there’s no
way I want to get rid of one of the best keyboard timesavers in existence.
Secondly, it searches files and only files.  There are other search contexts
that should be hit.  Finally, tab-complete provides no way to expand into
anything but a single entry--what if the user didn’t want just one of the
group, but wanted to expand into all entries that fit the given form?  In
other words, what if all of *.zip were inserted instead of just one zip?
That would be logical in a number of situations.

Tab Complete’s newborn sibling, Autocomplete, was an innovation that began
at the much maligned UI shop known as Microsoft and was later adopted by
Netscape for its Communicator browser.  As Microsoft integrated Internet
Explorer and Windows Explorer, both the Run Dialog and the Web Open Dialog
possess Autocomplete functionality.  (Actually, Microsoft Word will also
Autocomplete anything you type that is related to a few known categories,
i.e. date, author name, etc., but I’ll deal with this later.)  So what does
this bring to the table?  Well, we see the beginnings of clue contexts
showing up here, since at first glance it appears that the run menu will
autocomplete files and the web browser will autocomplete web sites.  But
these are both searches of the same clue context--the history context, in
which things that have been typed before are called back to be expanded back
into reality.  And how does Autocomplete expand entries?  In the middle of
typing, inverted text will appear containing the contents of what the
computer is guessing the user is trying to get at.  This text will only
appear to the next valid level--http:// will expand to http://www.best.com,
but it will not expand to http://www.best.com/~effugas nor
http://www.best.com/~effugas/Personal/SILC/silc.html.  There’s no way to
really scroll through possible entries in this history-based
autocomplete--the first thing that matches will be matched to its first
level, and that’s all you get.  Worse, sometimes a delay in typing is
required to simply trigger an autocomplete.  Still, this functionality is
total joy, even with all of its warts.




WHAT’S NEW IN CLUEHUNTING

Cluehunting specifies the following advancements beyond present-day
expansion technology:
Universal Expansions
Inputstream Aware Expansion Styles
Application-Dependent Clue Contexts
Clue Context Overrides
Pluggable Context Indexes
Regular Expressions
Batch Expansions
Cluelists
General Accessibility
Definitions help, of course:
Universal Expansions:  Expansion should be available in all interface
components.  The primary limitation of present expansion methods is that
they can’t really be available everywhere.  Cluehunting is designed to allow
every interface construct to read the intentions of the user.  It is the
purpose of the next eight points to make sure that this works, and works
well.

Inputstream Aware Expansion Styles:  Segmented streams of input data ought
to implement commanded expansion, while unified inputstreams may take
advantage of automatic expansion.  A little background is going to be
necessary to understand this.  First, you can’t outclass something you can’t
recognize the class of.  That being said, let’s talk about Microsoft’s UI
department.  Take Microsoft Word 95/97.  Red and green spelling and grammar
warning underlines are excellent interface components.  They’re unobtrusive
enough to ignore in the heat of thought, yet available enough to make it
difficult to miss misspelled or inappropriate words.  I miss them any time I
type in anything else.  They enhance the feedback loop of the inputstream.
The inputstream is defined as the flow of commands from the user to the
computer as well as any information fed back along the same channels as the
input--for example, a clock in the lower right hand corner is not part of
the inputstream, but the characters that pop up in response to the
corresponding keys being pressed on the keyboard are.  What does not
work in Word, however, is Autocomplete.  When I type Dan, I’m not always
talking about myself, and when I type August, I’m not always talking about
the present date.  I don’t want to have to interrupt my stream of thought to
correct Word--my concepts are segmented into words, sentences,
paragraphs, and full documents.  This contrasts sharply with the very
appropriate and useful usage of autocomplete for web sites, which have
addresses that are single-phrase and thus unified.  Therefore, while Word
and any other segmented inputstream receiver ought to require a key to be
pressed before the phrase is expanded(though a graphical hint like a
different cursor would help), Netscape should attempt to expand
automatically.  NOTE:  Research is required to make sure this inconsistency
does not overly confuse users.  It is very possible that automatically
triggering an expansion in unified instances but delaying expansions in
segmented cases is utterly confusing to users.  In this case, I’d lean
towards a completely delayed expansion interface.

Application-Dependent Clue Contexts:  Applications should search multiple
clue contexts appropriate to the active application context.  Strange words
coming from someone who worships consistency in user interfaces, but I
really think this is necessary.  Applications generate context, and all
clues should not expand from some single chosen source.  For example:
Suppose I enter the word “liffe” into a word processor.  The ideal word
processor would notify the user immediately and non-intrusively that the
word was misspelled.  Obviously, the appropriate clue context for a
misspelled word is to search through alternative correct spellings.
Multiple presses of the Continue Cluehunt keybinding would search through
multiple alternative spellings, until the user chose to press either the
Cancel Cluehunt keybinding(probably Escape) to revert to the misspelled form
or to press the Cluehunt Successful keybinding(probably Enter).  The user
could, of course, reselect the correctly spelled word, and this time search
through the default context for a correctly spelled word:  the thesaurus.
So, life would be replaced with various synonyms--or, the thesaurus dialog
could come up to provide a multidimensional search between life-as-vocation,
life-as-socialness, or life-as-complete-lack-thereof.  All that cluehunting
specifies is a precondition and a postcondition--dialogs do not violate
this.  It would be preferable if these weren’t modal dialogs, however--it is
rarely appropriate for the user to be locked out of his or her document.
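
A rough sketch of the keybinding cycle described above, in Python purely for
illustration.  The names CluehuntSession and pick_context are hypothetical,
and the rule "misspelled words go to the spelling context, everything else
goes to the thesaurus" is only an assumption drawn from the "liffe" example:

```python
class CluehuntSession:
    """Cycles through expansions of one clue until accepted or cancelled."""

    def __init__(self, clue, expansions):
        self.clue = clue
        self.expansions = list(expansions)
        self.position = -1  # -1 means the original clue is still showing

    def current(self):
        """Return the text currently showing at the cursor."""
        if self.position < 0:
            return self.clue
        return self.expansions[self.position]

    def continue_hunt(self):
        """Continue Cluehunt: show the next candidate, wrapping around."""
        if self.expansions:
            self.position = (self.position + 1) % len(self.expansions)
        return self.current()

    def cancel(self):
        """Cancel Cluehunt: revert to what the user originally typed."""
        self.position = -1
        return self.clue

    def accept(self):
        """Cluehunt Successful: keep whatever is currently showing."""
        return self.current()


def pick_context(word, dictionary):
    """Assumed rule: unknown words get spelling fixes, known words synonyms."""
    return "thesaurus" if word.lower() in dictionary else "spelling"
```

With the clue "liffe" and the candidates "life" and "lift", two presses of
Continue Cluehunt would show "life" and then "lift", and Cancel Cluehunt
would put "liffe" back.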

Pluggable Context Indexes:  Clue contexts, either attached to an application
or independent, should register themselves with a central index.  This index
of clue contexts would be categorized either by type or by owner
application, would have MRU(most recently used) lists, and would be
reconfigurable by the user.
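
One possible shape for such an index, sketched in Python.  Everything here
(the ContextIndex class, its method names, and the idea that a context is
just a function from a clue to a list of expansions) is an assumption for
illustration, not a defined interface:

```python
class ContextIndex:
    """Central registry of clue contexts with a most-recently-used list."""

    def __init__(self):
        self._contexts = {}  # name -> (category, owner, expand function)
        self._mru = []       # context names, most recently used first

    def register(self, name, expand, category, owner=None):
        """Add a context; owner is None for application-independent ones."""
        self._contexts[name] = (category, owner, expand)

    def by_category(self, category):
        """List registered context names in a category, alphabetically."""
        return sorted(name for name, (cat, _, _) in self._contexts.items()
                      if cat == category)

    def by_owner(self, owner):
        """List registered context names attached to one application."""
        return sorted(name for name, (_, own, _) in self._contexts.items()
                      if own == owner)

    def expand(self, name, clue):
        """Run a context on a clue and move it to the front of the MRU list."""
        _, _, expand_fn = self._contexts[name]
        if name in self._mru:
            self._mru.remove(name)
        self._mru.insert(0, name)
        return expand_fn(clue)

    def recent(self, limit=5):
        """Return up to `limit` context names, most recently used first."""
        return self._mru[:limit]
```

User reconfiguration could then be as simple as letting the user reorder or
remove entries in this index.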

Clue Context Overrides:  The user should be able to specify a specific clue
context to expand from, in either a proactive manner or a reactive manner.
Despite the fact that applications often have contexts that make sense, there
are times when the user has another context in mind.  For example, the user
should be able to access the Thesaurus context while saving a file, or the
filesystem context while documenting an application, or the web history
context while creating a web page of links.  This would be implemented with
a Set Clue Context keybinding which would modify the present word’s clue
context--a reactive override.  If the user had not yet typed a word, the
next word would be the recipient of the entered context--this would be a
proactive override.  Contexts would be registered upon install as per the
plug-in clue context interface, and manipulable via replaceable dialogs.
Most probably, some degree of categorization would be appropriate, as well
as expansion on the clue context type itself.  (In other words, a box would
be given, and you’d type in Th and Thesaurus might come up).  Of course,
common clue contexts should be automatically recognized.  A user typing in a
path in any application, for example, should usually first trigger the file
system history context, and then the literal file system search context.
Similar results should await a user typing http://.  However, there is an
advantage to being able to select a context.  By selecting the Execute
Command context, the user could load any app directly from within any other
app and have the stdout reply be pasted at the cursor.  Much like ircii’s
/exec command, this would allow the contents of, say, an ls to be directly
pasted at the cursor.  Quite nice.
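
A minimal sketch of how the automatic recognition and the override might fit
together, in Python.  The context names ("filesystem-history" and so on), the
path prefixes checked, and the choice of web history for http:// clues are
all assumptions; only subprocess.run and shlex.split are real library calls:

```python
import shlex
import subprocess


def contexts_for(clue, default, override=None):
    """Return the ordered list of contexts to search for a clue."""
    if override is not None:
        # Set Clue Context was used: the user's choice wins outright.
        return [override]
    if clue.startswith("http://"):
        return ["web-history"]
    if clue.startswith(("/", "~/", "./", "../")):
        # Paths try the history first, then the literal file system.
        return ["filesystem-history", "filesystem-search"]
    return [default]


def execute_context(command):
    """Execute Command context: run a command and return its stdout as text."""
    result = subprocess.run(shlex.split(command), capture_output=True,
                            text=True, check=False)
    return result.stdout
```

Under these assumptions, execute_context("ls") would return the directory
listing as a string, ready to be pasted at the cursor.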

Regular Expressions:  Regular Expressions should be available for usage in
clue expansions.  Many users are familiar with using * to signify a
wildcard.  While the default expansion would, in general, presume a * at the
end of the provided clue and expand from there, there is no reason this is
necessary.  A user searching for dictionary words that end with “sort”
should be able to expand *sort into resort, consort, and plain old sort.
The only problem--how to differentiate between a clue containing a regex for
search purposes(execute context for ls -l *.gz) versus a clue that wants its
regex expanded before search(command history context for ls -l *.gz).  It’s
quite probable that most contexts will only fit one or the other, but I’m
unsure.  Email me if you think that a specific “begin regex” keybinding
would be necessary.
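
For the *sort example, a minimal Python sketch.  The * above is really a
shell-style wildcard rather than a full regular expression, so this uses the
standard fnmatch module; the function name expand_clue and the word list are
made up for illustration:

```python
import fnmatch


def expand_clue(clue, candidates):
    """Return the candidates a clue could expand to, in their given order.

    A clue with no wildcard characters gets an implicit trailing "*",
    which is the default behavior described above.
    """
    has_wildcard = any(ch in clue for ch in "*?[")
    pattern = clue if has_wildcard else clue + "*"
    return [word for word in candidates
            if fnmatch.fnmatchcase(word, pattern)]


words = ["resort", "consort", "sort", "sorted", "sortie"]
print(expand_clue("*sort", words))  # the words ending in "sort"
print(expand_clue("sort", words))   # the words beginning with "sort"
```

This sketch always expands the wildcard itself; it does not settle the open
question of when a context should instead receive the wildcard untouched.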

Batch Expansions:  All entries that fit the provided clue should be
available for simultaneous expansion.  Through an “expand all” keybinding,
the contents of all clues that fit the given context should be pasted at the
cursor.  This facilitates things such as “gunzip *.gz” being expanded into a
list of all files to be gunzipped, allowing the user to make sure the shell
was expanding the list correctly, among other uses.
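
A small Python sketch of what an “expand all” keybinding might do to the
text at the cursor.  The function name expand_all is hypothetical, and
leaving a pattern untouched when nothing matches is an assumption:

```python
import fnmatch


def expand_all(command_line, filenames):
    """Replace each wildcard word with every matching filename, sorted.

    Words without wildcards, and wildcard words that match nothing,
    are left exactly as typed.
    """
    expanded = []
    for word in command_line.split():
        matches = []
        if any(ch in word for ch in "*?["):
            matches = sorted(name for name in filenames
                             if fnmatch.fnmatchcase(name, word))
        expanded.extend(matches or [word])
    return " ".join(expanded)
```

Given a directory holding a.gz, b.gz and notes.txt, the clue "gunzip *.gz"
would become "gunzip a.gz b.gz" in place, before the shell ever sees it.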

Cluelists:  All entries that fit the provided clue should be listable in a
multiselectable sortable dialog.  In some ways, a basic version of this is
part of Microsoft Word 97:  Right click on a misspelled word and note the
four or five alternate correct spellings right there in front of you.  Most
GUI web browsers also allow you to search the typed-in history by clicking
on the down arrow at the far right of the entry bar.  Cluelists extend this
behavior by allowing the user a listmode or detailsmode(more windowspeak, so
shoot me) interface to select between multiple options for expansion.
Suppose the user wants to gunzip a couple of his or her .gz files.  Simply
typing gunzip *.gz inside of a cluehunt-enabled xterm and pressing the
“cluelist” keybinding would generate a window containing a list of all files
ending in “.gz”.  Then, the user would control-click or shift-click the
specific gzipped files desired to be expanded, press OK, and hit enter to
cause those files to be gunzipped.

General Accessibility:  All capabilities of cluehunting must be accessible
by mouse as well as by keyboard.  It is critical that Cluehunting be part of
a self-documenting interface, defined as an interface that bolsters the
user’s understanding and mapping of available options.  One major way to
make an interface self-documenting is to provide multiple paths to the same
destination that reference each other.  Right-clicking on a batch of text
should either bring up a single menu item containing “cluehunt” or a list of
all the cluehunting options directly in the root right-click--research will
be necessary to see which is preferable.  Now, of course, each entry in the
right-click menu would contain the keyboard shortcut right-justified, and
the corresponding shortcut would be listed in the keybox(dev-note:  Will be
explained in upcoming proposal).  Pretty slick.



DEFAULT CLUEHUNTING KEYBINDINGS

Well, I’ll be blunt:  We’re still working on a default keyspace for
GNOME-compliant apps.  However, the following is a preliminary set of
keybindings for cluehunting:
Cluehunt Forwards:  Alt-Shift-Right Arrow
Cluehunt Backwards: Alt-Shift-Left Arrow
Accept Cluehunt:  Anything that moves the cursor.  Enter has its
functionality modified to not clear the contents of the expansion.
Reject Cluehunt:  Esc
Expand All:  Alt-Shift-Enter
Scroll Through Cluelist:  Alt-Shift-Up and Down.



THE FUTURE OF CLUEHUNTING

Cluehunting is a developed proposal, but it’s still in development.
Research will be needed to check for areas of confusion and functionality.
Still to be determined:
How to notify the user that the existing text is expandable via a cluehunt?
Different cursors, different text colors, a note in the title bar...?
How to implement cluehunting?  One possible way is to simply have a
directory structure that corresponds to individual clue contexts and
contains standard stdin/stdout apps that take in the appropriate segment and
spit out a return value.  Implementation isn’t that much of an issue,
though--possibility is more relevant than methodology.
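
To make the directory idea a bit more concrete, here is one way it could look
in Python.  This is only a sketch under assumptions: one executable file per
clue context, the clue written to the program's stdin, and one expansion per
line on its stdout.  The function names and that line-based protocol are
invented for illustration:

```python
import os
import subprocess


def list_contexts(root):
    """Return the names of the executable files in the contexts directory."""
    names = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            names.append(name)
    return names


def run_context(program, clue):
    """Feed a clue to a context program and return its expansions.

    The clue goes in on stdin as one line; each line the program
    prints on stdout is treated as one candidate expansion.
    """
    result = subprocess.run([program], input=clue + "\n",
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

A new clue context would then be installed simply by dropping an executable
into that directory.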




