Re: ITagProvider


On 9/28/07, Kevin Kubasik <kevin kubasik net> wrote:
> It's important to realize that my focus is not on the implementation of
> a backend to store this information, but to provide a generic way for
> us to include 'tag' information from a variety of sources.

Often you can't separate implementation details from the list of use
cases, though.

Originally I wanted to do tags solely using extended attributes on
files.  There are some definite advantages to this, like the ability
to maintain tags when files are copied around.  But this decentralized
model does not lend itself to doing, for instance, tag clouds.
There's no way to get a list of all possible tags in a decentralized
model without walking the entire file system tree -- obviously a

So before worrying about parent-child tag relationships too much,
start off simple and refine the design as you go along.  What uses do
you anticipate for tagging on the user side?  What would a programmer
want to use tags for, and how might this API look?  How do
different (broad) implementation strategies fit into this?  No doubt
you'll have to make compromises somewhere.

> Also, if anyone knows of a smart way to take a bunch of
> internally-mapped URIs and merge them with the existing result sets,
> I'm still getting some frustration on that point.  I've figured
> out the functional role that the bit arrays serve (as in where to put
> one when I want to search an index, etc.), but once I've run the query
> to fill/populate it, I'm not really sure what I can do with a
> LuceneBitArray or BetterBitArray.  Anyway, I'm sure I'll eventually
> get it, but help would save some painful, slow debugging time.

I'm not totally sure what you're trying to do here, but I would
suggest (a) keeping the primary storage of tags totally separate from
the index and (b) dealing only with real ("external") URIs and pushing
changes out to the tag DB as needed.  That would probably require some
additional events or something to be added to the FSQ.

Or, if you're feeling particularly adventurous, rewrite the FSQ.
That's the biggest consumer of memory at this point and has problems
like being unable to search by parent directories.  I've described the
issues with it in more detail in a previous email, I believe. :)
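
Going back to (a) and (b): a minimal sketch of what I mean, again in
Python and with invented names.  TagDb, on_moved and on_deleted are
assumptions for illustration only.  The FSQ does not expose events like
these today as far as this thread goes, which is exactly the part that
would have to be added.

```python
class TagDb:
    """Hypothetical tag DB keyed by external URI, separate from the index."""

    def __init__(self):
        self._tags_by_uri = {}

    def set_tags(self, uri, tags):
        """Store the full tag set for an external URI."""
        self._tags_by_uri[uri] = set(tags)

    def get_tags(self, uri):
        """Return a copy of the tags for a URI (empty set if untagged)."""
        return set(self._tags_by_uri.get(uri, ()))

    def on_moved(self, old_uri, new_uri):
        """Assumed handler for a 'file moved' event from the file system
        backend: carry the tags over to the new URI."""
        tags = self._tags_by_uri.pop(old_uri, None)
        if tags is not None:
            self._tags_by_uri[new_uri] = tags

    def on_deleted(self, uri):
        """Assumed handler for a 'file deleted' event: drop the tags."""
        self._tags_by_uri.pop(uri, None)
```

Nothing here ever sees an internal URI, so there is nothing to merge
back into the result sets; a query would look tags up by the external
URI of each hit.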

