Re: [guppi-list] Re: categorical data in Goose



On Mon, Dec 07, 1998 at 11:06:22PM +0100, Asger Alstrup Nielsen wrote:
> Two quick questions:  
> 
> 1) Is there any good reason why the categorical type should not be
> dynamic? 

Not really.
> I.e. why should it not be allowed to change it dynamically?  Specifically, when
> I discover that I need to have Copenhagen in the list of big cities, is there
> any good reason that this should be disallowed?  Similarly, I want to be able
> to delete London.

Adding things later on should be OK, I guess.  I made locking occur
by default, but there is no reason it has to.  It could just be left
as an option, in case someone wants to be sure that their categories
aren't messed with later.

Deletions are a more subtle issue.  Say that you create some Category
objects, then you create a two-way contingency table for counts in a
categorical data set.  (I'm working on contingency table classes now;
nothing ready for check-in, though...)  Well, if you start mucking
around with your categories, you will silently put your contingency
tables into an invalid state, as they may well contain some data
relating to "London".  This is why I made categories "persistent":
once defined, they are always floating around in the
background, safe from being tampered with, deleted, etc.
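
To make the deletion hazard concrete, here is a minimal sketch in Python.
The Category and ContingencyTable classes below are hypothetical stand-ins
invented for illustration; they are not the Goose classes, whose interface
is not shown in this message.

```python
# Hypothetical stand-ins for illustration only; not the Goose API.
class Category:
    def __init__(self, name, labels):
        self.name = name
        self.labels = list(labels)

    def delete(self, label):
        # Nothing stops this from happening after a table has been built.
        self.labels.remove(label)


class ContingencyTable:
    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        # One count cell per (row label, column label) pair.
        self.counts = {(r, c): 0 for r in rows.labels for c in cols.labels}

    def add(self, r, c):
        self.counts[(r, c)] += 1

    def orphaned_cells(self):
        # Cells whose labels no longer exist in the categories.
        return sorted(
            key for key in self.counts
            if key[0] not in self.rows.labels or key[1] not in self.cols.labels
        )


cities = Category("city", ["London", "Paris"])
answers = Category("answer", ["yes", "no"])
table = ContingencyTable(cities, answers)
table.add("London", "yes")

cities.delete("London")          # the table is not told about this
orphans = table.orphaned_cells() # cells that now refer to a missing label
```

After the delete, the table still holds a count for "London" even though
the category no longer knows that label, and nothing signalled the problem.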

On second thought, the same problem applies to allowing additions.  My
contingency table of "Drug" vs "Side Effects" goes from being an r x c
table to being a (r+1) x c table when I change the category (after
creating the table).  How will my table know to update itself?  How
should it initialize the new row?  This is full of pitfalls...
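
The shape mismatch is easy to show with plain lists.  This is only a
sketch: the label values are made up, and the nested-list table is an
assumption for illustration, not how Goose stores counts.

```python
# Made-up labels; the table layout is an assumption for illustration.
drug = ["placebo", "aspirin"]
side_effects = ["none", "nausea", "headache"]

# Build an r x c table of zero counts from the categories as they are now.
counts = [[0] * len(side_effects) for _ in drug]

# Later, someone adds a label to the row category.
drug.append("ibuprofen")

shape_expected = (len(drug), len(side_effects))   # what the categories imply
shape_actual = (len(counts), len(counts[0]))      # what the table still holds
```

The categories now imply an (r+1) x c table, but the table still has r
rows, and there is no obvious rule for what the missing row should contain.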

I really think that the best and safest policy might be to allow
auxiliary data structures to be created only from "locked" categories.
And if this is the policy, you might as well lock them at cache-time,
or everyone will always be forgetting to lock their categories.  It
will be bad enough remembering to call complete_category()...
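
A rough sketch of that policy, again in Python with hypothetical names
(lock(), locked and the exception types are assumptions, not the Goose
interface): a category accepts additions until it is locked, and a table
refuses to be built from anything unlocked.

```python
# Hypothetical sketch of a "lock before use" policy; not the Goose API.
class Category:
    def __init__(self, labels):
        self._labels = list(labels)
        self._locked = False

    @property
    def locked(self):
        return self._locked

    @property
    def labels(self):
        # Return a tuple so callers cannot modify the labels behind our back.
        return tuple(self._labels)

    def lock(self):
        self._locked = True

    def add(self, label):
        if self._locked:
            raise RuntimeError("category is locked")
        self._labels.append(label)


class ContingencyTable:
    def __init__(self, rows, cols):
        # Refuse to build auxiliary structures from unlocked categories.
        if not (rows.locked and cols.locked):
            raise ValueError("categories must be locked before building a table")
        self.counts = [[0] * len(cols.labels) for _ in rows.labels]


drug = Category(["placebo", "aspirin"])
side_effects = Category(["none", "nausea"])

drug.add("ibuprofen")    # allowed: the category is not locked yet
drug.lock()
side_effects.lock()

table = ContingencyTable(drug, side_effects)   # 3 x 2 table of zero counts
```

With this arrangement the table's shape cannot drift, because any later
call to add() on a locked category raises instead of silently succeeding.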

> 2) Would there be any foreseeable problem with deriving it from the DataType
> abstract base?

Probably not.  I'd have to look at DataType again and think about
whether there would be any problems.

> Regarding the GooseSet:  Last week, I did some work on this myself, but I
> haven't committed it to the cvs yet.  I'll try to integrate my setup with yours
> when I find the time.

Shouldn't be too hard, given how minimalistic mine is...


