Schemas, missing data



One problem we've been hitting a fair bit with bigboard is getting
AttributeError when some Python code tries to access a property
of a resource that it expects to be there, but isn't.

The most common case is when we are offline and the data wasn't
previously cached, but I have the feeling that there may be other
cases: we do a fetch to get extra properties for a resource, then
some code path causes us to try to read those attributes before
the fetch returns.

This mail thinks this through a bit ... I'm not planning to dive into
coding immediately, but it influences stuff I'm currently working
on.

To solve this, we basically need some expectation on the client side of
what resources look like. In other words, schema files. I think
something simple like:

<m:schema xmlns:m="http://mugshot.org/p/system" m:serial="1193060147">
    <class xmlns="http://mugshot.org/p/o/user">
        <name m:type="+s" default="&lt;Unknown&gt;"/>
        <homeUrl m:type="+u?"/>
        <photoUrl m:type="+u?"/>
        <lovedAccounts m:type="r*"/>
        <contacts m:type="r*"/>
        <contactStatus m:type="i" default="0"/>
        [...]
    </class>
</m:schema>

Will work. It could later be extended to have documentation. We can
easily make the server dump a schema file for the registered DMO types.
(The default values aren't there on the server but could be added.)
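
As a rough sketch of what reading such a file on the client could look
like, here is a small loader built on Python's standard ElementTree
module. The names load_schema and parse_type are made up for
illustration, and the reading of the type strings (a leading "+" as a
flag, one letter for the base type, a trailing "?" for optional or "*"
for list-valued) is an assumption, not a settled format.

```python
import xml.etree.ElementTree as ET

SYSTEM_NS = "http://mugshot.org/p/system"

# Trimmed copy of the example schema, written as namespace-well-formed XML.
SCHEMA_XML = """\
<m:schema xmlns:m="http://mugshot.org/p/system" m:serial="1193060147">
    <class xmlns="http://mugshot.org/p/o/user">
        <name m:type="+s" default="&lt;Unknown&gt;"/>
        <homeUrl m:type="+u?"/>
        <lovedAccounts m:type="r*"/>
        <contactStatus m:type="i" default="0"/>
    </class>
</m:schema>
"""


def parse_type(type_string):
    """Split a type string such as '+u?' into parts (assumed meanings)."""
    flagged = type_string.startswith("+")
    body = type_string.lstrip("+")
    cardinality = "1"  # exactly one value unless a suffix says otherwise
    if body and body[-1] in "?*":
        cardinality = body[-1]
        body = body[:-1]
    return {"flagged": flagged, "base": body, "cardinality": cardinality}


def load_schema(xml_text):
    """Return (serial, {class_id: {property_name: info}})."""
    root = ET.fromstring(xml_text)
    serial = int(root.get("{%s}serial" % SYSTEM_NS))
    classes = {}
    for class_el in root:
        # The tag looks like '{http://mugshot.org/p/o/user}class';
        # the namespace URI is used as the class id here.
        class_id = class_el.tag[1:].split("}")[0]
        props = {}
        for prop_el in class_el:
            name = prop_el.tag.split("}")[1]
            info = parse_type(prop_el.get("{%s}type" % SYSTEM_NS))
            info["default"] = prop_el.get("default")  # None if absent
            props[name] = info
        classes[class_id] = props
    return serial, classes


serial, classes = load_schema(SCHEMA_XML)
user = classes["http://mugshot.org/p/o/user"]
```

Defaults come back as strings here; converting them to the base type
(for example "0" to an integer) is left out to keep the sketch short.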

Handling of missing and non-conforming values:

 - List-valued properties default to an empty list when the property
   is missing.

 - Missing scalar-valued properties are treated the same way as
   optional properties of that type, even if the property is unknown.
   
   In Python:
     None for any missing properties
   In C:
     Default values for each non-NULL-able type: 0 for integers,
     FALSE for boolean values, etc.

   The server may specify default values in the schema for properties.
   Using default values for non-optional properties is encouraged. 
   Using default values for optional properties is discouraged.
     (Schema should be there to enhance offline behavior, not 
     change online behavior)

 - The data model bindings should reject property values that don't
   conform to the schema and treat them as missing.
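
The rules above could be sketched in Python roughly like this. The
in-memory schema layout, the read_property name, and the choice to
treat a list with any bad item as entirely missing are all assumptions
made for the example; resource-valued ("r") items are not checked at
all.

```python
# Hypothetical schema: property name -> (base type, cardinality, default).
USER_SCHEMA = {
    "name": ("s", "1", "<Unknown>"),
    "homeUrl": ("u", "?", None),
    "contacts": ("r", "*", None),
    "contactStatus": ("i", "1", 0),
}

# Python types accepted for each base type; 'r' is deliberately absent.
_PY_TYPES = {"s": str, "u": str, "i": int}

_MISSING = object()


def _conforms(value, base):
    expected = _PY_TYPES.get(base)
    if expected is None:
        return True  # no check defined for this base type
    if expected is int and isinstance(value, bool):
        return False  # bool is a subclass of int; don't accept it as 'i'
    return isinstance(value, expected)


def read_property(schema, data, name):
    """Read one property, applying the missing-value rules."""
    if name not in schema:
        # Not in the schema: hand back whatever we have, or None.
        return data.get(name)
    base, cardinality, default = schema[name]
    value = data.get(name, _MISSING)
    if cardinality == "*":
        # Missing or non-conforming lists read as an empty list.
        if isinstance(value, list) and all(_conforms(v, base) for v in value):
            return value
        return []
    if value is not _MISSING and value is not None and _conforms(value, base):
        return value
    # Missing or non-conforming scalar: schema default, else None.
    return default


offline_user = {"contactStatus": "three"}  # wrong type, nothing else cached
status = read_property(USER_SCHEMA, offline_user, "contactStatus")
```

With nothing cached, "name" reads as the schema default, "homeUrl" as
None, and "contacts" as an empty list, so display code can run without
guarding every attribute access.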

Schema management:

 - Schemas are application expectations about the behavior of the 
   server, so should be shipped with the application, not the 
   data model engine. Also, for this reason, they should be 
   implemented in the binding libraries, not in the data model
   engine.

 - Adding a schema:
 
    model.add_schema(filename);
    model.add_schema("http://mugshot.org/p/schema");
    model.add_schema("http://mugshot.org/p/schema", 1193060147);

   The first one adds a schema from a specific file. The second one
   adds a schema from a URL, possibly with caching, which is useful for
   development purposes, but shouldn't be used in shipping apps.

   The third one is like the second, but specifies a minimum serial 
   number. This might allow using this form together with a system 
   installed schema file instead of shipping an application-specific
   copy.

 - If multiple entities (think app and libraries) add schemas for the
   same class, the one with the higher serial (determined by the serial
   of the schema file) wins.
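
The "higher serial wins" rule for a class could look roughly like the
following. SchemaRegistry and its method names are hypothetical, it
takes already-parsed schemas rather than filenames or URLs, and keeping
the first schema on equal serials is an assumption, since only the
"higher" case is decided above.

```python
class SchemaRegistry:
    """Keeps, per class, the property table from the highest-serial schema."""

    def __init__(self):
        self._by_class = {}  # class id -> (serial, properties)

    def add_parsed_schema(self, serial, classes):
        """classes maps class id -> properties for one schema file."""
        for class_id, props in classes.items():
            current = self._by_class.get(class_id)
            # Replace only on a strictly higher serial (ties keep the first).
            if current is None or serial > current[0]:
                self._by_class[class_id] = (serial, props)

    def lookup(self, class_id):
        entry = self._by_class.get(class_id)
        return None if entry is None else entry[1]


USER = "http://mugshot.org/p/o/user"

registry = SchemaRegistry()
# A library ships an older schema, the app a newer one for the same class.
registry.add_parsed_schema(1193060147, {USER: {"name": "+s"}})
registry.add_parsed_schema(1193060200, {USER: {"name": "+s", "homeUrl": "+u?"}})
# Adding an older schema afterwards does not override the newer one.
registry.add_parsed_schema(1193000000, {USER: {}})

winner = registry.lookup(USER)
```

The order in which the app and its libraries register schemas doesn't
matter here, which seems like the property we want.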

- Owen


P.S.

 - A strict approach would be to say that missing non-optional
   attributes make the query that retrieves them return an error,
   but returning a best effort "here's everything we know" when 
   offline is better than giving the user a blank display because one
   minor property of one of the items in a big long list is not
   known.

 - I was originally thinking that we should try to do something
   fancy:

    fetch(<resourceId>, "contacts [name;photoUrl]")

   would say "OK, I found 10 contacts, but for 3 of them we have no
   value for name, and it's a mandatory property, so we'll filter
   those contacts out and just return the other 7." However, that
   doesn't really work, since the Resource object returned to the app
   when the fetch succeeds is the *same* Resource object as that
   returned by:

    fetch(<resourceId>, "contacts [photoUrl]")

   And the same Resource object can only have one 'contacts'
   attribute, so it can't be filtered differently for each fetch.

 - Throwing an exception for optional properties is probably more
   Pythonesque than returning None. Think dictionaries and KeyError.
   But I've already written a ton of:

    try:
        name = foo.name
    except AttributeError:
        name = None
    
   It just doesn't work out comfortably.
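
For comparison, the "return None" behavior can be sketched with
__getattr__, which Python only calls when normal attribute lookup
fails. This Resource class is a stand-in written for the example, not
the real binding class, and raising AttributeError for names the schema
doesn't know at all is an assumption.

```python
class Resource:
    """Stand-in resource: schema-known but missing properties read as None."""

    def __init__(self, schema_names, values):
        self._schema_names = frozenset(schema_names)
        self._values = dict(values)

    def __getattr__(self, name):
        # Only reached when normal lookup fails. Private names are never
        # treated as properties, which also avoids recursion during setup.
        if name.startswith("_"):
            raise AttributeError(name)
        if name in self._values:
            return self._values[name]
        if name in self._schema_names:
            return None  # known to the schema, just not available right now
        raise AttributeError(name)  # most likely a typo in the calling code


foo = Resource(["name", "homeUrl"], {"name": "Owen"})
name = foo.name        # value we have
home_url = foo.homeUrl  # known property, no value: None instead of an exception
```

The try/except block collapses to a plain attribute read, while a
misspelled property name still fails loudly.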



