Re: consistency in clue types



On Sun, 2003-07-20 at 18:22, Edd Dumbill wrote:
> Evo sends only 'textblock'.  Unfortunately, it always sends HTML with
> the entities escaped.  This seems to confuse dashboard.

This should be 'htmlblock'.

> Epiphany sends 'content' and 'title', which aren't documented.

Yes, these should be htmlblock and textblock respectively.
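
As a rough illustration of that renaming, here is a minimal Python sketch.
The table and function names are hypothetical and are not taken from
dashboard's own code; only the four cluetype strings come from this thread.

```python
# Hypothetical alias table: undocumented cluetype -> documented cluetype,
# following the mapping suggested in this thread.
CLUETYPE_ALIASES = {
    "content": "htmlblock",
    "title": "textblock",
}


def normalize_cluetype(cluetype: str) -> str:
    """Return the documented cluetype for a possibly undocumented one.

    Cluetypes that have no alias are passed through unchanged.
    """
    return CLUETYPE_ALIASES.get(cluetype, cluetype)
```

A frontend could apply something like this just before sending a clue, so
that only documented cluetypes ever reach dashboard.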

> There are two things we're missing which I think we need for textblocks:
> content-type and charset encoding.  The way to get round charset is to
> mandate that everything comes in in utf-8, which doesn't seem so
> difficult.

Charset sounds important.  I'm not sure about content-type; it seems
like we could just have separate cluetypes for all the various content
types.

The issue that you've zeroed in on is that we're conflating the content
type and the content description in the cluetype.  

> Content-type is still reasonably important though, as we at minimum need to
> differentiate between text/plain and text/html.

Well, the current solution to this is to use textblock and htmlblock
cluetypes.  I'll add htmlblock to cluetypes.txt.
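
To make the idea concrete, here is a small Python sketch of how the cluetype
alone could stand in for a content type, plus one way to undo the escaped
entities mentioned earlier. The names below are hypothetical and this is not
dashboard's actual code; html.unescape is from the Python standard library.

```python
import html
from typing import Optional

# Assumed mapping: each block cluetype implies exactly one content type.
CONTENT_TYPE_BY_CLUETYPE = {
    "textblock": "text/plain",
    "htmlblock": "text/html",
}


def content_type_for(cluetype: str) -> Optional[str]:
    """Return the content type implied by a cluetype, or None if it implies none."""
    return CONTENT_TYPE_BY_CLUETYPE.get(cluetype)


def unescape_html_clue(payload: str) -> str:
    """Turn entity-escaped HTML (e.g. '&lt;b&gt;') back into real markup."""
    return html.unescape(payload)
```

With this approach, a clue that arrives with escaped entities could be
unescaped once and then sent as an htmlblock rather than a textblock.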

> I think the 'title' clue is probably quite useful: it's a much heavier
> hint to possible relevance than just a textblock, but on the other hand
> maybe that's what the relevance attribute is for (has anyone set this to
> anything other than 10 yet? will people ever?).

That is what the relevance attribute is for, yes.

I think once we get the dashboard continuously running on our desktops,
we can start tuning relevance and match quality.

Nat



