Re: Proposal for bookmarks/history database



On Thu, 2005-11-17 at 15:44 +1100, Peter Harvey wrote:
> > I recommend putting all metadata into a single table (as galeon-sqlite
> > does), since it's WAY, WAY simpler.
> 
> I'm mainly concerned about speed in that system though. Will it really
> be fast? This is where my naivety wrt databases comes in. I don't trust
> tables that involve millions of strings. :)

I wouldn't worry about it. SQL provides tons of opportunities for
optimization. My suggestion is to try putting all metadata in a single
table, and if speed becomes an issue, worry about it later.

Keep in mind that Storage did something much more complicated than this,
and it was still fast enough, before optimization.
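
To make the single-table idea concrete, here is a rough sketch using Python's sqlite3 module. The table and column names (metadata, item_id, name, value) are invented for illustration and are not the actual galeon-sqlite schema. The point is only that an index can be bolted on later without changing any caller.

```python
import sqlite3

# Hypothetical single-table layout: one row per (item, name, value) fact.
# This is NOT the real galeon-sqlite schema, just an illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (item_id INTEGER, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?, ?)",
    [
        (1, "url", "http://www.gnome.org/"),
        (1, "title", "GNOME"),
        (1, "topic", "Desktop"),
        (2, "url", "http://www.sqlite.org/"),
        (2, "title", "SQLite"),
    ],
)

# If lookups turn out to be slow, add an index afterwards.
# The queries below do not need to change.
conn.execute("CREATE INDEX idx_metadata_name_value ON metadata (name, value)")


def items_with(name, value):
    """Return the ids of all items carrying the given name/value pair."""
    rows = conn.execute(
        "SELECT item_id FROM metadata"
        " WHERE name = ? AND value = ? ORDER BY item_id",
        (name, value),
    )
    return [row[0] for row in rows]


desktop_items = items_with("topic", "Desktop")
```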

> > I just don't see what EphyDb has to offer me, the extension author. The
> > more time you spend developing an API for me, the more time I'll spend
> > figuring out the workaround.
> 
> I think that for *most* extensions, and especially for most of Epiphany,
> it'd be nice to avoid having SQLite calls and SQL statements littered
> about.

Lots of people are comfortable with SQL. The neat thing about it is that
while it will implement the whole bookmarks/history thing in a
relatively fast way, it's also flexible enough to do queries you would
not expect. That's why I started salivating when I pictured a clean DB
for Epiphany.

> I think the easiest solution to what you've described is to have ephy_db
> provide the SQLite handle to extensions that really want to do something
> more.

Yes, that's a possibility and it will certainly open up doors (and break
windows). The downside is that with all relations in a single table,
there's a barrier to entry in that SQL-comfortable developers will have
to come to grips with a non-standard design.

> > I'd guess that's because you're not comfortable with SQL; it seems like
> > a lot less work to me. But to each his own.
> 
> Hehe, no I'm not a fan of SQL. The lack of event mechanisms really
> bothers me. I've avoided every database subject at uni. Took logic
> instead.

Actually, there is an event mechanism, in the form of "triggers". The
basic idea is that when a row of a table is inserted, updated or
deleted, a block of SQL code is executed with access to that row's old
and new values.
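
A minimal sketch of a trigger, again via Python's sqlite3 module. The bookmarks and changes tables and the trigger name are made up for this example. The trigger records every title change in a second table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables, for illustration only.
    CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT, title TEXT);
    CREATE TABLE changes (bookmark_id INTEGER, old_title TEXT, new_title TEXT);

    -- Runs once for every row whose title column is updated.
    CREATE TRIGGER bookmarks_title_changed
    AFTER UPDATE OF title ON bookmarks
    FOR EACH ROW
    BEGIN
        INSERT INTO changes VALUES (OLD.id, OLD.title, NEW.title);
    END;
""")

conn.execute(
    "INSERT INTO bookmarks VALUES (1, 'http://www.gnome.org/', 'Gnome')"
)
conn.execute("UPDATE bookmarks SET title = 'GNOME' WHERE id = 1")

# The trigger has written one row describing the change.
logged = conn.execute("SELECT * FROM changes").fetchall()
```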

There's another extensibility mechanism, called "user-defined
functions". You can create an arbitrary function (in C/C++) which can be
called from SQL code.
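
For illustration, the same mechanism is exposed by Python's sqlite3 module as Connection.create_function, so a sketch does not need any C. The host_of function and the history table are hypothetical names of my own.

```python
import sqlite3
from urllib.parse import urlsplit


def host_of(url):
    """Return the host part of a URL, or None if it has none."""
    if url is None:
        return None
    return urlsplit(url).hostname


conn = sqlite3.connect(":memory:")
# Register the Python function under the SQL name host_of, taking 1 argument.
conn.create_function("host_of", 1, host_of)

conn.execute("CREATE TABLE history (url TEXT)")
conn.executemany(
    "INSERT INTO history VALUES (?)",
    [
        ("http://www.gnome.org/projects/epiphany/",),
        ("http://www.gnome.org/",),
        ("http://www.sqlite.org/lang.html",),
    ],
)

# The user-defined function can now be used like any built-in one.
hosts = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT host_of(url) FROM history ORDER BY 1"
    )
]
```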

Combine the two, and you could emit signals from SQL.
I don't think it would be particularly difficult. I don't think it would
be particularly useful, either. I mean, the way I envision it, if
extensions were to modify the topic relation table, they should do it
through the bookmarks API and not through the database directly. I would
expect putting signal emission at this low a layer would give a minor
but non-zero slowdown, and I know we want this stuff as fast as
possible.
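
Roughly what I have in mind, as a sketch only: a trigger that calls a user-defined function, which in turn would emit the signal. Here emit_changed just appends to a Python list as a stand-in for a real signal emission, and all names are invented. The trigger is TEMP because it refers to a function that exists only on this connection.

```python
import sqlite3

emitted = []  # stand-in for real signal emission


def emit_changed(bookmark_id, title):
    """Hypothetical callback: record which bookmark changed."""
    emitted.append((bookmark_id, title))


conn = sqlite3.connect(":memory:")
conn.create_function("emit_changed", 2, emit_changed)

conn.executescript("""
    CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT, title TEXT);

    -- TEMP: the trigger lives only as long as this connection, just like
    -- the emit_changed function it calls.
    CREATE TEMP TRIGGER bookmarks_emit_changed
    AFTER UPDATE ON bookmarks
    FOR EACH ROW
    BEGIN
        SELECT emit_changed(NEW.id, NEW.title);
    END;
""")

conn.execute(
    "INSERT INTO bookmarks VALUES (1, 'http://www.gnome.org/', 'Gnome')"
)
# Only the UPDATE fires the trigger; the INSERT above does not.
conn.execute("UPDATE bookmarks SET title = 'GNOME' WHERE id = 1")
```

Every updated row pays for one extra function call, which is the slowdown I mentioned.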

> > In the long run, I can see that we started off with vastly different
> > mentalities. Neither is right or wrong; I don't know how much I can help
> > you with your implementation, beyond suggesting that you shouldn't
> > really worry about extensions too much.
> 
> When it all falls to bits then I'll come begging for help. :)
> Particularly performance-wise where I have no idea.

I'll be here for ya! SQLite isn't a fantastic performer, in that it's
missing many optimizations found in more powerful databases. So writing
queries involves trying several different approaches and measuring which
one is fastest.

But even though it's not the fastest, it still implements all sorts of
optimizations which EphyNode can't. I expect it to absolutely slaughter
EphyNode in terms of speed in history. Our most important operation is
probably searching for URLs which start with the given string; put an
index on the URL field, and this will be VERY fast. (It'll be a B-tree
lookup in memory, with all data coming from a single node or adjacent
nodes. Our next operation, a search on a URL with one extra letter, will
be even faster because those nodes will be guaranteed to be in memory.)
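
A sketch of that prefix search, with an invented history table and index name. I've written it as a range comparison rather than LIKE, because SQLite only uses an index for LIKE under certain collation and case-sensitivity settings, while a plain range on an indexed column does not depend on them. The upper bound is the prefix with its last character bumped by one, which assumes a non-empty prefix.

```python
import sqlite3


def prefix_bounds(prefix):
    """Return (low, high) such that low <= url < high matches the prefix.

    Assumes a non-empty prefix whose last character can be incremented.
    """
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("CREATE INDEX idx_history_url ON history (url)")
conn.executemany(
    "INSERT INTO history (url) VALUES (?)",
    [
        ("http://www.gnome.org/",),
        ("http://www.gnome.org/projects/epiphany/",),
        ("http://www.google.com/",),
        ("http://www.sqlite.org/",),
    ],
)


def urls_starting_with(prefix, limit=10):
    """Return up to `limit` URLs beginning with `prefix`, in sorted order."""
    low, high = prefix_bounds(prefix)
    rows = conn.execute(
        "SELECT url FROM history"
        " WHERE url >= ? AND url < ? ORDER BY url LIMIT ?",
        (low, high, limit),
    )
    return [row[0] for row in rows]


matches = urls_starting_with("http://www.g")
narrower = urls_starting_with("http://www.gn")
```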

(Oh, and if you ever find a query running hundreds or thousands of times
too slowly, come to me. I've seen speedups on the order of 10000x from
simply adding an index and rewriting a query.)
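
One way to check whether a query is using an index is EXPLAIN QUERY PLAN. The sketch below, with a made-up history table, captures the plan for the same query before and after adding an index. The exact wording of the plan differs between SQLite versions, so the code only keeps the text and does not assume a particular format.

```python
import sqlite3

QUERY = "SELECT url FROM history WHERE url >= ? AND url < ?"
ARGS = ("http://www.g", "http://www.h")


def plan(conn):
    """Return SQLite's query plan for QUERY as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ARGS).fetchall()
    # The human-readable description is the last column of each row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER PRIMARY KEY, url TEXT)")

# Without an index on url, SQLite has to scan the whole table.
plan_before = plan(conn)

conn.execute("CREATE INDEX idx_history_url ON history (url)")

# With the index, the plan should mention idx_history_url.
plan_after = plan(conn)

print(plan_before)
print(plan_after)
```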

-- 
Adam Hooper <adamh densi com>
