Re: [Banshee-List] SQLITE performance

From: "gnomeuser gmail com" <gnomeuser gmail com>
To: banshee-list gnome org
Subject: Re: [Banshee-List] SQLITE performance
Date: Wed, 26 Oct 2011 15:54:19 -0200
2011/9/27 Bertrand Lorentz <bertrand lorentz gmail com>:
> Hi,
>
> A few comments in the text below.
>
> On Mon, Sep 26, 2011 at 7:34 PM, gnomeuser gmail com
> <gnomeuser gmail com> wrote:
>> 2011/9/26 olivier dufour <olivier duff gmail com>:
>>> Hello,
>>> I just want to know if some people has ever worked on SQL query to optimise
>>> long time query.
>>> I see few bugs open for that (632618, 638482 540835, ...).
>>> Most used database (Oracle, sybase, sqlServer, ...) is very scalar and we
>>> have difficulty to query the DB only with table bigger 100K or even 1M.
>>> So Why banshee query is slow with a 10k song lib?
>>> With search in log with regexp (Executed in [5-9][0-9][0-9]), I only get SQL
>>> which take more that 500ms to run.
>>> And it is always the same case.
>>> So I see few perf improvement tips, we can apply in code:
>>> 1) ORDER BY RANDOM ==> RAND() * MAX(ID)
>
> I think there a reason why we're using ORDER BY RANDOM, maybe to make
> sure we don't play the same track ?
>
>>> 2) SmartPlayListSource Refresh: delete and recreate cache. Why not just
>>> register event to only update track which have changed.
>
> I'm not sure I understand what you mean.
>
>>> 3) call ANALYSE (ServiceManager.DbConnection.OptimizeDatabase ()) more often
>
> ANALYZE just compute basic statistics for each table and store the
> data in the sqlite_stat1 table. So it will only have a real effect if
> the number of rows has changed significantly since the last time it
> was called.
> That's what the following code does :
> http://git.gnome.org/browse/banshee/tree/src/Core/Banshee.Services/Banshee.Database/BansheeDbConnection.cs#n120
>
>>> I see few tips here
>>> too: http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html
>>> Maybe we could use them...
>>> For example, replace (x GLOB 'abc*') can be replaced by (x >= 'abc' AND x <
>>> 'abd').
>>> Maybe it will improve search query.
>>> The worse use case seems to be importing and searching. so we must focus on
>>> them.
>
> I've seen several cases where a smart playlist like 'Unheard' that
> needs to be refresh on each track change created a significant CPU
> load. So any improvement there would be great.
>
> For anyone who wants to work on performance, a general piece of advice
> in 3 words :
> Measure. Measure. Measure.
> Before you make changes, while you make changes, and after you've made changes.
>
> A good tool for that is the performance testing tool created by
> Gabriel, which can be found in the test/ folder in the Banshee source
> tree.
> Which reminds me that it's been a while since I last ran it...
>
>> When we were at the Hackfest after FOSDEM Alan McGovern cooked up a 10
>> sec patch that might help a bit. It was unfortunately against an older
>> Hyena and Gabriel didn't seem convinced it was going to make a big
>> difference.
>> Regardless I just found it in my cache of old files if anyone wants to
>> poke at updating it and collecting some numbers.
>
> It looks like updating the patch would be easy, but I'm not sure I
> understand the goal of the patch: it seems to me it make all queries
> run on the main thread.
> Anyway, I'd be happy to see numbers about it, see above ;)

If I remember correctly the aim was to send of all read queries to
SQLite in parallel as that according to Alan is safe, whereas writes
have to be serialized. I could be utterly wrong though.
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]