Re: Speeding up thumbnail generation (like multi threaded). Thoughts please.
- From: Alexander Larsson <alexl redhat com>
- To: Mark <markg85 gmail com>
- Cc: gtk-devel-list gnome org
- Subject: Re: Speeding up thumbnail generation (like multi threaded). Thoughts please.
- Date: Wed, 30 Sep 2009 12:42:35 +0200
On Wed, 2009-09-30 at 12:36 +0200, Mark wrote:
> On Wed, Sep 30, 2009 at 11:46 AM, Alexander Larsson <alexl redhat com> wrote:
> > On Tue, 2009-09-29 at 22:59 +0200, Mark wrote:
> >
> >> hehe that was the idea indeed ^_^ and i will continue with that.
> >> I will test the large factors tomorrow.
> >>
> >> For now I'm happy with 100% CPU usage on all my cores (4).
> >> With the code posted in my previous message I only had 70% CPU usage,
> >> so there was a bottleneck and it wasn't the HDD nor the CPU.
> >> That's now fixed by giving each thread more than one (5 actually)
> >> images of its own before locking and refilling the queue of 5, so now
> >> there is 100% CPU usage in the multi-threaded benchmark.
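The batching idea described here (each worker takes several images per synchronization step instead of one) can be sketched roughly as follows. This is not the code from the codepad link: it swaps the mutex-protected queue for a C11 atomic counter so the sketch stays self-contained, and the names work_list, batch and claim_batch are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared work list: files are numbered 0 .. total-1. */
typedef struct {
    atomic_size_t next;  /* index of the first unclaimed file */
    size_t total;        /* number of files to thumbnail */
} work_list;

/* A contiguous range of file indices handed to one worker. */
typedef struct {
    size_t first;        /* first claimed index */
    size_t count;        /* number of claimed indices; 0 means no work left */
} batch;

static void work_list_init(work_list *wl, size_t total)
{
    atomic_init(&wl->next, 0);
    wl->total = total;
}

/* Claim up to batch_size files with a single atomic operation, so a worker
 * synchronizes once per batch rather than once per image. */
static batch claim_batch(work_list *wl, size_t batch_size)
{
    batch b = { 0, 0 };
    size_t first;

    assert(batch_size > 0);
    first = atomic_fetch_add(&wl->next, batch_size);
    if (first >= wl->total)
        return b;                      /* everything is already claimed */

    b.first = first;
    b.count = wl->total - first < batch_size ? wl->total - first : batch_size;
    return b;
}
```

Each worker thread would loop on claim_batch(&wl, 5) and process indices first .. first+count-1 until count comes back as 0.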
> >>
> >> http://codepad.org/PKnp69qW
> >
> > Ok, i tested this a bit, and my results are not the same as yours.
> >
> > I tested on a directory with 1348 jpeg files, each around 5 megapixels,
> > totalling 3.1 gig of data.
> > Before each test I ran (as root):
> > sync; echo 3 > /proc/sys/vm/drop_caches
> > This flushes the caches, for two reasons: make the tests comparable
> > (i.e. same cache status), and to make the test realistic (nobody
> > thumbnails 3 gig of files that are all in the cache).
> >
> > Your test is scaling to size 200, which is not the thumbnail size (128),
> > but let's ignore that for now.
> >
> > // GLib Thumbnailing Benchmark
> >
> > There is a bug in the benchmark, where it saves the original pixbuf
> > rather than the thumbnailed one, making this very very slow. When I
> > fixed this I got this timing:
> >
> > real 3m40.876s
> > user 3m19.667s
> > sys 0m2.542s
> >
> > Same test, but using gnome_desktop_thumbnail_scale_down_pixbuf():
> >
> > real 3m34.784s
> > user 3m13.926s
> > sys 0m2.479s
> >
> > So, for me gnome_desktop_thumbnail_scale_down_pixbuf() is ~3% faster
> > (which makes some sense, as it's using a simpler algorithm). Did you
> > compile your benchmark app with full optimization? (Since you have an
> > in-line copy of the scale_down_pixbuf function, this is required.)
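For anyone wanting to redo the percentage from the `time` output, here is a small sketch. The helper names time_to_seconds and percent_saved are invented for this example and only handle the "NmS.SSSs" form shown in this mail.

```c
#include <assert.h>
#include <stdio.h>

/* Convert a `time`-style value such as "3m40.876s" to seconds.
 * Returns a negative value if the string does not match that form. */
static double time_to_seconds(const char *s)
{
    int minutes = 0;
    double seconds = 0.0;

    assert(s != NULL);
    if (sscanf(s, "%dm%lfs", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Percentage of wall-clock time saved by `candidate` relative to `baseline`. */
static double percent_saved(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return 100.0 * (baseline - candidate) / baseline;
}
```

Feeding it the two "real" values above, percent_saved(time_to_seconds("3m40.876s"), time_to_seconds("3m34.784s")) comes out at about 2.8, which is where the ~3% comes from.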
> >
> > (The rest of the tests are all run with gdk_pixbuf_scale_simple for easy
> > comparison.)
> >
> > // Glib more rapid thumbnailing benchmark
> >
> > real 1m56.650s
> > user 1m24.030s
> > sys 0m2.622s
> >
> > Here we can see that the jpeg loading trick really helps us.
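The "jpeg loading trick" is not spelled out in this mail. Assuming it means libjpeg's ability to decode directly at 1/2, 1/4 or 1/8 of the original size, the choice of reduction could look roughly like this. choose_jpeg_denom is a hypothetical helper, not a gdk-pixbuf or libjpeg function, and the rule it applies (keep the longer side at least as large as the target) is an assumption.

```c
#include <assert.h>

/* Pick the largest decode-time reduction (1, 2, 4 or 8) that still leaves
 * the longer side at least `target` pixels, so that the final scaling step
 * only ever shrinks the image. Sizes are rounded up when divided. */
static int choose_jpeg_denom(int width, int height, int target)
{
    int longer, denom = 1;

    assert(width > 0 && height > 0 && target > 0);
    longer = width > height ? width : height;

    while (denom < 8) {
        int next = denom * 2;
        int scaled = (longer + next - 1) / next;  /* rounded-up division */
        if (scaled < target)
            break;                                /* next step would be too small */
        denom = next;
    }
    return denom;
}
```

For a 2592x1944 image (about 5 megapixels) and a 128 pixel thumbnail this picks 8, so the decoder would only have to produce a 324x243 image before the final scale.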
> >
> > //Glib threaded thumbnailing
> >
> > My machine has 2 cores, not 4 as yours.
> >
> > With the default 4 threads:
> >
> > real 2m2.194s
> > user 1m25.437s
> > sys 0m2.982s
> >
> > Changed to use two threads:
> >
> > real 1m53.783s
> > user 1m25.948s
> > sys 0m2.966s
> >
> > If we use the same number of threads as cpus we go slightly faster
> > (roughly 2.5% less wall-clock time than the single-threaded run).
> > However, if we use more, things are actually slower.
> >
> > I've got 4 gigs of memory, so not everything will fit in the cache, but
> > the caches would probably help a bit. To verify this I ran the same
> > two-thread example without blowing the caches first:
> >
> > real 1m36.681s
> > user 1m21.610s
> > sys 0m2.501s
> >
> > So, slightly better, and we can see that the real time is getting nearer
> > to the user time, which means that less time was spent waiting on disk.
> > However, i'm not sure how interesting a cached benchmark is. Nobody will
> > thumbnail the same files twice.
> >
> > Now, what does this mean for Nautilus....
> >
> > Well, nautilus loads the files using gdk-pixbuf io-based resizing, which
> > is essentially what "Glib more rapid" does, i.e. it uses the jpeg
> > loading trick and scales using gdk_pixbuf_scale_simple. It calls
> > gnome_desktop_thumbnail_scale_down_pixbuf() only when an external
> > thumbnailer returns an oversize result (i.e. very seldom).
> >
> > Given the above result this is not ideal. The ideal would be to use the
> > jpeg loader trick but then downscale with
> > gnome_desktop_thumbnail_scale_down_pixbuf(), although that is hard to
> > implement given the pixbuf APIs.
> >
> > Nautilus uses only one thread for thumbnailing, and upping this to the
> > number of cpus of the machine could gain us a slight advantage, at the
> > risk of starving the rest of nautilus by the increase in i/o traffic.
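A minimal sketch of picking the thread count from the number of cpus, with a cap so the thumbnailer cannot flood the disk with requests. thumbnail_thread_count and its cap parameter are hypothetical and not Nautilus code. sysconf(_SC_NPROCESSORS_ONLN) is available on Linux and most Unix systems but is not guaranteed everywhere.

```c
#include <assert.h>
#include <unistd.h>

/* Number of thumbnailing threads to start: one per online cpu, but never
 * more than `cap`, and at least one if the cpu count is unavailable. */
static long thumbnail_thread_count(long cap)
{
    long n;

    assert(cap >= 1);
    n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1)
        n = 1;              /* sysconf failed or reported nothing useful */
    return n > cap ? cap : n;
}
```

With a cap of 2 this would give 2 threads on both the dual-core and the quad-core machines discussed here.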
> >
> >
> >
> >
>
> Hi Alex,
>
> Nice benchmarking. I will fix that little issue in "// GLib Thumbnailing
> Benchmark" to save the thumbnail pixbuf.
> Then I hope gnome_desktop_thumbnail_scale_down_pixbuf is really
> faster, as it claims. And if that's the case perhaps it's time to
> make a gdk_pixbuf_new_from_file_at_scale that uses
> gnome_desktop_thumbnail_scale_down_pixbuf.
It's faster because it uses a simpler algorithm, so it's not generally
useful. However, it might be nice to allow it optionally, e.g. by adding a
new filter type and letting you pass a filter in to the scaling loader.
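To give an idea of what a "simpler algorithm" can look like, here is a plain box-average downscale on an 8-bit grayscale buffer. This is only an illustration under that assumption: it is not the libgnomedesktop code, and box_downscale is a made-up name. It handles integer factors only and ignores any leftover rows or columns.

```c
#include <assert.h>
#include <stddef.h>

/* Downscale an 8-bit grayscale image by an integer factor, replacing each
 * factor x factor block of source pixels with its average.
 * dst must hold (src_w / factor) * (src_h / factor) bytes. */
static void box_downscale(const unsigned char *src, size_t src_w, size_t src_h,
                          size_t factor, unsigned char *dst)
{
    size_t dst_w, dst_h, x, y, dx, dy;

    assert(src != NULL && dst != NULL && factor > 0);
    dst_w = src_w / factor;
    dst_h = src_h / factor;

    for (y = 0; y < dst_h; y++) {
        for (x = 0; x < dst_w; x++) {
            unsigned long sum = 0;
            /* Sum every source pixel covered by this destination pixel. */
            for (dy = 0; dy < factor; dy++)
                for (dx = 0; dx < factor; dx++)
                    sum += src[(y * factor + dy) * src_w + x * factor + dx];
            dst[y * dst_w + x] = (unsigned char)(sum / (factor * factor));
        }
    }
}
```

There is no filter kernel or interpolation weight to evaluate, which is why this kind of scaler is cheap, and also why it is less general than the gdk-pixbuf filters.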
> Note that the scaling performance is just a minor part of the complete
> picture. If the loading/saving of files could be made faster in any
> way, that would give the biggest speed boost.
Yes indeed. File load speed is very important here, but there really
isn't that much you can do on the app side.
> As for "full optimization": to make it worse, all my benchmarks were
> run in debug compile mode. That would not be representative if I did it
> for one test, but I did it for all tests, so that probably makes it fair
> again..?
Only if you compiled your gdk-pixbuf library with debug mode, and even
then it's not certain, because it has assembler versions of the scaler, I
think.
It would be fair if you used the optimized version of
gnome_desktop_thumbnail_scale_down_pixbuf() from libgnomedesktop though.