Re: Speeding up thumbnail generation (like multi threaded). Thoughts please.

On Fri, Aug 28, 2009 at 11:49 PM, Christian Hergert <chris dronelabs com> wrote:
Hi,

What you mentioned is good information to start hunting.  Was the CPU time
related to IO wait at all?  Always get accurate numbers before performance
tuning.  "Measure, measure, measure" or so the mantra goes.

Perhaps a stupid question, but what is a good way of profiling IO? CPU
is easy, but I've never profiled IO.
In this case my HDD is certainly able to do more than 10 thumbnails
per second, but I could see a potential issue when someone with a
slower HDD and a faster CPU than mine is thumbnailing a lot of images.
There the HDD will likely be the bottleneck.

You can do something really crude by reading from /proc/pid/* (see man proc for more info), or you could try tools like sysstat, oprofile, SystemTap, etc. We really need a generic profiling tool that can do all of this from a single interface. At the moment, though, I've been most successful with writing one-off graphing for the specific problem. For example, put in some g_print() lines, grep for them, and then graph the results using your favorite plotter or some cairo goodness.
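To give a feel for the kind of crude one-off instrumentation I mean, a per-item timer could look something like this (just a sketch; it uses gdk-pixbuf directly and a made-up 128x128 size rather than the actual nautilus code path):

#include <glib.h>
#include <gdk-pixbuf/gdk-pixbuf.h>

static void
time_one_thumbnail (const char *path)
{
  GTimer *timer = g_timer_new ();
  GError *error = NULL;
  GdkPixbuf *src, *thumb;
  gdouble t_load, t_scale;

  /* Load phase: file read plus decode. */
  src = gdk_pixbuf_new_from_file (path, &error);
  t_load = g_timer_elapsed (timer, NULL);
  if (src == NULL)
    {
      g_printerr ("failed to load %s: %s\n", path, error->message);
      g_error_free (error);
      g_timer_destroy (timer);
      return;
    }

  /* Scale phase: pure CPU. */
  g_timer_reset (timer);
  thumb = gdk_pixbuf_scale_simple (src, 128, 128, GDK_INTERP_BILINEAR);
  t_scale = g_timer_elapsed (timer, NULL);

  /* One greppable line per item. */
  g_print ("THUMB: %s load=%.4f scale=%.4f\n", path, t_load, t_scale);

  g_object_unref (thumb);
  g_object_unref (src);
  g_timer_destroy (timer);
}

Then grep the "THUMB:" lines out of the log and plot the two columns; that shows per item whether the load or the scale dominates.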

Unfortunately, the symptom you see regarding IO will very likely change
under a different processing model.  If the problem is truly CPU bound, then
you will only be starting IO requests after you are done processing.  This
means valuable time is wasted waiting for the pages to be loaded into
the buffers.  The code will just be blocking while this is going on.

And how can I test that?

ltrace works for simple non-threaded applications. Basically, you should see in the profiling timings that one work item happens sequentially after the previous one, such as (load, process, load, process, ...).

I would hate to provide conjecture about the proper design until we have more measurements. It is a good idea to optimize the single-threaded approach before the multi-core approach, since that work would have to be done anyway and it is a less complex problem to tackle before additional threads come into play.

What could be done easily is, every time an item starts processing, it could
asynchronously begin loading the next image using gio.  This means the
kernel can start paging that file into the vfs cache while you are
processing the image.  This of course would still mean you are limited to a
single processor doing the scaling.  But if the problem is in fact CPU
bound, that next image will almost always be loaded by the time you finish the
scale, meaning you've maximized the processing potential per core.
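Roughly the shape I mean, as an untested sketch (queue_for_scaling() and scale_image() are made-up placeholders for however the thumbnailer hands work around, and this assumes a main loop is running, as it is inside nautilus):

#include <gio/gio.h>

/* Called from the main loop once the next file's contents are available.
 * The point is that the read was kicked off earlier, so the kernel could
 * start paging the file in while we were busy scaling. */
static void
next_image_loaded (GObject *source, GAsyncResult *res, gpointer user_data)
{
  char *contents = NULL;
  gsize length = 0;
  GError *error = NULL;

  if (g_file_load_contents_finish (G_FILE (source), res,
                                   &contents, &length, NULL, &error))
    {
      queue_for_scaling (contents, length);   /* placeholder, not a real API */
    }
  else
    {
      g_printerr ("prefetch failed: %s\n", error->message);
      g_error_free (error);
    }
  g_object_unref (source);
}

static void
process_current_and_prefetch_next (const char *current_path,
                                   const char *next_path)
{
  if (next_path != NULL)
    {
      /* Start reading the next image before we begin burning CPU. */
      GFile *next = g_file_new_for_path (next_path);
      g_file_load_contents_async (next, NULL, next_image_loaded, NULL);
    }

  scale_image (current_path);   /* placeholder for the CPU-bound scaling step */
}

The scaling itself is still synchronous and single-core here; the only change is that the read for the next item is already in flight.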

That sounds like a nice way to optimize it for one core. But could
there be any optimization possible in my case, since I have 100% CPU
usage on one core with just the benchmark?

You can't properly optimize for the multi-core scenario until the single-core scenario is fixed.

To support multi-core, as it sounds like you want, a queue could be used
to store the upcoming work items.  One worker per core, for example, would get
its next file from that queue.  FWIW, I wrote a library, iris[1], built
specifically for doing work like this while efficiently using threads with
minimum lock contention.  It would allow for scaling up the threads to the
number of cores and back down when they are no longer needed.
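Just to sketch the queue-plus-workers shape (this uses plain GThreadPool rather than iris, assumes the load/scale code is safe to call from multiple threads, and on older GLib you would need g_thread_init() first):

#include <glib.h>
#include <gdk-pixbuf/gdk-pixbuf.h>
#include <unistd.h>

/* One worker per core pulls a file path off the pool's internal queue,
 * loads it, and scales it. */
static void
thumbnail_worker (gpointer data, gpointer user_data)
{
  char *path = data;
  GdkPixbuf *src = gdk_pixbuf_new_from_file (path, NULL);

  if (src != NULL)
    {
      GdkPixbuf *thumb = gdk_pixbuf_scale_simple (src, 128, 128,
                                                  GDK_INTERP_BILINEAR);
      /* ... write thumb out ... */
      g_object_unref (thumb);
      g_object_unref (src);
    }
  g_free (path);
}

static GThreadPool *
make_thumbnail_pool (void)
{
  /* glibc-specific way to count online cores. */
  long n_cores = sysconf (_SC_NPROCESSORS_ONLN);

  return g_thread_pool_new (thumbnail_worker, NULL,
                            n_cores > 0 ? (gint) n_cores : 2,
                            FALSE, NULL);
}

The producer side then just does g_thread_pool_push (pool, g_strdup (path), NULL) for each file. iris would additionally scale the number of threads up and down for you instead of keeping a fixed pool.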

That sounds very interesting.
Just one question about the queue. Would it be better to thread the
application (nautilus) or the library (glib)? If your answer is the
library, then the queue has to be passed from nautilus to glib. I would
say glib, because all applications would benefit from it without
adjusting their code.

I haven't looked at this code in detail yet, so I cannot confirm or deny. My initial assumption would be that the thumbnailing API (again, I have no experience with it yet) should be restructured around an asynchronous design (begin/end methods), with the synchronous implementation built on top of that. And of course, nobody should use the synchronous version unless they *really* have a reason to.
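Just to illustrate the kind of shape I mean (these declarations are hypothetical, not an existing API): if the thumbnailing entry points followed the usual GIO async/finish pattern, callers would see something like this, and the synchronous variant would simply be built on top of the pair.

#include <gio/gio.h>
#include <gdk-pixbuf/gdk-pixbuf.h>

/* Hypothetical declarations only; the names are made up. */
void        thumbnailer_generate_async  (GFile               *file,
                                         int                  size,
                                         GCancellable        *cancellable,
                                         GAsyncReadyCallback  callback,
                                         gpointer             user_data);

GdkPixbuf * thumbnailer_generate_finish (GFile               *file,
                                         GAsyncResult        *result,
                                         GError             **error);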

FWIW, I would be willing to help hack on this, but I'm swamped for at least the next few weeks.

-- Christian

