Reviving emblems (strategy)



So, I said last weekend that I'd take a look at adding emblems
functionality back to gnome-bugzilla. It's pretty central to my work to
be able to take a look at the bug-list for gnome-shell and see what bugs
are sitting there with a patch that needs review.

(Olav pointed out that it *is* possible to get a list of bugs with a
particular attachment status out of query.cgi if you know what you are
doing; but emblems are nice in any case...)

The implementation of Emblems that Olav and Thomas Thurman came up with
earlier is:

http://git.gnome.org/cgit/bugzilla-newer/commit/?id=0503780a3da4e31580872dce9b9f6edc7222201d

It adds a new 'emblems' column to the bugs table that holds one
character per emblem: 'P' or 'L' or 'PL'.

This approach is highly efficient but has a couple of downsides:

 * Because the emblems column has to be recomputed when an attachment
   is modified or when the keywords are changed, it's somewhat intrusive
   and hard to do as a Bugzilla extension.

 * Adding a field to an existing table in a Bugzilla extension
   definitely looks awkward.

 * Changes to the Bugzilla configuration that change which emblems
   should be applied to each bug:

   - Installing an "emblems" extension on a server for the first time
   - Adding emblems for new keywords
   - Changing what patch status should count as an "open" patch

   require recomputing the emblems for the entire database; this is
   going to take "a while" (no idea how long, say half an hour).

Removing the need to denormalize for keyword emblems is pretty easy -
because it's already denormalized! Keywords are represented twice, as:

 - The keywords column of bugs ("," separated list)
 - The keywords table

So, we can just grab the keywords column for all the bugs we are listing
in buglist.cgi and do the conversion of keyword to emblem in buglist.cgi
template code.
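
As a rough illustration of that conversion (in Python rather than the
template code buglist.cgi actually uses), here is a minimal sketch. The
mapping table and the function name are hypothetical; in particular, the
assumption that 'L' corresponds to a "gnome-love" keyword is mine, not
something taken from the commit above.

```python
# Hypothetical keyword -> emblem mapping. The real keyword names and
# emblem letters would come from the Bugzilla configuration.
KEYWORD_EMBLEMS = {"gnome-love": "L"}


def emblems_for_keywords(keywords_column):
    """Turn the ","-separated keywords column of one bug into emblems."""
    # Split the denormalized column and drop surrounding whitespace
    # and empty entries (an empty column yields no keywords).
    keywords = {k.strip() for k in keywords_column.split(",") if k.strip()}
    # Emit emblems in a stable order so the output is deterministic.
    return "".join(
        emblem
        for keyword, emblem in sorted(KEYWORD_EMBLEMS.items())
        if keyword in keywords
    )
```

Since this only looks at a column we are already fetching for the bug
list, it needs no extra query and no stored emblems column.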

But what about the patch emblem? If we don't have it denormalized -
either with the emblems column or with a special has_patches column,
we're going to need to add a subselect to every buglist.cgi query.

So, the next question is, how long does this take? It's a little hard to
get good numbers on the question, but to get a rough idea, I ran the
following query:
 
SELECT
       bug_id, short_desc, keywords,
       EXISTS (SELECT *
                 FROM attachments
                WHERE attachments.bug_id = bugs.bug_id
                  AND attachments.ispatch = 1
                  AND attachments.status IN ('none', 'accepted-commit-now', 'reviewed')
              ) AS has_patches
  FROM bugs
 WHERE product_id = :product_id AND
       bug_status IN ('UNCONFIRMED', 'NEW', 'ASSIGNED', 'REOPENED')

I ran it without and with the has_patches clause against a couple of products
and timed the difference:

 evolution (4214 bugs): 0.30 => 0.34 seconds
 gtk+ (2952 bugs): 0.07 => 0.10 seconds
 gnome-shell (131 bugs): 0.00 => 0.00 seconds

Timing was done by setting 'pager cat > /dev/null' and running the queries in 
the mysql console on bugzilla.
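
For anyone who wants to poke at the shape of the query without access to
the server, here is a small sketch that runs the same EXISTS subselect
against an in-memory SQLite database from Python. Everything about the
setup is an assumption for illustration: SQLite stands in for MySQL, the
two tables are cut down to the columns the query touches, the rows and
the 'rejected' status are made up, and the function names are
hypothetical. It says nothing about timings on the real database.

```python
import sqlite3

# Patch statuses counted as "open", as in the query above.
OPEN_PATCH_STATUSES = ("none", "accepted-commit-now", "reviewed")


def make_toy_db():
    """Build a tiny in-memory stand-in for the bugs/attachments tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE bugs (bug_id INTEGER PRIMARY KEY, product_id INTEGER,
                           bug_status TEXT, short_desc TEXT, keywords TEXT);
        CREATE TABLE attachments (attach_id INTEGER PRIMARY KEY,
                                  bug_id INTEGER, ispatch INTEGER,
                                  status TEXT);
        CREATE INDEX attachments_bug_id_idx ON attachments (bug_id);
        INSERT INTO bugs VALUES
            (1, 1, 'NEW',      'open patch',     ''),
            (2, 1, 'NEW',      'rejected patch', ''),
            (3, 1, 'RESOLVED', 'closed bug',     ''),
            (4, 2, 'NEW',      'other product',  ''),
            (5, 1, 'ASSIGNED', 'non-patch file', '');
        INSERT INTO attachments VALUES
            (10, 1, 1, 'none'),
            (11, 2, 1, 'rejected'),
            (12, 3, 1, 'none'),
            (13, 4, 1, 'none'),
            (14, 5, 0, 'none');
    """)
    return conn


def list_bugs_with_patch_flag(conn, product_id):
    """List open bugs of one product with a has_patches flag per row."""
    query = """
        SELECT bug_id, short_desc, keywords,
               EXISTS (SELECT 1
                         FROM attachments
                        WHERE attachments.bug_id = bugs.bug_id
                          AND attachments.ispatch = 1
                          AND attachments.status IN (?, ?, ?)
                      ) AS has_patches
          FROM bugs
         WHERE product_id = ?
           AND bug_status IN ('UNCONFIRMED', 'NEW', 'ASSIGNED', 'REOPENED')
         ORDER BY bug_id
    """
    # Placeholders are bound in order: the three statuses, then the product.
    params = (*OPEN_PATCH_STATUSES, product_id)
    return conn.execute(query, params).fetchall()
```

In this toy data only bug 1 gets has_patches set: bug 2's patch has a
status outside the "open" list, and bug 5's attachment is not a patch.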

So, we're adding 30-40 milliseconds of database work to really big queries, and
a few milliseconds or less to smaller queries. For comparison, a similar GTK+
query takes about 20 seconds to load in my browser if I run it against the
Bugzilla web interface (which adds in the rest of the database work it does,
the page generation, the web traffic, and the rendering time in the browser).

My basic feeling is that the 30-40 milliseconds is pretty much noise, and the
overhead is worth it to get a useful feature back without a lot of complexity.

But it's sort of a question of philosophy - if we consider it critical for
buglist.cgi to be as fast as possible, and we've pulled out all the stops
already, then even a few milliseconds of extra time might be too much.

Feedback appreciated.

- Owen



