Re: GNOME | Request for Bugzilla data



2013/7/25 Andre Klapper <ak-47 gmx net>

Mozilla have recommendations for researchers at
https://bugzilla.mozilla.org/page.cgi?id=researchers.html
and offer a sanitized MySQL dump (without attachments and secret
tickets) at http://people.mozilla.com/~mhoye/bugzilla/ .
Would it be worth it if I asked Mozilla for the steps to create such a
dump?

Sure, we can ask them for a way to build a dump without secret tickets or private data. Will you go ahead and do that?
 
For the time being, while researchers crawl GNOME Bugzilla and we
don't have a dump:
What would be acceptable latency values to *not* get IP addresses
blocked, and UTC times of the day where there's less traffic anyway?
(Actually I'm asking this on behalf of a university professor.)

Currently, any IP address that exceeds 1500 hits per hour gets banned (a hit means an entry in the relevant Apache log in the format "IP date GET PATH"). We had a few cases of people not caching the static HTML / CSS files, which resulted in a ban after a few minutes because their browser requested the same static files again on every request.
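
To make the arithmetic concrete: 1500 hits per hour works out to one hit every 3600 / 1500 = 2.4 seconds. Below is a minimal sketch of a client-side throttle a crawler could use to stay under that threshold. The names (min_interval, Throttle) and the safety factor of 2 are my own assumptions for illustration, not something our banning script prescribes.

```python
import time

HITS_PER_HOUR_LIMIT = 1500  # ban threshold quoted above


def min_interval(hits_per_hour, safety_factor=2.0):
    """Seconds to wait between requests to stay under hits_per_hour.

    safety_factor (an assumption, not a GNOME recommendation) leaves
    headroom below the limit; 1.0 means running right at the limit.
    """
    if hits_per_hour <= 0 or safety_factor < 1:
        raise ValueError("hits_per_hour must be > 0 and safety_factor >= 1")
    return 3600.0 / hits_per_hour * safety_factor


class Throttle:
    """Blocks so that consecutive wait() calls are at least `interval` apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock   # injectable for testing
        self._sleep = sleep   # injectable for testing
        self._last = None     # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(min_interval(HITS_PER_HOUR_LIMIT))
    for bug_id in range(1, 4):  # hypothetical bug IDs
        throttle.wait()
        # Issue exactly one HTTP request here; every request counts as a hit.
        print("would fetch bug", bug_id)
```

Since every log entry counts as a hit, static files included, a crawler should call wait() before each HTTP request it makes, not just once per bug page.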

What we can do now is add a few exceptions to the htaccess file that gets populated by our banning script. That said, most GNOME developers are either in the EU (mainly GMT+1) or on the east coast of the US (GMT-5), so I would say any time between 1-2 AM and 7-8 AM. We should probably ask these researchers not to crawl the website at the same time if they plan to do so in the future, maybe limiting them to one crawl per night.

Someone else might have another better idea though ;)

--
Cheers,

Andrea

Debian Developer,
Fedora / EPEL packager,
GNOME Sysadmin,
GNOME Foundation Membership & Elections Committee Chairman

Homepage: http://www.gnome.org/~av

