Re: Maximum results to return from query



On 12/5/05, Olav Vitters <olav bkor dhs org> wrote:
> Searching for a bug can produce lots of results. Some queries can
> return all the bugs in the database. The script has a very idiotic way
> to protect against such queries (nothing after the ? in the URL or just
> buglist.cgi).
>
> A java program was requesting:
>   http://bugzilla.gnome.org/buglist.cgi?bug_id=
> This caused buglist.cgi to retrieve all bugs. I've blocked his IP &
> changed buglist.cgi to reject above query, but the java program already
> had 3 buglist.cgi processes running on window, each consuming lots of
> processor time (20min) & memory (180MB+).

Was their java program running core-bugs-today.cgi or another
braindead script that we have?  I've requested
  http://bugzilla.gnome.org/buglist.cgi?bug_id=
several times myself as well.  I didn't do it intentionally, but
rather because reports/core-bugs-today.cgi does its own little query
to get a comma-separated list of bugs, and then just appends them to
the above URL and automatically redirects.  If no bugs were found (as
is the case just after 24:00 UTC), the appended list is merely an
empty string.  Taking a look at the code, I see this bug is still
there, although it appears you fixed it when you ported that script
to 2.20.
We may have other scripts that are similarly braindead.  I think
boogle had this problem at one point (causing me to request that same
URL...) before I added code to special-case that and print an error
that no bugs were found instead.
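
A minimal sketch of that kind of guard on the calling side, in Python
rather than the Perl the real scripts use.  The function name and the
None return convention are made up for illustration; only the URL
comes from this thread.

```python
BUGLIST_URL = "http://bugzilla.gnome.org/buglist.cgi"

def buglist_redirect(bug_ids):
    """Return a buglist.cgi URL for bug_ids, or None when there are none.

    Hypothetical helper: the caller is expected to print a
    "no bugs found" message instead of redirecting when it gets None.
    """
    ids = [str(int(b)) for b in bug_ids]  # int() rejects non-numeric ids
    if not ids:
        return None  # never emit "buglist.cgi?bug_id=" with an empty list
    return BUGLIST_URL + "?bug_id=" + ",".join(ids)
```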

> Ideally buglist.cgi should contain a better detection of such queries.

Well, since it has happened so many times, I think special-casing an
empty query makes sense.  Something more advanced would be welcome
too, but this is probably about as simple as it gets, and an empty
query really ought to return an empty list.
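
Roughly what I mean by special-casing, as a Python sketch (buglist.cgi
itself is Perl, and is_empty_query is a hypothetical name).  It assumes
every parameter in the query string is a search criterion, which is
not true of the real script, so treat it as an illustration only.

```python
from urllib.parse import parse_qsl

def is_empty_query(query_string):
    """True if no parameter in the query string has a non-blank value."""
    # keep_blank_values=True so "bug_id=" is seen as a parameter with ""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    return not any(value.strip() for _, value in pairs)
```

With this, both "" and "bug_id=" count as empty, so the script could
return an empty list right away instead of running the search.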

> Another way would be to limit the number of bugs in the SQL. This isn't
> perfect as the java process would still return lots of results, but it
> is easy to implement. This is what I want to do now.

Sounds like it'd also be a good idea, but I really do think we should
also special-case an empty query and make it return nothing,
especially since we have caused it so many times ourselves.
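
For the SQL limit, a sketch of the idea in Python with sqlite3.  The
table and column names (bugs, bug_id) and the function name are
assumptions for the example, not taken from the real schema or code.
Asking for one row more than the limit lets the caller tell the user
that the list was cut off.

```python
import sqlite3

def fetch_capped(conn, limit):
    """Return (bug_ids, truncated) with at most `limit` ids."""
    # Fetch limit + 1 rows: the extra row only signals truncation.
    rows = conn.execute(
        "SELECT bug_id FROM bugs ORDER BY bug_id LIMIT ?", (limit + 1,)
    ).fetchall()
    ids = [row[0] for row in rows]
    return ids[:limit], len(ids) > limit
```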

> My question: What is the maximum number of bugs you would ever want to
> see as a query result? I'm guessing 2000.

The only time I have ever thought that more than about 250 bugs would
be useful from buglist.cgi was when I was trying to get a count of how
many bugs of a particular type existed.  I bet others use it the same
way.  Given the existence of browse.cgi to satisfy most such basic
questions (and their ability to ask us when they have more advanced
questions), I think the limit could be a lot lower than 2000, though
2000 does definitely seem safe.

What'd be really cool is if queries that have run longer than N
minutes were automatically terminated.  I have no clue how we'd do
that, though.
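
One generic possibility, purely as a sketch: run the work in a child
process and kill it once a time limit passes.  This is Python, not
anything in our setup, and run_with_time_limit is a hypothetical name.
Whether the web server or the database could enforce such a limit for
us is a separate question I haven't looked into.

```python
import subprocess

def run_with_time_limit(argv, seconds):
    """Run argv; return its stdout, or None if it overran the limit."""
    try:
        done = subprocess.run(argv, capture_output=True, timeout=seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before raising.
        return None
    return done.stdout
```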

Cheers,
Elijah


