Re: [gnet-devel] GNet HTTP server class



Tim Müller wrote:
On Wed, 2007-11-28 at 14:13 -0500, Jeff Garzik wrote:

To that end, here is this afternoon's work, a working embedded HTTP server class, GSHTTP (Gnet Server HTTP):
 (snip)
Just like GNet's GServer class, this new GSHTTP class is a thin wrapper on top of GServer and GConn. It parses an HTTP request with the help of GURI, parses the HTTP headers, and then passes that result to the user's HTTP request function for processing. The user directly talks to a GConn object from that point on.

Nice! I've also been toying with the idea of writing a simple http
server based on GServer, but haven't gotten around to it yet.  What I
had in mind was a bit more advanced/different though (e.g. adding hooks
for paths/subpaths to serve files directly from disk or call callbacks
to generate content/partial content on the fly and such), something
geared towards simple use cases with few clients rather than something
really scalable.

That would be pretty easy to add to GSHTTP.

I had been thinking about a way to register a list of callbacks, each of which was associated with a static string or GRegex, to be matched against each incoming URI.

Then, like Apache's core module infrastructure, call -each- matching callback, until the end-of-list or a callback returns a status indicating that no further processing of that URI should occur.
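
A minimal sketch of that dispatch loop, assuming plain prefix strings only (a GRegex match would slot in where the strncmp() is). Every name here (uri_hook, run_uri_hooks, HOOK_STOP, the two sample hooks) is hypothetical and not part of GSHTTP:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none of this is GSHTTP's actual API. */
enum hook_status { HOOK_CONTINUE, HOOK_STOP };

typedef enum hook_status (*hook_fn)(const char *uri, void *user_data);

struct uri_hook {
    const char *prefix;    /* static string matched against the start of the URI */
    hook_fn     fn;
    void       *user_data;
};

/* Call every hook whose prefix matches, in list order, until the list
 * ends or a hook returns HOOK_STOP. Returns the number of hooks called. */
size_t run_uri_hooks(const struct uri_hook *hooks, size_t n_hooks,
                     const char *uri)
{
    size_t called = 0;

    assert(hooks != NULL && uri != NULL);
    for (size_t i = 0; i < n_hooks; i++) {
        if (strncmp(uri, hooks[i].prefix, strlen(hooks[i].prefix)) != 0)
            continue;              /* this hook does not apply to the URI */
        called++;
        if (hooks[i].fn(uri, hooks[i].user_data) == HOOK_STOP)
            break;                 /* hook asked for no further processing */
    }
    return called;
}

/* Sample hooks: both bump an int counter passed as user_data. */
enum hook_status count_hook(const char *uri, void *user_data)
{
    (void)uri;
    (*(int *)user_data)++;
    return HOOK_CONTINUE;
}

enum hook_status stop_hook(const char *uri, void *user_data)
{
    (void)uri;
    (*(int *)user_data)++;
    return HOOK_STOP;
}
```

Keeping the list ordered means an early, generic hook (logging, say) can run before a more specific one that ends processing for that URI.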

It would _really_ be nice if we had a GLib-ish portable version of sendfile(2), which almost every OS (including Windows) supports. That would help facilitate your suggestion, having a GSHTTP helper function for serving files off disk. Design the interface correctly, and an OS-specific sendfile could be used if available, or regular file operations used if not.
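
Roughly what I mean by designing the interface correctly, as a sketch only. portable_sendfile() is a hypothetical name, not a GLib or GNet function. It assumes blocking descriptors, uses Linux's sendfile(2) when built on Linux, and otherwise (or if the kernel rejects the descriptor pair) falls back to a pread()/write() loop. Other OSes have their own sendfile variants with different signatures, which are not covered here:

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>
#ifdef __linux__
#include <sys/sendfile.h>
#endif

/* Hypothetical helper: send up to `count` bytes of in_fd, starting at
 * *offset, to out_fd. Advances *offset by the number of bytes sent.
 * Returns bytes sent (may be short at EOF), or -1 if nothing was sent
 * because of an error. */
ssize_t portable_sendfile(int out_fd, int in_fd, off_t *offset, size_t count)
{
#ifdef __linux__
    ssize_t sent = sendfile(out_fd, in_fd, offset, count);
    if (sent >= 0 || (errno != EINVAL && errno != ENOSYS))
        return sent;
    /* Kernel refused this descriptor pair: fall through to the copy loop. */
#endif
    char buf[8192];
    size_t total = 0;

    while (total < count) {
        size_t want = count - total;
        if (want > sizeof buf)
            want = sizeof buf;

        ssize_t got = pread(in_fd, buf, want, *offset);
        if (got < 0 && errno == EINTR)
            continue;
        if (got < 0)
            return total > 0 ? (ssize_t)total : -1;
        if (got == 0)
            break;                          /* end of file */

        ssize_t done = 0;
        while (done < got) {
            ssize_t w = write(out_fd, buf + done, (size_t)(got - done));
            if (w < 0 && errno == EINTR)
                continue;
            if (w < 0) {
                /* Report what actually went out before the failure. */
                *offset += done;
                total += (size_t)done;
                return total > 0 ? (ssize_t)total : -1;
            }
            done += w;
        }
        *offset += got;
        total += (size_t)got;
    }
    return (ssize_t)total;
}
```

The caller sees the same contract either way, so a file-serving helper would not need to know which path was taken.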


I have attached ApacheBench results on a gigabit network between two fast computers, for 1,000,000 requests of a 4K file. The results are decent for a couple hours of work, but not great. There are WAY too many per-request memory allocations and copies. As a result, I was actually bounded not by network bandwidth but CPU usage -- the app was consuming 100% CPU (it is single-threaded) during the ApacheBench run.

Interesting.  Do you have a breakdown of where it spends most time?

No idea.

After GSHTTP, I wrote an epoll-based, non-GLib/GNet server. It allocated a static, writable 8K buffer to each connected TCP client. It was fairly straightforward to parse the URI and headers by overwriting end-of-line markers with nuls.

Another thing that probably hurt my GSHTTP implementation was copying each header, key and value, into a GHashTable. As I did in the non-GLib version, it is better to store a pair of pointers to C strings than copy all the headers.
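
To illustrate the pointer-pair approach, here is a rough sketch (not the code from either server; http_hdr and parse_headers_inplace are hypothetical names). It assumes the header block is already sitting in a writable, nul-terminated buffer and that lines end in CRLF:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: each header is two pointers into the receive buffer. */
struct http_hdr {
    char *key;
    char *val;
};

/* Split a nul-terminated block of "Key: value\r\n" lines in place.
 * The ':' and the CR of each line are overwritten with nuls, so key and
 * val are ordinary C strings that live inside buf. Nothing is copied.
 * Returns the number of headers stored; stops at the blank line, at an
 * incomplete line, or when hdrs is full. */
size_t parse_headers_inplace(char *buf, struct http_hdr *hdrs, size_t max_hdrs)
{
    size_t n = 0;
    char *line = buf;

    while (n < max_hdrs && *line != '\0') {
        char *eol = strstr(line, "\r\n");
        if (eol == NULL)
            break;                  /* incomplete line: need more data */
        *eol = '\0';
        if (eol == line)
            break;                  /* blank line ends the header block */

        char *colon = strchr(line, ':');
        if (colon != NULL) {
            char *val = colon + 1;
            *colon = '\0';
            while (*val == ' ' || *val == '\t')
                val++;              /* skip leading whitespace in the value */
            hdrs[n].key = line;
            hdrs[n].val = val;
            n++;
        }
        line = eol + 2;             /* step past the CRLF */
    }
    return n;
}
```

The pointers are only valid for as long as the buffer is, which is fine when each client owns its buffer for the life of the request.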


GURI does too many string copies and allocations to be used on a
 per-HTTP-request basis, too, IMO.

Could be, I doubt the code was written with efficiency in mind, although

Actually it's pretty darn efficient, if you replace each g_strdup() with an in-situ nul termination. (I did this in my non-GLib HTTP server.)
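
For a flavor of the in-situ idea, here is a sketch that splits just the path, query, and fragment of a request target. It is not GURI's code and it handles far less than GURI does; uri_parts and split_target_inplace are hypothetical names, and the input is assumed to be a writable, nul-terminated string:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: pointers into the caller's writable string. */
struct uri_parts {
    char *path;
    char *query;      /* NULL if there is no '?' */
    char *fragment;   /* NULL if there is no '#' */
};

/* Split "/path?query#fragment" in place by overwriting the '?' and '#'
 * delimiters with nuls. No allocation and no copying. */
void split_target_inplace(char *target, struct uri_parts *out)
{
    out->path = target;
    out->query = NULL;
    out->fragment = NULL;

    char *hash = strchr(target, '#');
    if (hash != NULL) {
        *hash = '\0';
        out->fragment = hash + 1;
    }

    /* The string now ends at the old '#', so a '?' inside the fragment
     * is never mistaken for the query delimiter. */
    char *qmark = strchr(target, '?');
    if (qmark != NULL) {
        *qmark = '\0';
        out->query = qmark + 1;
    }
}
```

The cost is that the original string is destroyed, so the caller has to be happy handing over a buffer it does not need to keep intact.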


I don't really see how you'd easily get around all the strdups for the
different bits and pieces (unless you pass in a writable string and let
gnet munge bits in the middle into terminators maybe?).  Maybe it's just

Yep, precisely.


I'm wondering whether the GLib main loop is really suitable for handling
LOTS of file descriptors/clients efficiently (well, only one way to find
out I guess ...).  There's a patch to make it use epoll in bugzilla [1],
but it doesn't look like that's going to go in any time soon.

[1] http://bugzilla.gnome.org/show_bug.cgi?id=156048

Note my name in there ;-)

The GLib main loop isn't as nice as epoll... but it's about the best you could do with the Unix APIs of the pre-epoll era. It's basically what INN and some other Internet servers do in their core polling loop.

	Jeff



