[xml] NanoHTTP and HTTP User-Agent field



I'm using NanoHTTP as the basis of Advogato.org's blog aggregator with
quite a bit of success, but I recently ran into an interesting problem.
I've come across a website that blocks HTTP GET requests from agents
that don't provide a User-Agent request header field. 

I checked RFC 2616, and the User-Agent field is listed as one that user
agents SHOULD include but that is not strictly required. On the other hand,
as the maintainer of several websites that are frequently abused by bad
user agents, I can see why someone might want to block agents that don't
identify themselves. So, I think NanoHTTP should provide some kind of
identification.

I don't see any obvious way of adding a User-Agent field in the
documentation for NanoHTTP. I'm currently calling xmlNanoHTTPInit(),
followed by a number of calls to xmlNanoHTTPFetch(), and finishing with
a call to xmlNanoHTTPCleanup(). It seems like the init call would be the
logical place to add an optional User-Agent value.

Does anyone have any suggestions (or examples) of how a User-Agent could
be added without modifying the libxml2 code? 
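
One possibility, sketched here rather than verified against 2.6.27, is to
skip xmlNanoHTTPFetch() and call xmlNanoHTTPMethod(), which takes a
caller-supplied string of extra request headers, then write the body out
with xmlNanoHTTPSave(). The helper names build_user_agent_header() and
fetch_with_user_agent() and the USE_LIBXML2 compile switch are made up for
illustration. The sketch assumes the headers string must end in CRLF and
that the library was built with HTTP and output support.

```c
#include <stdio.h>
#include <string.h>

#ifdef USE_LIBXML2
#include <libxml/nanohttp.h>
#endif

/*
 * Hypothetical helper: formats "User-Agent: <agent>\r\n" into buf.
 * Returns 0 on success, or -1 if an argument is missing, the agent
 * contains CR or LF, or buf is too small for the whole line.
 */
int build_user_agent_header(char *buf, size_t buflen, const char *agent)
{
    int n;

    if (buf == NULL || buflen == 0 || agent == NULL || agent[0] == '\0')
        return -1;

    /* Reject line breaks so the value cannot smuggle in extra headers. */
    if (strpbrk(agent, "\r\n") != NULL)
        return -1;

    n = snprintf(buf, buflen, "User-Agent: %s\r\n", agent);
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    return 0;
}

#ifdef USE_LIBXML2
/*
 * Hypothetical stand-in for xmlNanoHTTPFetch() that also sends a
 * User-Agent header. Build with -DUSE_LIBXML2 and the flags reported by
 * xml2-config. Returns 0 on success, -1 on failure.
 */
int fetch_with_user_agent(const char *url, const char *filename,
                          const char *agent)
{
    char headers[256];
    void *ctx;

    if (build_user_agent_header(headers, sizeof(headers), agent) != 0)
        return -1;

    /* No request body, so input is NULL and ilen is 0. */
    ctx = xmlNanoHTTPMethod(url, "GET", NULL, NULL, headers, 0);
    if (ctx == NULL)
        return -1;

    /* xmlNanoHTTPSave() writes the response body and closes ctx. */
    return xmlNanoHTTPSave(ctx, filename);
}
#endif
```

A call such as fetch_with_user_agent(url, "feed.xml", "Advogato-Aggregator/1.0")
would then take the place of each xmlNanoHTTPFetch() call, between the
existing xmlNanoHTTPInit() and xmlNanoHTTPCleanup() calls. The agent string
shown is only a placeholder.
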

If there's no easy way to implement a User-Agent with the current
codebase, there seem to be two possible solutions:

1. Patch libxml2 to add a default user agent, perhaps something like:

 NanoHTTP libxml2 version 2.6.27

2. Patch libxml2 to allow a user-configurable agent string. This might
be done by adding a new argument to xmlNanoHTTPInit() or a new function
like xmlNanoHTTPSetUserAgent() (or maybe something more generic like
xmlNanoHTTPSetRequestHeader()?).

-Steve



