Re: How best to fetch a web page...



On Tue, 2005-01-25 at 23:36 -0500, Freddie Unpenstein wrote:
>> Why not use libcurl? You can get much more info about your
>> connection.
>
>> libcurl even provides examples for how to use libcurl with a Gtk+ app.
>> I must admit their example should work fine for simple things, but
>> be careful not to abuse what the example demonstrates (the use of
>> gdk_threads_enter/leave), or I fear it will come back to bite you!
>
> Hmmm...  libcurl does seem to be fairly popular.  Shame it doesn't
> come with a gtkcurl wrapper library, or something.  I'm not even
> fetching a web page, just a straight text file sitting on a web server.

A "Web Page" is a shorthand for a representation of a resource
identified by a URI/IRI.  A remote Web server could return any
representation it likes, so you need at least to handle the case
where you get back something other than the text/plain you hoped for.
"Handling" it could simply mean printing a clear, descriptive error
message, of course.

Some circumstances where this could happen include:
- someone modifies the CGI script to return HTML instead of text :-)
- the server runs out of memory and sends an error code along with
  an HTML-encoded description
- someone runs your client against a different Web server
etc etc etc.
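
As a rough sketch of that kind of check, assuming you already have the
HTTP status code and the Content-Type value in hand (with libcurl,
curl_easy_getinfo() with CURLINFO_RESPONSE_CODE and
CURLINFO_CONTENT_TYPE can supply them). The names response_problem and
is_text_plain are made up for illustration, and so is the message
wording:

```c
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive comparison of the first n bytes (ASCII only).
 * Stops early if either string ends. */
static int ascii_ncasecmp(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb || ca == '\0')
            return ca - cb;
    }
    return 0;
}

/* Returns 1 if the Content-Type value names the text/plain media type,
 * ignoring case, leading whitespace and any parameters such as
 * "; charset=...". Returns 0 otherwise, including for NULL. */
static int is_text_plain(const char *content_type)
{
    static const char want[] = "text/plain";
    size_t len = sizeof want - 1;

    if (content_type == NULL)
        return 0;
    while (*content_type == ' ' || *content_type == '\t')
        content_type++;
    if (ascii_ncasecmp(content_type, want, len) != 0)
        return 0;
    /* The type must end here: reject e.g. "text/plainish". */
    char next = content_type[len];
    return next == '\0' || next == ';' || next == ' ' || next == '\t';
}

/* Hypothetical helper: returns NULL when the response looks usable,
 * otherwise a message suitable for showing to the user. */
const char *response_problem(long status, const char *content_type)
{
    if (status < 200 || status > 299)
        return "The server reported an error instead of sending the file.";
    if (!is_text_plain(content_type))
        return "The server sent something other than plain text.";
    return NULL;
}
```

The caller would only parse the body when response_problem() returns
NULL, and otherwise show the returned message (in a dialog, say).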

You could also get back a text file in an encoding other than (say)
UTF-8, if there's an intermediate proxy that transcodes.
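
One way to notice that case is to look at the charset parameter of the
Content-Type header, when the server or proxy sends one.  A small
sketch follows; content_type_charset is a hypothetical name, and the
parsing is deliberately simplistic (it does not cope with a ';' inside
a quoted parameter value):

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if s starts with prefix, comparing ASCII case-insensitively. */
static int has_prefix_nocase(const char *s, const char *prefix)
{
    while (*prefix != '\0') {
        if (tolower((unsigned char)*s) != tolower((unsigned char)*prefix))
            return 0;
        s++;
        prefix++;
    }
    return 1;
}

/* Copies the charset parameter of a Content-Type value into out,
 * lower-cased and NUL-terminated. Returns 1 on success, 0 if there is
 * no charset parameter, it is empty, or it does not fit in out. */
int content_type_charset(const char *content_type, char *out, size_t outlen)
{
    const char *p;

    if (content_type == NULL || out == NULL || outlen == 0)
        return 0;

    /* Parameters follow the media type, each introduced by ';'. */
    p = strchr(content_type, ';');
    while (p != NULL) {
        p++;
        while (*p == ' ' || *p == '\t')
            p++;
        if (has_prefix_nocase(p, "charset=")) {
            size_t n = 0;
            p += strlen("charset=");
            if (*p == '"')          /* the value may be quoted */
                p++;
            while (*p != '\0' && *p != ';' && *p != '"' &&
                   *p != ' ' && *p != '\t') {
                if (n + 1 >= outlen)
                    return 0;       /* no room for char plus NUL */
                out[n++] = (char)tolower((unsigned char)*p++);
            }
            out[n] = '\0';
            return n > 0;
        }
        p = strchr(p, ';');
    }
    return 0;
}
```

If the result is something other than "utf-8", the body could then be
transcoded (GLib's g_convert() is one option in a Gtk+ app) or rejected
with an error message.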

Having said all that, there are tons of C, C++, Perl, Java etc etc
libraries and interfaces for negotiating with a Web server.

You can maximise the chance of getting text back using an Accept
header, e.g. Accept: text/plain, */*
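
To show where that header sits on the wire, here is a sketch that
formats a minimal HTTP/1.1 GET request.  format_get_request is a
made-up helper, and the host and path are whatever you pass in; it
does no networking itself.  With libcurl you would instead hand the
same header line to curl_slist_append() and set the list with
CURLOPT_HTTPHEADER.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats a minimal HTTP/1.1 GET request that includes the Accept
 * header from the message above. Returns the request length in bytes,
 * or -1 if buf is too small to hold the whole request. */
int format_get_request(char *buf, size_t buflen,
                       const char *host, const char *path)
{
    int n = snprintf(buf, buflen,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Accept: text/plain, */*\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

A q-value on the wildcard, as in "*/*;q=0.1", would state the
preference for text/plain more explicitly, but the server is free to
ignore the header either way, so the response still needs checking.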

Liam

-- 
Liam Quin, W3C XML Activity Lead, http://www.w3.org/People/Quin
Pictures from old books: http://www.holoweb.net/~liam/pictures/oldbooks/
IRC (chat) programs: www.ircreviews.org/clients/



