Re: newbie wants data from google scholar
- From: Brian Lavender <brian brie com>
- To: Rudra Banerjee <bnrj rudra yahoo com>
- Cc: libsoup-list gnome org
- Subject: Re: newbie wants data from google scholar
- Date: Thu, 27 Sep 2012 09:54:46 -0700
Rudra,
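First, you need to quote the URL so that the shell does not interpret
the "&" characters in it. Unquoted, each "&" ends a command and runs it
in the background, so wget only receives the part of the URL before the
first "&".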
$ wget "http://scholar.google.co.uk/scholar?q=albert+einstein%2B1905&btnG=&hl=en&as_sdt=0%2C5"
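If you would rather not assemble and quote the URL by hand, here is a minimal sketch in Python 3 (an assumption on my part; the variable names are only illustrative). It builds the same query string from its decoded parts and prints a wget command with the URL safely quoted for the shell:

```python
import shlex
from urllib.parse import urlencode

BASE = "http://scholar.google.co.uk/scholar"

# Decoded form of the query parameters used in the wget command above.
params = {"q": "albert einstein+1905", "btnG": "", "hl": "en", "as_sdt": "0,5"}

# urlencode() percent-encodes each value: space -> "+", "+" -> "%2B", "," -> "%2C".
url = BASE + "?" + urlencode(params)

# shlex.quote() wraps the URL in single quotes so the shell leaves "&" and "?" alone.
quoted = shlex.quote(url)
print("wget " + quoted)
```

Single quotes work just as well as the double quotes shown above for this URL.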
Second, I suggest you switch to Python. It has a nice interactive shell
in which you can try these things out.
The reason the instructions for libsoup look like they are written in
Latin is that they are written in "Latin". Therefore, you need to be
able to read "Latin". Before you even start with libsoup, you should
start with the GLib developer guide and write small programs.
http://developer.gnome.org/glib/2.32/
I assume you already know how to code in C. If you don't, then go the
Python path. Here is an example:
$ python
>>> from HTMLParser import HTMLParser
>>> import urllib2
>>> response = urllib2.urlopen('http://python.org/')
>>> html = response.read()
>>> print html
Parsing the HTML is left as an exercise for you.
http://docs.python.org/library/htmlparser.html
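As a starting point for that exercise, here is a rough sketch that collects the links from a page. It assumes Python 3, where the module is called html.parser rather than HTMLParser. The LinkCollector class and the SAMPLE_HTML string are made up for illustration; they are not the real markup of any particular site.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical sample input, not taken from a real page.
SAMPLE_HTML = """<html><body>
<h3><a href="http://example.org/paper1">First <b>paper</b></a></h3>
<h3><a href="http://example.org/paper2">Second paper</a></h3>
<a name="anchor">no link here</a>
</body></html>"""

parser = LinkCollector()
parser.feed(SAMPLE_HTML)
for href, text in parser.links:
    print(href, text)
```

To use it on a downloaded page, pass the decoded page text to parser.feed() instead of SAMPLE_HTML.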
You can of course put these in a Python program. I think you will
get a lot more traction going down this path.
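For instance, the interactive session could become a small script along these lines. This is only a sketch and assumes Python 3, where urllib2 has become urllib.request; the fetch() helper name is my own choice.

```python
import sys
import urllib.request


def fetch(url, timeout=10):
    """Download url and return its body decoded to text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Use the charset declared by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Only fetch when a URL is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(fetch(sys.argv[1]))
```

Saved as, say, fetch.py (a hypothetical file name), it would be run as "python3 fetch.py http://python.org/".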
On Thu, Sep 27, 2012 at 10:47:13AM +0100, Rudra Banerjee wrote:
> On Thu, 2012-09-27 at 07:48 +0200, Emmanuel Rodriguez wrote:
> >
> > Once you have downloaded a web resource, most likely an HTML webpage
> > you're on your own.
> Downloading google scholar have some problem as well:
> $ wget http://scholar.google.co.uk/scholar?q=albert+einstein%2B1905&btnG=&hl=en&as_sdt=0%2C5
> [1] 18552
> [2] 18553
> [3] 18554
> [2]- Done btnG=
> [3]+ Done hl=en
> [rudra@roddur ~]$ --2012-09-27 10:42:29--
> http://scholar.google.co.uk/scholar?q=albert+einstein%2B1905
> Resolving scholar.google.co.uk... 173.194.41.113, 173.194.41.114,
> 173.194.41.115, ...
> Connecting to scholar.google.co.uk|173.194.41.113|:80... connected.
> HTTP request sent, awaiting response... 403 Forbidden
> 2012-09-27 10:42:30 ERROR 403: Forbidden.
>
> ^C
> [1]+ Exit 8 wget
> http://scholar.google.co.uk/scholar?q=albert+einstein%2B1905
>
> Can you kindly hint me the source of error?
>
> _______________________________________________
> libsoup-list mailing list
> libsoup-list gnome org
> https://mail.gnome.org/mailman/listinfo/libsoup-list
--
Brian Lavender
http://www.brie.com/brian/
"There are two ways of constructing a software design. One way is to
make it so simple that there are obviously no deficiencies. And the other
way is to make it so complicated that there are no obvious deficiencies."
Professor C. A. R. Hoare
The 1980 Turing award lecture