[xml] parse from urls with python bindings

From: Andreas Pakulat <apaku gmx de>
To: xml gnome org
Subject: [xml] parse from urls with python bindings
Date: Sat, 10 Jun 2006 00:21:33 +0200

Hi,

here's another newbie question: what function do I use to parse an HTML
or XML file from a URL? I'd like to do something like:

doc=libxml2.parseFromURL('http://www.google.de')

or at least separate xmlParseFromURL and htmlParseFromURL functions, in
case libxml2 cannot decide on its own which parser to use.

I do see some parsing functions in the libxml2 module, and

doc=libxml2.htmlParseFile('http://localhost/','us-ascii')

even works. But it fails for www.google.de, which surprised me, since I
thought libxml2's HTML parser was very forgiving. In fact, letting lxml
parse http://www.google.de with its HTMLParser works fine.

Or do I have to download the page myself to a temporary file and feed
that to libxml2?
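
For reference, the download-it-yourself route could be sketched roughly
as below, keeping the page in memory instead of writing a temporary
file. This is only a sketch under assumptions: it uses Python 3's
urllib.request for the download, the helper names fetch_bytes and
parse_html_from_url are made up for illustration, and it assumes the
bindings expose htmlReadMemory and the HTML_PARSE_* option constants
with the argument order noted in the comments.

```python
import urllib.request


def fetch_bytes(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return the raw response body for url.

    Hypothetical helper, not part of libxml2; `opener` is injectable
    so the download step can be stubbed out.
    """
    with opener(url, timeout=timeout) as response:
        return response.read()


def parse_html_from_url(url, encoding=None):
    """Download url and hand the bytes to libxml2's HTML parser."""
    # Imported here so fetch_bytes() stays usable without the bindings.
    import libxml2

    data = fetch_bytes(url)
    # Ask the parser to recover from bad markup and stay quiet about it.
    options = (libxml2.HTML_PARSE_RECOVER
               | libxml2.HTML_PARSE_NOERROR
               | libxml2.HTML_PARSE_NOWARNING)
    # Assumed argument order: buffer, size, base URL, encoding, options.
    return libxml2.htmlReadMemory(data, len(data), url, encoding, options)
```

Usage would then look like doc=parse_html_from_url('http://www.google.de'),
followed by doc.freeDoc() once the tree is no longer needed. Doing the
HTTP request outside libxml2 also means redirects, timeouts and errors
are handled by urllib rather than by the parser's own URL loading.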

Andreas

-- 
You can rent this space for only $5 a week.

