[xml] parse from urls with python bindings



Hi,

Here's another newbie question: which function do I use to parse an HTML
or XML file from a URL? I'd like to do something like:

doc=libxml2.parseFromURL('http://www.google.de')

or at least xmlParseFromURL and htmlParseFromURL in case libxml2 cannot
decide which parser to use for this. 

I do see some functions in the libxml2 module, and

doc=libxml2.htmlParseFile('http://localhost/','us-ascii')

even works. But it doesn't work for www.google.de, and I thought libxml2's
HTML parser was very forgiving? In fact, letting lxml parse
http://www.google.de with its HTMLParser works fine.

Or do I have to download the file myself to a temp file and feed that to
libxml2?
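
For reference, here is a minimal sketch of that download-it-yourself
fallback, without the temp file. It assumes Python 3 and that the libxml2
Python bindings are installed. The helper names fetch_bytes and
parse_html_from_url are made up for illustration and are not part of
libxml2; the sketch does not explain why htmlParseFile fails on
www.google.de, it only sidesteps libxml2's own HTTP client.

```python
import urllib.request


def fetch_bytes(url, timeout=10.0):
    """Download the resource at `url` and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def parse_html_from_url(url, encoding=None):
    """Fetch `url` with urllib and parse the bytes with libxml2's HTML parser.

    Hypothetical helper, not a libxml2 function. The import is done here so
    that fetch_bytes stays usable even where the bindings are missing.
    """
    import libxml2  # assumption: libxml2 Python bindings are installed

    data = fetch_bytes(url)
    # Ask the HTML parser to recover from broken markup and stay quiet.
    options = (libxml2.HTML_PARSE_RECOVER
               | libxml2.HTML_PARSE_NOERROR
               | libxml2.HTML_PARSE_NOWARNING)
    # Parse from memory; `url` is passed only as the document's base URL.
    return libxml2.htmlReadMemory(data, len(data), url, encoding, options)
```

Usage would then be something like doc = parse_html_from_url('http://www.google.de'),
followed by doc.freeDoc() once the tree is no longer needed.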

Andreas

-- 
You can rent this space for only $5 a week.


