Re: [xml] Applying XSLT to HTML
- From: Nic James Ferrier <nferrier tapsellferrier co uk>
- To: Stefan Behnel <stefan_ml behnel de>
- Cc: Dmitry Dzhus <mail sphinx net ru>, xml gnome org
- Subject: Re: [xml] Applying XSLT to HTML
- Date: Mon, 02 Jul 2007 15:19:25 +0100
Stefan Behnel <stefan_ml behnel de> writes:
Dmitry Dzhus wrote:
My aim is to apply XSLT to an HTML document (which may be slightly
broken).
I'm using the standard Python libxml2/libxslt bindings.
My code is:
import libxml2
import libxslt

mf_extract = libxslt.parseStylesheetFile("mf-extract.xsl")
doc = libxml2.readHtmlFile(url, None, libxml2.HTML_PARSE_RECOVER)
result = mf_extract.applyStylesheet(doc, None)
Applying the XSLT gives a result as if there were no content in the
`doc` tree at all. Using `readFile` instead of `readHtmlFile` works
as expected.
I tried `print doc` after both `readHtmlFile` and `readFile` and
noticed that, when the input document is well-formed, the output
differs only in the XML declaration at the very beginning.
As I understand it (and as `document.type` indicates), `readFile` and
`readHtmlFile` produce different kinds of documents, `document_xml`
and `document_html`, while applying XSLT is only possible with a
`document_xml` one. Is there any way to convert a `document_html`
document to a `document_xml` one?
Try the recover methods.
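
For illustration only, here is one possible way to get from slightly broken HTML to well-formed XML text, which could then be loaded with the XML reader (`readFile` in the question) instead of the HTML one. This is a minimal sketch using only the Python standard library, not libxml2's own recovery; the names `HtmlToXml` and `html_to_xml` are hypothetical, and it assumes the input has a single root element. It does not apply HTML's implied end-tag rules, so an unclosed `<p>` simply nests the elements that follow it.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never take a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class HtmlToXml(HTMLParser):
    """Hypothetical helper: feed tag-soup HTML in, build an XML tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # stack of elements still waiting for an end tag

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. <input disabled>) become empty strings.
        attrib = {k: (v if v is not None else "") for k, v in attrs}
        self.builder.start(tag, attrib)
        if tag in VOID:
            self.builder.end(tag)  # close void elements immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: ignore it
        # Close everything left open inside this element, then the element.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.builder.end(open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        if self.open_tags:  # text outside the root element is dropped
            self.builder.data(data)


def html_to_xml(html_text):
    """Return well-formed XML text for a single-rooted HTML string."""
    parser = HtmlToXml()
    parser.feed(html_text)
    parser.close()
    while parser.open_tags:  # close anything the document left open
        parser.builder.end(parser.open_tags.pop())
    return ET.tostring(parser.builder.close(), encoding="unicode")


xml_text = html_to_xml("<html><body><p>One<br><p>Two</body></html>")
root = ET.fromstring(xml_text)  # parses, so the output is well-formed
```

The resulting string could be written to a file and read back with the XML reader before applying the stylesheet. Whether the stylesheet then matches also depends on element names and namespaces, which this sketch leaves untouched.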
--
Nic Ferrier
http://www.tapsellferrier.co.uk