Re: [xml] libxml2 2.7.1 breaks XML serialisation of HTML trees
- From: Stefan Behnel <stefan_ml behnel de>
- To: "Martin (gzlist)" <gzlist googlemail com>
- Cc: xml gnome org
- Subject: Re: [xml] libxml2 2.7.1 breaks XML serialisation of HTML trees
- Date: Wed, 10 Sep 2008 08:08:53 +0200
Hi,
Martin (gzlist) wrote:
> On 08/09/2008, Stefan Behnel <stefan_ml behnel de> wrote:
>> there was a change in 2.7.1 (xmlsave.c, ~760) that prevents HTML documents
>> from being serialised in XML style...
>> ...
>> If the current behaviour is wanted, what's the future way of achieving
>> this *without* temporarily modifying the document? (i.e. without breaking
>> thread concurrency)
>
> I have been eyeing the other 28 bits of xmlSaveOption recently, mostly
> to add a XML_SAVE_XHTML to go counter to the current XML_SAVE_NO_XHTML
> that would unconditionally turn *on* the Appendix C rules without
> needing one of the XHTML 1.0 doctypes.
Sounds fine.
> Some other tweaks, like
> XML_SAVE_XHTML_NO_META_CHARSET, would perhaps also be good.
Why only for XHTML? The <meta> entry is either wanted or not, and it changes
the document on output, which is not always desirable. The libxml2 options
should say: "I want it added if it's not there" (which is the current
behaviour anyway) and "I do not want my document modified on output".
> Would an
> XML_SAVE_TEXT_HTML option to do the old sgmlish serialisation answer
> your use case?
Doesn't sound like it. The problem is that I need to distinguish between a
serialisation as well-formed XML and a serialisation in HTML style
*independent* of the type of document. And I also need to do so in a way that
produces the same output across libxml2 versions. I wouldn't mind switching to
a different API based on an "#if LIBXML_VERSION ...", but I would still want
to get comparable output. lxml never used the xmlSave* API for exactly that
reason: the output changed heavily across the supported versions.
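To illustrate the distinction meant here, a minimal sketch follows. It uses Python's standard-library xml.etree.ElementTree as a stand-in, not libxml2's xmlSave* API and not lxml itself, so the exact output is an assumption about that module only. The tree contents are made-up example data. The point is that the caller picks the output style per call, independent of what kind of document the tree is, and the tree is not modified in the process.

```python
import xml.etree.ElementTree as ET

# Build a small HTML-like tree in memory (hypothetical example data).
root = ET.Element("html")
body = ET.SubElement(root, "body")
ET.SubElement(body, "br")
para = ET.SubElement(body, "p")
para.text = "a < b"

# Same tree, two output styles, selected by the caller rather than by
# the document type. Neither call changes the tree.
as_xml = ET.tostring(root, encoding="unicode", method="xml")
as_html = ET.tostring(root, encoding="unicode", method="html")

print(as_xml)   # XML style: the empty <br> element is self-closed
print(as_html)  # HTML style: the void <br> element has no end tag
```

Because the style is an argument of the serialisation call, two threads can serialise the same tree in different styles without either of them touching the document.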
The change in 2.7.1 broke a whole bunch of doctests for lxml. I fixed some of
those, but users will run into the same problem.
Stefan