Re: [xml] libxml2 2.7.1 breaks XML serialisation of HTML trees
- From: Daniel Veillard <veillard redhat com>
- To: "Martin (gzlist)" <gzlist googlemail com>
- Cc: xml gnome org
- Subject: Re: [xml] libxml2 2.7.1 breaks XML serialisation of HTML trees
- Date: Sat, 27 Sep 2008 11:21:59 +0200
On Sat, Sep 27, 2008 at 12:58:50AM +0100, Martin (gzlist) wrote:
> On 25/09/2008, Daniel Veillard <veillard redhat com> wrote:
> >
> > Stephan, Martin,
> >
> > could you check the enclosed patch? I'm committing it to SVN head too
> > but it's probably easier to review that way.
>
> I built trunk and had a play around, this handles my use case, thanks!
>
> A couple of concerns, first it's not *totally* clear which options
> should be used together. Should at least be documented, and maybe
> clearly defined in the code as well.
>
> The patch has a bunch of additions like this:
> xmlSaveCtxtInit(&ctxt);
> + ctxt.options |= XML_SAVE_AS_XML;
> Move that common case into xmlSaveCtxtInit and overwrite after in
> xmlNewSaveCtxt for the exception of wanting to save as html?
No, the point is that by default you have no options, and
XML_SAVE_AS_XML is forced only in the old xmlDump* functions to
restore their behaviour. xmlSaveCtxtInit() should be kept neutral.
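
To make the design point concrete, here is a minimal, self-contained sketch of
"neutral initialiser, flag forced only in the legacy entry point". It is a toy
model and not libxml2 code: every toy_* name and both flag values are
hypothetical, and the precedence used when both flags are set is an arbitrary
assumption.

```c
/* Hypothetical flags for this sketch; these are NOT libxml2's values. */
enum toy_save_option {
    TOY_SAVE_AS_XML  = 1 << 0,  /* force XML serialisation  */
    TOY_SAVE_AS_HTML = 1 << 1   /* force HTML serialisation */
};

enum toy_doc_type { TOY_DOC_XML, TOY_DOC_HTML };
enum toy_output   { TOY_OUT_XML, TOY_OUT_HTML };

struct toy_save_ctxt {
    int options;
};

/* Neutral initialiser: it sets no options at all. */
void toy_save_ctxt_init(struct toy_save_ctxt *ctxt)
{
    ctxt->options = 0;
}

/* An explicit flag wins; with no flag the document type decides.
 * Checking TOY_SAVE_AS_XML first is an assumption of this sketch. */
enum toy_output toy_pick_output(const struct toy_save_ctxt *ctxt,
                                enum toy_doc_type type)
{
    if (ctxt->options & TOY_SAVE_AS_XML)
        return TOY_OUT_XML;
    if (ctxt->options & TOY_SAVE_AS_HTML)
        return TOY_OUT_HTML;
    return type == TOY_DOC_HTML ? TOY_OUT_HTML : TOY_OUT_XML;
}

/* Legacy-style entry point: always forces XML, whatever the doc type. */
enum toy_output toy_legacy_dump(enum toy_doc_type type)
{
    struct toy_save_ctxt ctxt;

    toy_save_ctxt_init(&ctxt);
    ctxt.options |= TOY_SAVE_AS_XML;
    return toy_pick_output(&ctxt, type);
}

/* Newer-style entry point: the caller's options are used as given. */
enum toy_output toy_save(enum toy_doc_type type, int options)
{
    struct toy_save_ctxt ctxt;

    toy_save_ctxt_init(&ctxt);
    ctxt.options = options;
    return toy_pick_output(&ctxt, type);
}
```

In this model toy_legacy_dump(TOY_DOC_HTML) serialises as XML, while
toy_save(TOY_DOC_HTML, 0) follows the document type and serialises as HTML.
Moving the flag into the initialiser would remove that second behaviour.
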
> Or put logic in xmlNewSaveCtxt to make the set of options on the ctxt sane?
> Or could the format parameter in those functions be safely upgraded to
> full options - the signature is the same and XML_SAVE_FORMAT is 1
> anyway, but I guess if people have been passing some other non-zero
> value for formatting that'd break compatibility.
I really think the current code is proper: it expresses that those old
entry points force output as XML.
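
The compatibility worry about widening the format parameter into full options
can be sketched as follows. This is a toy model, not libxml2 code: the toy_*
names are hypothetical, and only the value 1 for the formatting bit is taken
from the thread; the second flag is made up for illustration.

```c
/* Hypothetical flags for this sketch. */
enum toy_option {
    TOY_FORMAT = 1 << 0,  /* formatting bit, value 1 as in the thread */
    TOY_OTHER  = 1 << 1   /* made-up, unrelated option */
};

/* Old-style semantics: any non-zero value means "format the output". */
int toy_formats_as_boolean(int format)
{
    return format != 0;
}

/* Bitmask semantics: only bit 0 requests formatting. */
int toy_formats_as_bitmask(int options)
{
    return (options & TOY_FORMAT) != 0;
}

/* Under bitmask semantics, does the value switch on anything else? */
int toy_sets_other_options(int options)
{
    return (options & ~TOY_FORMAT) != 0;
}
```

A caller that has always passed 1 sees no difference. A caller that passed 2
to mean "yes, format" would lose formatting under the bitmask reading and
silently enable the unrelated option instead.
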
> Finally, having XML_SAVE_AS_HTML makes it seem like you could save any
> xml-flavoured-html document in non-xml-flavoured form, but that's not
> quite the case. One thing I found was that the overloading of
> XML_CDATA_SECTION_NODE to also be HTML_PRESERVE_NODE means the
> contents get output raw, see HTMLtree.c lines 838-843 in
> htmlNodeDumpFormatOutput.
Okay, maybe that needs fixing, yes.
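
For readers who have not hit this, the effect of one node type doing double
duty can be sketched as below. This is a toy model and not the code in
HTMLtree.c: all toy_* names are hypothetical, and the escaping is deliberately
simplified to three characters.

```c
#include <stdio.h>
#include <string.h>

enum toy_node_type { TOY_TEXT_NODE, TOY_CDATA_NODE };

/* The same value doubles as "preserve" for the HTML serialiser. */
#define TOY_HTML_PRESERVE_NODE TOY_CDATA_NODE

/* Append s to the NUL-terminated string in out, truncating if needed. */
void toy_append(char *out, size_t size, const char *s)
{
    size_t used = strlen(out);

    if (used < size)
        snprintf(out + used, size - used, "%s", s);
}

/* Simplified escaping of '<', '>' and '&' only; size must be > 0. */
void toy_escape(const char *in, char *out, size_t size)
{
    out[0] = '\0';
    for (; *in != '\0'; in++) {
        switch (*in) {
        case '<': toy_append(out, size, "&lt;");  break;
        case '>': toy_append(out, size, "&gt;");  break;
        case '&': toy_append(out, size, "&amp;"); break;
        default: {
            char c[2] = { *in, '\0' };
            toy_append(out, size, c);
        }
        }
    }
}

/* HTML-style dump: a "preserve" node is copied out raw. */
void toy_dump_html(enum toy_node_type type, const char *content,
                   char *out, size_t size)
{
    if (type == TOY_HTML_PRESERVE_NODE)
        snprintf(out, size, "%s", content);
    else
        toy_escape(content, out, size);
}

/* XML-style dump: the same node type is wrapped as a CDATA section. */
void toy_dump_xml(enum toy_node_type type, const char *content,
                  char *out, size_t size)
{
    if (type == TOY_CDATA_NODE)
        snprintf(out, size, "<![CDATA[%s]]>", content);
    else
        toy_escape(content, out, size);
}
```

In this model a CDATA node holding "a < b" comes out of the HTML-style dump
with a bare "<", which is the kind of result that makes a forced HTML save of
an XML-flavoured tree unsafe.
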
> > Basically it adds 3 parsing options, and for the old entry points
> > xmlDump* not xmlSave based it forces the XML_SAVE_AS_XML bypassing
> > the doc type in case of HTML documents. That should fix Stephan's problem
> > and also provide ways to do things with xmlSave when available.
> > For the 'problem' of the added meta an XML_SAVE_IMMUTABLE option could
> > be added that sounds more generic, but I'm not adding this in the patch
> > to not complicate things.
>
> A don't-fiddle-with-the-tree option sounds like a possibility, though
> I do find the duplication of xml:lang to lang useful.
I see, but the threading aspect of XML_SAVE_IMMUTABLE is an important
point IMHO.
Daniel
--
Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
daniel veillard com | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library http://libvirt.org/