Re: [xml] libxslt and source document modification



On Wed, May 30, 2001 at 05:44:01PM +0100, Gary Pennington wrote:
Hi,

I'm using libxml2-2.3.9 and libxslt-0.10.0 on a Solaris 8 box.

I'm using libxslt to transform an XML document into readable formatted
text, that part appears to be working fine. However, after I've done the
transformation I save my original XML document away (using
xmlSaveFileEnc) and find that the DOCTYPE declaration from my document
has disappeared along with some empty text nodes.

1. Why does libxslt modify the source document? I can partially work
around this by copying the document recursively for each transformation,
but that would be slow and awkward (especially when I want to transform
document fragments rather than whole documents).

  I think it's related to:
    - whitespace handling: whitespace nodes sometimes need to be stripped,
      sometimes not, depending on the stylesheet. This could be done by
      ignoring them in the source document, but cleaning up the input is
      a better solution (it avoids redoing the tests multiple times).
    - when applying the default templates some cleanup is done. For
      example, in xsltDefaultProcessOneNode unprocessed nodes are simply
      stripped. This simplifies counting and reduces the number of
      nodes needing to be processed (especially if there are multiple
      passes, as for TOC generation).
    - though it's not supported yet, the document can embed its own
      stylesheet, and the current processing is based on the assumption
      that the input stylesheet tree will be heavily modified.

2. If libxslt must modify the original source document, why is it
causing the DOCTYPE declaration to disappear?

  Because it's one of the numerous node types which are not part of the
XPath data model, and those are simply removed. You won't find any entity
reference nodes either.

I'm happy to start debugging and tracing, but I just thought that
I would check if this was a known problem first.

  It's not really a problem; it is one only if you expected that the input
would not be modified. Keeping the input unmodified would probably
require a fair number of changes, maybe not that many, but you may have a
lot of debugging to do. It won't be a trivial change.

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



