Re: [xslt] XSLT transformation to Plain Text using Python bindings requires using children().serialize()?



On Fri, Aug 23, 2002 at 02:19:08PM -0400, Craeg K Strong wrote:
> Hello:
> 
> Thanks for the quick response...
> 
> Le grande pinguin wrote:
> > On Thu, Aug 22, 2002 at 09:00:05PM -0400, Craeg K Strong wrote:
> >>
> >>The problem is this:
> >>
> >>- the result of applyStylesheet always returns an instance of
> >>xmlDoc
> > 
> > Where is the problem? That's what it is supposed to return.
> 
> I understand, but this is confusing, especially for a newbie.  For example,
> I use XSLT to generate Python code from XML.  And I am
> supposed to retrieve my Python code from an object called "xmlDoc"???
> This is non-intuitive IMO.

Why? Perhaps because you are missing an important point: XSLT is, as the
name already says, a way of _transforming_ XML documents. It is a
transformer, nothing more: you put an XML document (_not_ its serialized
byte-stream representation) in, and you get one out. What, btw, _would_
you expect to retrieve?

> I also sometimes generate HTML that is not XHTML compliant. This is,
> of course, not legal XML, and therefore strange to be retrieved
> from an object called "xmlDoc"  unless "xmlDoc" is meant to be
> viewed as "a document of some kind produced by libxml"
> (a rather liberal interpretation...)

It's not. The purpose of XSLT is the _transformation_ of one XML tree
to another -- you cannot produce invalid XML (well, of course, you can
always fool the system, but don't complain if it breaks). What you get
back from a transformation is a tree representation, not a serialized
tree. You are free to serialize it (and Daniel did put some nice and
very convenient helpers in libxslt).
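
To illustrate the tree-versus-serialization point, here is a minimal
sketch. It is only an analogy: it uses ElementTree from the Python
standard library rather than libxml2/libxslt, and the element name
"result" and its content are made up for the example. The same tree is
serialized twice, and only the choice of output method differs.

```python
import xml.etree.ElementTree as ET

# A tiny tree standing in for what a transformation hands back.
# (Hypothetical content; a real result tree would come from the processor.)
root = ET.Element("result")
root.text = "hello world"

# The tree itself has no "format"; the serializer decides how it is written.
as_xml = ET.tostring(root, encoding="unicode", method="xml")
as_text = ET.tostring(root, encoding="unicode", method="text")

print(as_xml)   # markup included
print(as_text)  # text nodes only
```

The tree is identical in both calls; what changes is the serialization
step applied to it afterwards.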

> >>- some of my stylesheets specify html or plain text
> >>
> >>-  for plain text, xmlDoc.serialize() prints
> >>out an XML header
> > Hmm, I just built a small test case and can _not_ verify this.
> > What's the version of your libxslt? Can you provide us with a 
> > little test case? Are you sure you didn't forget the 
> > 'omit-xml-declaration="yes"' attribute in the xsl:output tag?
> 
> I included my test case (xml, xslt, and python code) in the original message--
> let me know if you didn't get it via private email, I don't want to
> spam the list unnecessarily...

This was a rhetorical question ... you did forget to put it in.

[...]

> I am using xsl:output method="text".  I don't believe it is
> meaningful to say "omit-xml-declaration" when the method is
> text (not sure what the spec says, though).

Ah, maybe you should read the spec, then. Using a chainsaw without
reading the manual might be a one-time experience ;-)
The output method (the 'method' attribute of 'xsl:output') specifies the
character escaping that's supposed to happen, _not_ the presence (or
absence) of an XML declaration. That's what 'omit-xml-declaration' is
for. I suggest reading the specs or a decent reference (M. Kay's
"XSLT Programmer's Reference" is probably considered to be _the_ desktop
reference by many on this list).

> Anyways, I tried
> serialize() both with and without "omit-xml-declaration" and I get
> <?xml version="1.0" encoding="ascii"?> regardless.

As Daniel points out in his recent post: 'xsl:output' and its attributes
are part of the XSLT features, so you need to use XSLT-specific functions
to get the desired results (doc.serialize belongs to libxml, not libxslt).
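
A minimal sketch of what that looks like with the Python bindings,
following the usual libxslt Python example. The function name and its
path parameters are made up for illustration, and the imports sit inside
the function only so the snippet can be read on a machine without the
bindings installed. The point is that the stylesheet object writes the
result, so the 'xsl:output' settings are applied.

```python
def transform_to_file(xml_path, xsl_path, out_path):
    """Apply a stylesheet and let libxslt serialize the result tree."""
    # Requires the libxml2 and libxslt Python bindings.
    import libxml2
    import libxslt

    styledoc = libxml2.parseFile(xsl_path)
    style = libxslt.parseStylesheetDoc(styledoc)
    doc = libxml2.parseFile(xml_path)

    # The result is still an xmlDoc: a tree, not a byte stream.
    result = style.applyStylesheet(doc, None)

    # Serialize through the stylesheet (libxslt), not result.serialize()
    # (libxml2), so that xsl:output is taken into account.
    # The last argument is the compression level (0 = none).
    style.saveResultToFilename(out_path, result, 0)

    # The bindings need explicit cleanup.
    # freeStylesheet() also releases styledoc.
    style.freeStylesheet()
    doc.freeDoc()
    result.freeDoc()
```

It would be called along the lines of
transform_to_file("test.xml", "test.xsl", "out.txt"), where the three
file names are placeholders.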

> > 
> >>Perhaps there is some way to query the xmlDoc to
> >>find the content type?  Or should one always use
> >>children().serialize() instead?
> > 
> > What do you mean by 'content-type'? That's a concept of HTTP
> > and not of XML. The content type is part of the semantics of a 
> > document (a text/html page _can_ be valid XML) and the library
> > has no idea about that. Or do you want to know the output type
> > (xml/html/text)?
> 
> Yes.  I want the output type-- something like the following:
> 
> if resultDoc.type == "document_html":
>     result = resultDoc.serialize('ascii', 1)
> elif resultDoc.type == "document_xml":
>     result = resultDoc.serialize('ascii', 1)
> elif resultDoc.type == "document_text":
>     result = resultDoc.children.serialize('ascii', 1)
> 
> Obviously, this code will not compile today, because
> the only choices are XML and HTML.  Am I missing something
> or is the API missing something?

If you use libxslt with the proper 'xsl:output' combination you
won't need this.

  Ralf


