Re: [xslt] LibXSLT adding annoying whitespace



On Tue, Jun 12, 2001 at 04:49:21PM -0700, Michael Nachbaur wrote:
> This won't work because I'm not processing HTML, I'm trying to output it.
> Here is an example that will illustrate my problem:
> 
> Example 1 (XSL):
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml"/>
> 
>   <xsl:template match="/">
>     <html><body><img/><br/><a><img/></a></body></html>
>   </xsl:template>
> </xsl:stylesheet>
> 
> Example 1 (Output):
> <?xml version="1.0"?>
> <html><body><img/><br/><a><img/></a></body></html>

  The XML rules on white space are relatively clear: except for specific
cases within the markup itself, all white space is significant.
So the XML serialization routine won't add any.
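
  As an illustration only: the sketch below uses Python's standard
xml.etree.ElementTree rather than libxml2 (an assumption made purely for
the example; the variable names are hypothetical). It parses the tree from
Example 1 and serializes it back as XML. Since the tree contains no text
nodes, the serialized form contains no line breaks either.

```python
import xml.etree.ElementTree as ET

# The result tree from Example 1: elements only, no text nodes at all.
source = "<html><body><img/><br/><a><img/></a></body></html>"
root = ET.fromstring(source)

# Serialize with the XML method. ElementTree is only a stand-in here;
# it is not the libxml2 serializer discussed in this thread.
xml_out = ET.tostring(root, encoding="unicode", method="xml")

# No newline appears between elements, because none was in the tree.
has_newline = "\n" in xml_out
print(xml_out)
print("newline added:", has_newline)
```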
  
> Example 2 (XSL):
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="html"/>
> 
>   <xsl:template match="/">
>     <html><body><img/><br/><a><img/></a></body></html>
>   </xsl:template>
> </xsl:stylesheet>
> 
> Example 2 (Output):
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
> "http://www.w3.org/TR/REC-html40/loose.dtd">
> <html><body>
> <img>
> <br>
> <a><img></a>
> </body></html>
> 
> As you can see, there is an inconsistency in the way \n's are being
> handled...I don't believe the XSL specification states that whitespace be
> *added* when it isn't in the source XSL.

  Well, if you look at the example in the specification (okay, it's in
a non-normative section)
   http://www.w3.org/TR/xslt#data-example
the second example builds HTML output. It includes the head, body, etc.
tags from the stylesheet template itself. The template is heavily indented,
but as the spec says, 'formatting' white space from the stylesheet should be
removed, so the expected output would basically be
<body><table border="1"><tr> ... yet the spec lists:

-------------
[...]
<body>
<table border="1">
<tr>
[...]
-------------

  So the fact that using the HTML serialization method adds white space
should not come as a surprise. Also, as others have pointed out:

----------------
If the indent attribute has the value yes, then the html output method
may add or remove whitespace as it outputs the result tree, so long as
it does not change how an HTML user agent would render the output. The
default value is yes.
----------------

  However, I don't think the current libxml2 HTML serializer makes it
possible to implement indent="no" (the XML serializer doesn't have this
restriction). I will work on the HTML serializer in the next few weeks and
expect to fix this then.

> I hope this clarifies my problem.  I'm surprised no one else has run into
> this situation before (it may be my environment, but I'm using the RPMs from
> xmlsoft.org, so I don't think it's anything I'm doing to it).  I'm also
> positive it doesn't have anything to do with AxKit or XML::LibXSLT, since I
> can replicate this problem through "xsltproc".

What I can't understand is why your user agent renders:

<html><body><img/><br/><a><img/></a></body></html>

differently than

<html><body>
<img>
<br>
<a><img></a>
</body></html>

If you can tell me more about why these are different I may try to get
this fixed, but I really don't see why! The <br> has to be formatted on a
separate block under the img, and the anchor/img block has to be formatted on
another one under the <br>, so I really don't see why or how the 2 extra lines
can impact the rendering in any way.
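
To make the comparison concrete, here is a minimal sketch, assuming Python's
standard html.parser module as a stand-in for a user agent's parser (it is
not one, and the names StructureCollector and structure are hypothetical).
It checks that the two serializations contain the same sequence of elements
and differ only in whitespace-only text between tags.

```python
from html.parser import HTMLParser


class StructureCollector(HTMLParser):
    """Record start tags and non-blank text; count whitespace-only text runs."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.blank_runs = 0

    def handle_starttag(self, tag, attrs):
        # Also reached for <img/>, via the default handle_startendtag.
        self.events.append(("start", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data))
        else:
            self.blank_runs += 1


def structure(markup):
    """Return (events, number of whitespace-only text runs) for markup."""
    collector = StructureCollector()
    collector.feed(markup)
    collector.close()
    return collector.events, collector.blank_runs


compact = "<html><body><img/><br/><a><img/></a></body></html>"
indented = "<html><body>\n<img>\n<br>\n<a><img></a>\n</body></html>"

compact_events, compact_blanks = structure(compact)
indented_events, indented_blanks = structure(indented)

# True when both inputs yield the same elements in the same order.
same_elements = compact_events == indented_events
print("same elements:", same_elements)
print("whitespace-only runs:", compact_blanks, "vs", indented_blanks)
```

This only establishes that the difference is confined to whitespace-only
text; how a particular user agent lays out such text is a separate question.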

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/



