[xslt] output escaping (again!)



Hi all;
  I've found the discussion from November, but I guess I still don't
get the justification for the way output escaping is being done.

As a concrete example, given the XML:
 <foo data="data:text/plain,Hello"/> 

and template:
 <xsl:template match="foo">
    <a href="{@data}">Ha</a>
 </xsl:template>

I get:
 <a href="data:text/plain%2CHello">Ha</a>

How can I get:
  <a href="data:text/plain,Hello">Ha</a>
?  
Is there some trick I'm missing?
It would seem that disable-output-escaping="yes" doesn't
apply to attributes.

*****

I'll grant that it is convenient 95% of the time to have the escaping 
done, but then how do you handle the other 5%?  Besides, it seems to 
me that it violates the specs.

The XSLT 1.0 spec, section 16.2 "HTML Output Method", says

   The html output method should escape non-ASCII characters in URI 
   attribute values using the method recommended in Section B.2.1 of 
   the HTML 4.0 Recommendation.

Obviously, we're not talking about non-ASCII here.  In the previous
discussion it was noted that "," is a reserved character.
However, the XSLT spec doesn't say that reserved characters should be
escaped.  Indeed, from RFC 2396,

   2.4.2. When to Escape and Unescape
   A URI is always in an "escaped" form, since escaping or unescaping a
   completed URI might change its semantics.  Normally, the only time
   escape encodings can safely be made is when the URI is being created
   from its component parts; each component may have its own set of
   characters that are reserved, so only the mechanism responsible for
   generating or interpreting that component can determine whether or
   not escaping a character will change its semantics. Likewise, a URI
   must be separated into its components before the escaped characters
   within those components can be safely decoded.

which suggests that XSLT processors should keep their paws off!
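
To make my reading of section 16.2 concrete, here is a rough Python
sketch of what "escape non-ASCII characters only" would look like.
The function name escape_non_ascii and the example.org URI are my own
inventions for illustration, and the UTF-8 plus %HH step is my
understanding of the method HTML 4.0 B.2.1 recommends; this is not how
libxslt is implemented.

```python
def escape_non_ascii(uri: str) -> str:
    """Percent-escape only non-ASCII characters, as UTF-8 bytes (%HH).

    Hypothetical helper: ASCII characters, reserved or not, pass through.
    """
    out = []
    for ch in uri:
        if ord(ch) < 128:
            out.append(ch)  # ASCII (including "," and ":") is left alone
        else:
            # Encode the character as UTF-8 and escape each byte.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


print(escape_non_ascii("data:text/plain,Hello"))         # data:text/plain,Hello
print(escape_non_ascii("http://example.org/caf\u00e9"))  # http://example.org/caf%C3%A9
```

Under that reading, the data: URI from my example would come through
untouched, since it is pure ASCII.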

Indeed, ":" is a reserved character too! Why didn't it get escaped in
the above example? Hmm, just for fun, I tried:

   <foo data="data:text/plain,Hello=Goodbye?Bah;"/>
and I get
   <a href="data:text/plain%2CHello=Goodbye?Bah;">Ha</a>
Yet, "=", "?" and ";" are also reserved characters!
Perhaps libxslt is attempting to escape reserved characters
when (it thinks) they yield an invalid URI?  If so, then 
its understanding of the data scheme is faulty (it is, indeed, a bit
odd), or alternatively it should do no escaping on schemes that it 
doesn't know.
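
For what it's worth, the output above is what you would get from an
escaper that treats ":", "/", "=", "?" and ";" as safe but not ",".
The Python sketch below reproduces it with urllib.parse.quote.  The
GUESSED_SAFE set is purely my guess from the observed output, not
something taken from the libxslt source, and the helper names are
hypothetical.

```python
from urllib.parse import quote

# Reserved characters as listed in RFC 2396, section 2.2.
RFC2396_RESERVED = ";/?:@&=+$,"

# Assumption: a safe set inferred only from the output I observed.
GUESSED_SAFE = ":/=?;"


def guessed_escape(uri: str) -> str:
    """Escape everything outside the unreserved and guessed-safe sets."""
    return quote(uri, safe=GUESSED_SAFE)


def reserved_chars_in(uri: str) -> list:
    """Return the RFC 2396 reserved characters that occur in uri."""
    return sorted(set(uri) & set(RFC2396_RESERVED))


uri = "data:text/plain,Hello=Goodbye?Bah;"
print(guessed_escape(uri))     # data:text/plain%2CHello=Goodbye?Bah;
print(reserved_chars_in(uri))  # [',', '/', ':', ';', '=', '?']
```

Six reserved characters occur in that URI, and only the comma gets
escaped, which is what makes the behaviour look inconsistent to me.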

In any case, thanks again for an (otherwise :>) marvelous piece of
software!

Bruce




