Re: [xslt] output escaping (again!)



On Sun, Mar 23, 2003 at 03:47:06PM -0500, Bruce Miller wrote:
> Daniel Veillard wrote:
> >   Anyway the whole URI handling stinks, there are up to 3 layers of 
> > encoding/decoding needed, and in general the rules are not followed
> > in data or application. Big bad mess... Why do you use ',' and ':' 
> > unescaped in URI while it's clearly stated it's not a good idea ?
> 
> But no! It's not that it's a bad idea to use them in general --- in fact,
> it's mandatory! But it _is_ a bad idea to use them in a way that
> doesn't conform to the url scheme in question.
> From RFC2396:
>    The "reserved" syntax class above refers to those characters that are
>    allowed within a URI, but which may not be allowed within a
>    particular component of the generic URI syntax; they are used as
>    delimiters of the components described in Section 3.
> 
> 
> Now, the data scheme may turn out to be a moot example since it doesn't
> seem to be as well supported as I'd like, but it's a simple one, so...
> (see ftp://ftp.isi.edu/in-notes/rfc2397.txt)
> 
> Consider the following URI:
>   (1)  data:text/plain,Hello%2C%20World
> This represents an `in-line' plain text document.
> If you click such an html link, like
>   <a href="data:text/plain,Hello%2C%20World">Hello, World</a>
> you should get a document, as plain text, containing only 
> the string "Hello, World". (Presumably it could also be used
> for inline images, etc)

  Never looked at rfc2397, that's scary ... smart but scary ...

> The following two similar URIs are _different_ (and maybe invalid):
>   (2)  data:text/plain%2CHello%2C%20World   (No <data> part)
>   (3)  data:text/plain,Hello, World     (space & 2nd comma not allowed)
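(A minimal sketch, in Python, of why the three forms above differ. It assumes a deliberately simplified reading of RFC 2397: the first literal ',' separates the media type from the data, and only then is the data percent-decoded. The function name split_data_uri is hypothetical and not part of libxml or any library.)

```python
from urllib.parse import unquote


def split_data_uri(uri):
    """Split a data: URI into (mediatype, decoded data), or raise ValueError.

    Simplified illustration only: ignores ;base64 and media type parameters.
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    if " " in uri:
        # A raw space is never legal in a URI; it must appear as %20.
        raise ValueError("raw space is not allowed in a URI")
    # Split on the first *literal* comma; an escaped %2C is not a delimiter.
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("no literal ',' delimiter, so no <data> part")
    return header, unquote(payload)


for candidate in (
    "data:text/plain,Hello%2C%20World",    # form (1)
    "data:text/plain%2CHello%2C%20World",  # form (2)
    "data:text/plain,Hello, World",        # form (3)
):
    try:
        print(candidate, "->", split_data_uri(candidate))
    except ValueError as err:
        print(candidate, "-> rejected:", err)
```

Only form (1) is accepted by this sketch: the unescaped comma acts as the scheme's delimiter, while the comma and the space inside the data are escaped.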


  Spaces are not allowed, not negotiable. Maybe in IRI, though 
I do think it's a bad idea and I'm not the only one!

> So, it would appear that by preventing me from producing (3),
> libxml is forcing me to get (2)  ---- and I can't get (1)!!!

  The base64 mechanism of the data scheme is precisely defined
to allow you to pass those things which are not allowed by the 
URI syntax. I don't think your example is valid :-)
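
(For comparison, a small Python sketch of the two ways the data scheme can carry "Hello, World": percent-escaped, or with the ;base64 marker from RFC 2397. The helper name data_uri and its parameters are made up for this illustration.)

```python
import base64
from urllib.parse import quote


def data_uri(text, mediatype="text/plain", use_base64=False):
    """Build a data: URI for text, either percent-escaped or base64-encoded."""
    if use_base64:
        payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
        return "data:%s;base64,%s" % (mediatype, payload)
    # safe="" makes quote() escape reserved characters such as ',' as well.
    return "data:%s,%s" % (mediatype, quote(text, safe=""))


print(data_uri("Hello, World"))
# data:text/plain,Hello%2C%20World
print(data_uri("Hello, World", use_base64=True))
# data:text/plain;base64,SGVsbG8sIFdvcmxk
```

Either output avoids raw spaces and keeps exactly one literal ',' as the delimiter between the media type and the data.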

> >   Your specific problem should be fixed in the next release. For the
> > general case I don't think it's doable.
> 
> I agree!! Since libxml can't read my mind, I'm responsible for telling
> it what I want.  If libxml 
>   (a) _only_ converted non-ASCII and
>   (b) supported str:encode-uri, (or fn:escape-uri from XPath2.0) 
> it would be simple (but verbose!):

>   <a href="{concat('data:text/plain,', str:encode-uri(@data, true()))}"> ...
> Then <foo data="Hello, World"/> would give me case (1).
> 
> Alas, str:encode-uri isn't implemented in libxml, nor anywhere else, according
> to http://www.exslt.org/.

  Wrong, it's available at the xsltproc level!

  paphio:~ -> xsltproc --dumpextensions | grep encode
  {http://exslt.org/strings}encode-uri

It's in libexslt.
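
(To make the effect of the second argument concrete, here is a rough Python analogue of str:encode-uri. It is a sketch of the behaviour as I understand the EXSLT description, not the libexslt implementation: with escape-reserved true, everything except the RFC 2396 unreserved characters is escaped; with false, the reserved characters are left alone too. The function name encode_uri and the two character sets are assumptions of this example.)

```python
from urllib.parse import quote

# Assumed character classes, following RFC 2396 (plus '[' and ']' from RFC 2732).
MARK = "-_.!~*'()"          # unreserved punctuation, never escaped
RESERVED = "$&+,/:;=?@[]"   # escaped only when escape_reserved is true


def encode_uri(text, escape_reserved):
    """Percent-escape text, optionally escaping reserved characters as well."""
    safe = MARK if escape_reserved else MARK + RESERVED
    return quote(text, safe=safe)


print("data:text/plain," + encode_uri("Hello, World", True))
# data:text/plain,Hello%2C%20World
print("data:text/plain," + encode_uri("Hello, World", False))
# data:text/plain,Hello,%20World
```

With escape_reserved true, the attribute value "Hello, World" produces case (1) from the message above; with false, the second comma stays literal.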

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


