Re: [xslt] uri escaping in html output, redux



On Tue, Jan 04, 2005 at 12:16:09PM +0200, Ivan Kurmanov wrote:
> Hi Sirs,
> 
> Before I start, let me thank you all for the great software
> you produce, and wish you all a happy new year.
> 
> 
> I found at least two discussions of this issue in the past:
> 
> Libxslt escaping urls when outputting HTML by Andy Hird
> http://mail.gnome.org/archives/xslt/2002-September/msg00006.html
> 
> Output escaping (again!) by Bruce Miller
> http://mail.gnome.org/archives/xslt/2003-March/msg00020.html
> 
> (These are the initial messages in their respective threads.)
> 
> 
> I'm writing this to confirm that:
> 
>   - developers understand that current libxslt behaviour in
>     this regard violates RFC 2396 "Uniform Resource
>     Identifiers (URI): Generic Syntax";
> 
>   - for compatibility, this is not going to be fixed.
> 
> Is this correct?

  That is not 100% clear to me. The whole URI handling framework is messy;
it is not just a matter of libxml2 or libxslt failing to comply with one
specification or another.
  This could be fixed, provided the existing libxml2 behaviour is left unchanged.

> Only the stylesheet author knows when, and which part of the URI,
> to escape.

  And so far nobody has done it right, because they have no tools for it.
There is no way to put knowledge of all URI schemes into the tools library
so that it knows what should be escaped and where, nor does XSLT 1.0 provide
any URI parsing facilities, which makes generic handling by the stylesheet
author impossible. If the author builds the URI with knowledge of the given
protocol, then he may be able to do the right thing, but in general that is
not the case. For example, it is generally impossible to know in advance, at
the stylesheet level, the base URI of the generated pages, so when writing
the stylesheet you do not even know whether the HTML resources produced will
be accessed with a file:// or an http:// URL, and you should not have to know.
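
To illustrate why correct escaping needs knowledge of the URI's structure, here is a minimal Python sketch (not libxslt code). The helper name `escape_http_uri` and the choice of "safe" characters are assumptions made for this example; it handles http-style URIs only and would double-escape input that already contains %-escapes.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def escape_http_uri(uri: str) -> str:
    """Escape path and query separately (hypothetical helper, http-style URIs only)."""
    parts = urlsplit(uri)
    # In the path, "/" is a delimiter and must stay literal.
    path = quote(parts.path, safe="/")
    # In the query, "+", "&" and "=" carry meaning and must stay literal.
    query = quote(parts.query, safe="+&=")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))

# Component-aware escaping keeps the "+" in the query part.
print(escape_http_uri("http://some.site/my script?argument+another"))
# Blanket escaping of the same query string turns "+" into "%2B".
print(quote("argument+another"))
```

The first call escapes only the space in the path, while the blanket `quote` call rewrites the "+" as well. A generic tool that does not know the scheme cannot tell which of the two treatments a given character needs.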

> My particular problem is that I can't make libxslt generate
> an <a href=""> construct with a "+" character in the href
> attribute value.  I need it for the query part of the URI:
> href="http://some.site/script?argument+another" and I get
> href="http://some.site/script?argument%2Banother" instead,
> which is a different URL.
> 
> If there is any kind of workaround to this problem, I'll be
> happy to know about it.  Thank you.
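
For readers wondering why the two forms are "a different URL", the following Python sketch shows how a receiver that follows the form-encoding convention would decode each query string. That the target script decodes its query this way is an assumption; the example does not involve libxslt.

```python
from urllib.parse import unquote_plus

# Under form-encoding rules a literal "+" stands for a space ...
plain = unquote_plus("argument+another")
# ... while "%2B" stands for an actual plus sign.
escaped = unquote_plus("argument%2Banother")

print(repr(plain))
print(repr(escaped))
```

The first string decodes to two words separated by a space and the second to a single token containing "+", so escaping the "+" changes what the script receives.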

  There is now a bit of framework which could lead to a solution:
make libxslt use the new xmlSavexxx APIs from <libxml/xmlsave.h>
and add an option to change the HTML URI escaping behaviour. That
could make it possible to get more in line with what the XSLT spec suggests
without breaking the existing behaviour of libxml2.

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

