[xml] let's push down the bug: xmlBuildURI



On Fri, Dec 13, 2002 at 04:44:06AM -0500, Daniel Veillard wrote:
I hope that my analysis is correct ...:

Yes and no, the analysis was correct but the heart of the bug was
(also) elsewhere.

The observation that unescaping without keeping the original URI is
lossy also applies to xmlBuildURI, which is responsible for building a
'final' URI from a URI that also has a base URI (BTW this is the case
that always applies when using XSLT's document function with only one
argument).

Well, I expect the URI built with xmlBuildURI from a correct absolute
URI to be exactly the same correct absolute URI. Unfortunately, this
is not the case, e.g.:

      uri                             xmlBuildURI(uri)

 http://foo.bar/cgi?a=b%3Dc        http://foo.bar/cgi?a=b=c

The reason is conceptually the same one we have seen with nanohttp: the
uri passes through xmlURIUnescapeString in the xmlParseURI* functions
(uri.c), losing all information about what was escaped and what was not.
Then, when you rebuild the uri in xmlSaveUri, you are unable to
distinguish between the first '=', which is the name-value separator,
and the second '=', which is part of the value (please note that you
can't try reparsing the URI either, because 'a=b=c' is ambiguous input:
it can be 'a%3Db=c' or 'a=b%3Dc').
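
To make the ambiguity concrete, here is a minimal standalone sketch. It
does not call libxml2: percent_decode is a hypothetical helper written
only for this illustration, and I am assuming it behaves like a plain
%XX unescaper. Two different escaped queries collapse to the same
unescaped string, so no later step can tell them apart.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper (NOT a libxml2 function): returns a malloc'ed copy
 * of `in` where every well-formed %XX sequence is replaced by the byte it
 * encodes. Malformed sequences are copied unchanged. Caller frees.
 */
char *percent_decode(const char *in)
{
    char *out, *p;
    int hi, lo;

    assert(in != NULL);
    out = malloc(strlen(in) + 1);   /* decoding never grows the string */
    if (out == NULL)
        return NULL;
    p = out;
    while (*in != '\0') {
        /* && short-circuits, so in[2] is only read when in[1] is a hex digit */
        if (in[0] == '%' &&
            (hi = hex_value((unsigned char) in[1])) >= 0 &&
            (lo = hex_value((unsigned char) in[2])) >= 0) {
            *p++ = (char) (hi * 16 + lo);
            in += 3;
        } else {
            *p++ = *in++;
        }
    }
    *p = '\0';
    return out;
}
```

Calling percent_decode on "a=b%3Dc" and on "a%3Db=c" gives "a=b=c" in
both cases, which is exactly why re-escaping after the fact cannot
restore the original uri.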

This bug is really more serious than the other one because it applies to
xmlBuildURI which is used widely in the code.

The only way to solve the problem here is really to keep somewhere
either information about what was escaped or the original uri itself,
but this requires API changes, at least in the xmlParseURI* and
xmlURIUnescapeString functions.
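
One possible shape for "keeping the original", sketched under my own
assumptions: the sketch_uri structure and sketch_save_uri below are
hypothetical names for illustration only, not the current xmlURI layout
nor a proposed final API. The idea is simply to store the query as
received next to the unescaped one, and to prefer the raw form when
serializing.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure for illustration; NOT the libxml2 xmlURI. */
typedef struct {
    const char *prefix;     /* everything before '?', e.g. "http://foo.bar/cgi" */
    const char *query;      /* unescaped query, convenient for consumers */
    const char *query_raw;  /* query exactly as received, or NULL if unknown */
} sketch_uri;

/*
 * Serialize into a malloc'ed string, preferring the raw query so that
 * escapes such as %3D survive the round trip. Caller frees.
 */
char *sketch_save_uri(const sketch_uri *u)
{
    const char *q;
    size_t n;
    char *out;

    assert(u != NULL && u->prefix != NULL);
    q = (u->query_raw != NULL) ? u->query_raw : u->query;

    if (q == NULL) {
        /* No query at all: return a copy of the prefix. */
        n = strlen(u->prefix) + 1;
        out = malloc(n);
        if (out != NULL)
            memcpy(out, u->prefix, n);
        return out;
    }

    n = strlen(u->prefix) + 1 + strlen(q) + 1;  /* prefix + '?' + query + NUL */
    out = malloc(n);
    if (out != NULL)
        snprintf(out, n, "%s?%s", u->prefix, q);
    return out;
}
```

With query_raw set to "a=b%3Dc" the rebuilt uri keeps the escape; with
query_raw left NULL the function falls back to the unescaped query and
shows the lossy behaviour described above. Whether an extra field or a
separate record of escaped positions fits better is an open design
question.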

May I proceed and try a fix even if it implies an API change?

TIA,
Cheers.

-- 
Stefano Zacchiroli  -  Undergraduate Student of CS @ Uni. Bologna, Italy
 zack {cs unibo it,debian.org,bononia.it} - http://www.bononia.it/zack/
 "I know you believe you understood what you think I said, but I am not
 sure you realize that what you heard is not what I meant!" -- G.Romney


