Re: [xml] patch: Functions to parse and create URI query strings



 Hi Rich,

On Tue, Apr 24, 2007 at 07:03:59PM +0100, Richard W.M. Jones wrote:
Attached is a patch against libxml2 svn which provides functions for 
parsing up and creating query strings, like:

  <someurl>?field1=value1&field2=value2

into [(field1, value1), (field2, value2)].

The semantics of query strings don't seem to be very well defined. 
Where there could be ambiguities, I have looked at what Perl CGI.pm does 
and implemented that.

  Hum, that's a fairly big patch: it adds many APIs, including a structure.
It's usually better to discuss such changes here first, before doing the
development.

I've added a fairly comprehensive test suite for the code.  With the 
patch we pass the old and new tests.

  Right, I appreciate the complete testing, that's very good!

The current uri->query field is always unescaped during parsing.  I have 
changed it so that it is always stored in its raw form.  This is because 
otherwise it's impossible to parse query strings such as: 
file:///tmp/test.html?test=%26&second=%26 which can be generated by web 
browsers.  If anyone was relying on the current semantics, then it seems 
to me that they cannot parse such query strings correctly.
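
To make the ambiguity concrete, here is a minimal standalone sketch in C. It
does not use libxml2; percent_decode() and count_fields() are hypothetical
helpers written only for this illustration. It shows that decoding the escapes
before splitting on '&' changes the number of fields in the example above.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libxml2 function): decode %XX escapes
 * in place and return the same buffer. */
static char *percent_decode(char *s)
{
    char *out = s;

    for (const char *in = s; *in != '\0'; in++) {
        if (in[0] == '%' && isxdigit((unsigned char)in[1]) &&
            isxdigit((unsigned char)in[2])) {
            char hex[3] = { in[1], in[2], '\0' };
            *out++ = (char)strtol(hex, NULL, 16);
            in += 2;            /* skip the two hex digits just consumed */
        } else {
            *out++ = *in;
        }
    }
    *out = '\0';
    return s;
}

/* Hypothetical helper: count the '&'-separated fields of a query string. */
static int count_fields(const char *query)
{
    int n = (*query != '\0');

    for (; *query != '\0'; query++)
        if (*query == '&')
            n++;
    return n;
}
```

On the raw string "test=%26&second=%26", count_fields() sees two fields. After
percent_decode() the buffer holds "test=&&second=&", and the escaped
ampersands can no longer be told apart from the separator.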

  Aside from the number of new APIs, that's my main issue with the
patch: you are changing the default behaviour of functionality that
has been exposed practically forever.
  I guess I would really prefer an approach which hooked into the 
URI parsing itself and filled in an extra list of values (or rather
an array of xmlChar *, alternating names and values) in the xmlURI
structure at the end. That would allow keeping the uri->query data
as it was, and still provide the functionality you suggest, based
on a preparsed xmlURIPtr. This would also avoid adding an extra list
type. I'm not sure about the ignore flag in that list, what is it 
used for?

I think I would seriously simplify the API that way:
  - if the equivalent of xmlURIQueryParse() fails, the array
    is not initialized in the xmlURI
  - this means we don't have to carry those error codes in all subsequent
    routines
  - xmlURIQueryExists is dropped, you get one main entry point
    const char *xmlURIQueryGetValue(xmlURIPtr uri, const char *name);
    you can add xmlURIQuerySetValue() similarly,
    and the serialization is an internal part of xmlSaveUri(), which builds
    from the array if it exists, and uses uri->query if not.
  - those simplified APIs would work immediately with the Python generator,
    which would not find any char ** that it can't handle automatically.
  - the APIs would also use only const char *, since string allocation
    would be bound to the xmlURI structure.
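
The shape of that API can be sketched in standalone C as follows. This is only
an illustration under stated assumptions, not libxml2 code: struct sketch_uri
stands in for xmlURI, and sketch_query_parse(), sketch_query_get_value() and
sketch_query_free() are hypothetical names. The setter and the serialization
side are left out.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for xmlURI: the query plus a preparsed array. */
struct sketch_uri {
    const char *query;  /* query string, left exactly as it was stored */
    char **params;      /* name0, value0, name1, value1, ...; NULL if unparsed */
    size_t nparams;     /* number of name/value pairs */
};

/* Copy s[0..len) into a new string, decoding %XX escapes; NULL on OOM. */
static char *decode_copy(const char *s, size_t len)
{
    char *out = malloc(len + 1);
    size_t o = 0;

    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        if (s[i] == '%' && i + 2 < len &&
            isxdigit((unsigned char)s[i + 1]) &&
            isxdigit((unsigned char)s[i + 2])) {
            char hex[3] = { s[i + 1], s[i + 2], '\0' };
            out[o++] = (char)strtol(hex, NULL, 16);
            i += 2;
        } else {
            out[o++] = s[i];
        }
    }
    out[o] = '\0';
    return out;
}

/* Release the array; all strings are owned by the structure. */
static void sketch_query_free(struct sketch_uri *uri)
{
    if (uri->params != NULL)
        for (size_t i = 0; i < 2 * uri->nparams; i++)
            free(uri->params[i]);
    free(uri->params);
    uri->params = NULL;
    uri->nparams = 0;
}

/* Fill uri->params from uri->query. Returns 0 on success; on failure
 * returns -1 and leaves params NULL, so later calls need no error code. */
static int sketch_query_parse(struct sketch_uri *uri)
{
    const char *p = uri->query;
    size_t npairs = 1;

    uri->params = NULL;
    uri->nparams = 0;
    if (p == NULL || *p == '\0')
        return -1;
    for (const char *c = p; *c != '\0'; c++)
        if (*c == '&')
            npairs++;                   /* upper bound on the pair count */
    uri->params = calloc(2 * npairs, sizeof(char *));
    if (uri->params == NULL)
        return -1;
    while (*p != '\0') {
        size_t flen = strcspn(p, "&"); /* split on '&' before decoding */
        if (flen > 0) {
            const char *eq = memchr(p, '=', flen);
            size_t nlen = (eq != NULL) ? (size_t)(eq - p) : flen;
            size_t i = 2 * uri->nparams;

            uri->params[i] = decode_copy(p, nlen);
            uri->params[i + 1] = (eq != NULL)
                ? decode_copy(eq + 1, flen - nlen - 1)
                : decode_copy("", 0);   /* field without '=': empty value */
            uri->nparams++;
            if (uri->params[i] == NULL || uri->params[i + 1] == NULL) {
                sketch_query_free(uri);
                return -1;
            }
        }
        p += flen;
        if (*p == '&')
            p++;
    }
    return 0;
}

/* Single entry point: value of the first field called name, or NULL. */
static const char *sketch_query_get_value(const struct sketch_uri *uri,
                                          const char *name)
{
    if (uri->params == NULL)
        return NULL;
    for (size_t i = 0; i < uri->nparams; i++)
        if (strcmp(uri->params[2 * i], name) == 0)
            return uri->params[2 * i + 1];
    return NULL;
}
```

A caller would fill in query, call sketch_query_parse() once, and then use only
sketch_query_get_value(). The returned pointers stay valid until
sketch_query_free(), which is why const char * is enough in the signatures.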

As you see, for the same kind of services I would derive very different APIs
myself, to stay in line with existing libxml2 practices. That's one of the
reasons I really prefer such work to be discussed here before jumping 
into writing code :-)

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/


