XML for SQL queries (plus DTD)


Here is my proposal to use XML to manipulate queries (instead of SQL which is
not portable across multiple DBMS). There is not yet any mention of LDAP server
because I still have not had time to figure out how it works.

Tell me what you think of it.





(                       )
( What it is for        )

The XML queries will made it possible for the client to send queries in a
generic form, without having to know if the underlying DBMS understands SQL92,
or other languages and extensions. 

The XML query will contain an understandable version of
what the client wants to do (an XML version of sentenses like "put some values
in that table", "create a new table", "set the reading permission for this user
on that table", "give me the list of shoes for which the sale price is higher
than the average sale price of the blue shoes",...)

When the client wants to send a query, it will have to build the query, then
get a string version of it (which can be saved, etc), send it to the provider,
and then discard it. There will be a Gda_Xmlquery object and a lib to help in
the job of creating (,...) the query.

Each provider will then have to get the XML string of the query, build back the
corresponding Gda_Xmlquery object, convert it to an understandable query for
the RDBMS it represents, send the query to the RDBMS, discard the Gda_Xmlquery
object and return the result of the query to the client.

The advantages of such a way of proceding are:
* no need to worry about the language specificities of the underlying RDBMS (no
more question like "do I need to format a date like MM-DD-YYYY or DD/MM/YY")
* it will be possible to save and restore in an easy way the Gda_Xmlquery
* I hope that in the future, the client will be able to ask the provider to
give him the XML Queries to save the structure and the data of a database (in
the way gda-xml lib does now, but with more accuracy).
* It will be possible to ask the provider for an equivalent, in the language
its RDBMS use, of the XML query to give some feedback to the user of the 
* if some RDBMS have some extensions, it will be possible to test if the
provider the client is connected to supports theses extensions (with the
supports() function, like for example nested select queries), and then create
the XML query accordingly.

(                       )
( Detailled description )

This query specification allows to ask for only the SELECT, DELETE, INSERT and
UPDATE operations for now, but has the possibility to be extended for CREATE

See at the end for a DTD proposal for validating the XML files following this 

A query is specified under the <QUERY> tag, and contains several subtrees:
* a TARGET TREE, tagged <TARGET>, that lists the tables or views that are 
  modified (or created) by the query (the user need to have a write permission
  to avery of these listed elements). Not all queries need to have that tree.
* a SOURCE TREE, tagged <SOURCES>, that lists the affected tables by the query,
  the ones where the query will have to look to get information (the user need
  to have a read access to ALL these tables). Not all queries need to have
  that tree.
* a VALUES TREE, tagged <VALUES>, which lists all the detailled parts of the
  query. For a SELECT query, these are the elements found after the SELECT. For
  an INSERT or UPDATE query, these are the elements specifying what gets 
  inserted or updated and their respective values.
* a QUAL TREE, tagged <QUALIFICATION>, which is the list of rules to qualify
  an entry in a table to submit it or not to the query (usually what appears
  after WHERE or HAVING clauses)

The <QUERY> tag has a mandatory property which is the kind of operation it
represents. It is the "op" property and can be either "SELECT", "INSERT",

Here is a detailled description of each tag and the possible included sub tags:

The <TARGET> and <SOURCES> trees are fairly simple.
That subtree can contain a table or view name. If a table 
is in the list, it can have an optional property named "temp" which can be set
to "yes" if the table will be created by the query and needs not to exist when
the session is closed (this is an optionnal feature).

For example:
   <TABLE temp="yes">mytable</TABLE>

That subtree can contain the same information as the TARGET one except for some
optionnal values. the <TABLE> and <VIEW> tags can have one optionnal property
named "alias", and one optionnal property named "id" (which must be unique to 
all such IDs) in the file. 

For example:
   <TABLE id="table01">mytable</TABLE>
   <VIEW alias="v">myview</VIEW>

This tree is a bit more complex. It can contain 4 types of tags which are
explained below:
** the <CONST> tag is used to specify a constant value and can have a property
   named "printname". For example:
   <CONST printname="Length">150</CONST>

** the <QUERY> tag is there to specify a SELECT query (can be usefull in the
   INSERT operation to insert into a table the result of another query)

** the <FIELD> tag will be the most often used, it specifies a table or view's
   field (and can give a value to it, depending on the kind of query, like in
   INSERT queries). The field needs to refer to a table or a view (which is
   done by setting to the table or view's ID the mandatory attribute "source".
   The name of the field is given with the mandatory property "name".
   There are other optionnal properties which are "printname", and "group" 
   (set to "yes" if the query will have to use this field to create groups, the
   GROUP BY clause)
   Here is an example, 150 is the value of the "myfield" field:
   <FIELD source="table01" name="myfield" printname="My Label">

** the <FUNC> tag is the most complex one. It allows to use functions, and is
   recursive (can contain other <FUNC>, <CONST> or <FIELD> tags) for complex
   functions. It has a mandatory property "name" which is the name of the
   function to call and an optionnal "printname" property.
   Here is an example:
   <FUNC printname="My Function Result" name="myfunc">
      <FIELD source="table01" name=myfield"/>
   will result in: myfunc(mytable.myfield, 150)

This subtree contains all qualification information. It is what we can find
just after the WHERE or HAVING clauses. The distinction between WHERE and
HAVING clauses will be made by the provider part.

Inside this subtree there can be the following subtrees:
** the <AND>, <OR>, <NOT>, etc to logically articulate the conditions.
   If two conditions are found without one of these tag to articulate them,
   then an AND will be assumed.
** the <EQUAL>, <INF>, <INFEQ>, <SUP>, <NONQUAL>, <LIKE>, <CONTAINS>, etc will
   contain the actual qualification clauses.

Here is a small example to figure how it works. Suppoe we want to state the
following conditions: 
"WHERE size>(SELECT AVG(size) from shoes) AND (color='blue' OR color='red')"
it would translate into:
      <FIELD source="t01" name="size"/>
      <QUERY op="SELECT">
	    <TABLE id="t01">shoes</TABLE>
	    <FUNC name="avg">
	       <FIELD source="t01" name="size"/>
         <FIELD source="t01" name="color"/>
         <FIELD source="t01" name="color"/>
It seems to be complicated, but it is not really, once you understand the

(     )
( DTD )





          id ID #IMPLIED
          temp(yes|no) #IMPLIED
          alias CDATA  #IMPLIED>

          id ID #IMPLIED
          alias CDATA #IMPLIED>

          printname #IMPLIED>

          source IDREF #REQUIRED
          name CDATA #REQUIRED
          printname #IMPLIED
          group(yes|no) #IMPLIED>

          name CDATA #REQUIRED
          printname #IMPLIED>


(                        )
( Implementation details )


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]