[xml] XPath extension API



Hi Daniel, all,

You (Daniel) said a few weeks ago that libxml needs a better XPath extension
API. Here are some suggestions.

 · add an xmlXPathWrapString() function

 · add convenient type-casting functions (essentially done):
     Y_t xmlXPathCastXToY (X_t);
   where Y (Y_t) is Boolean (int), Number (double) or String (xmlChar*)
   and X (X_t) Boolean (int), Number (double), String (xmlChar*),
   Node (xmlNodePtr) or NodeSet (xmlXPathNodeSetPtr)
   xmlXPathCastNodeToBoolean doesn't exist since XPath doesn't define
   such a conversion; maybe we should define it as the boolean value
   of the string value of the node.

     Y_t xmlXPathCastToY_t (xmlXPathObjectPtr obj);
   where Y (Y_t) is one of Boolean (int), Number (double) or String
   (xmlChar*)

   The following point implies a modification of the existing API.
   It's not needed, only better IMO.
   I'm not convinced xmlXPathConvertX should free the object. If the
   goal is converting the object in place, passing a pointer to a
   void function is better IMO:
     void xmlXPathConvertBoolean (xmlXPathObjectPtr *obj);
     void xmlXPathConvertNumber  (xmlXPathObjectPtr *obj);
     void xmlXPathConvertString  (xmlXPathObjectPtr *obj);
   Calls like «cur = xmlXPathConvertString(cur)» would become
   «xmlXPathConvertString(&cur);»
   Functions similar to the existing ones, but not freeing the object,
   should then exist.

 · add functions for retrieving the type of the stacked object (for
   conditional processing):
     xmlXPathObjectType
               xmlXPathStackType  (xmlXPathParserContextPtr ctxt);

 · add functions for popping values of desired type:
     int       xmlXPathPopBoolean (xmlXPathParserContextPtr ctxt);
     double    xmlXPathPopNumber  (xmlXPathParserContextPtr ctxt);
     xmlChar * xmlXPathPopString  (xmlXPathParserContextPtr ctxt);
     xmlXPathNodeSetPtr
               xmlXPathPopNodeSet (xmlXPathParserContextPtr ctxt);
     void *    xmlXPathPopExternal(xmlXPathParserContextPtr ctxt);

 · add functions for pushing values of desired type:
     void xmlXPathReturnBoolean (xmlXPathParserContextPtr ctxt,
                                 int val);
     void xmlXPathReturnNumber  (xmlXPathParserContextPtr ctxt,
                                 double val);
     void xmlXPathReturnString  (xmlXPathParserContextPtr ctxt,
                                 xmlChar * val);
     void xmlXPathReturnNodeSet (xmlXPathParserContextPtr ctxt,
                                 xmlXPathNodeSetPtr ns);
     void xmlXPathReturnExternal(xmlXPathParserContextPtr ctxt,
                                 void * val);
   the last three functions wrap the values into objects

 · add functions for retrieving the context node and the document
   (and possibly other context info like proximity position and
   context size):
     xmlNodePtr xmlXPathGetContextNode (xmlXPathParserContextPtr ctxt);
     xmlDocPtr  xmlXPathGetDocument (xmlXPathParserContextPtr ctxt);

 · add functions for error raising:
     void xmlXPathSetError (xmlXPathParserContextPtr ctxt,
                            xmlXPathError err);
   calls to XP_ERROR(err) would become:
     xmlXPathSetError(ctxt, err);
     return;
   This function could be designed to accept an error message (calling
   xmlGenericError behind the scenes).

   Possibly add convenience functions for specific errors:
     void xmlXPathSetArityError (xmlXPathParserContextPtr ctxt);
     void xmlXPathSetTypeError  (xmlXPathParserContextPtr ctxt);

   Convenience macros like CHECK_ARITY shouldn't be public, IMO.

   There is no need, IMO, to pass __FILE__ and __LINE__ to
   xmlXPatherror. The name of the function is more useful and should be
   supplied by the user (extension implementor).

 · A tricky point, maybe not needed: allow extensions to have their
   own context data.

Of course, some of these functions can in fact be macros.
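
To make the in-place conversion point concrete, here is a minimal sketch of
the pointer-to-pointer calling style. It is a toy model, not libxml code:
CvtObject, cvtNewNumber and cvtConvertBoolean are hypothetical stand-ins for
xmlXPathObject and the proposed xmlXPathConvertBoolean, and only the
Number-to-Boolean case is modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for xmlXPathObject (NOT the real libxml type). */
typedef enum { CVT_BOOLEAN, CVT_NUMBER } CvtType;

typedef struct {
    CvtType type;
    int     boolval;
    double  floatval;
} CvtObject;

/* Allocate a Number object; returns NULL if allocation fails. */
CvtObject *cvtNewNumber(double val) {
    CvtObject *o = calloc(1, sizeof(*o));
    if (o != NULL) {
        o->type = CVT_NUMBER;
        o->floatval = val;
    }
    return o;
}

/*
 * Proposed style: the caller passes the address of its pointer and the
 * function replaces the object behind it. The caller never has to
 * reassign a return value, and ownership stays with the caller.
 */
void cvtConvertBoolean(CvtObject **obj) {
    CvtObject *old, *ret;
    double v;

    if (obj == NULL || *obj == NULL)
        return;
    old = *obj;
    if (old->type == CVT_BOOLEAN)
        return;                      /* already the right type */

    ret = calloc(1, sizeof(*ret));
    if (ret == NULL)
        return;                      /* leave *obj untouched on failure */

    v = old->floatval;
    ret->type = CVT_BOOLEAN;
    /* XPath rule: a number is true unless it is zero or NaN (v != v). */
    ret->boolval = (v != 0.0 && v == v) ? 1 : 0;

    free(old);
    *obj = ret;                      /* caller's pointer now sees the result */
}
```

With this shape, «cur = cvtConvertBoolean(cur)» becomes
«cvtConvertBoolean(&cur);», and a failed conversion cannot leave the caller
holding a dangling pointer.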

The idea is to be independent from the implementation (this will allow us
to change it (better separation between ParserContext and Context) without
breaking source and maybe even binary compatibility) and to avoid juggling
with xmlXPathObject as much as possible.
Built-in functions could still use bare access to the context and stack for
performance reasons if and when needed (as in xmlXPathAddValues and similar
functions).
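
As an illustration of an extension function that never touches an object
directly, here is a minimal sketch built on a toy stack. Everything in it is
hypothetical: ToyContext, toyPopNumber, toyReturnNumber and toySetError only
mimic the shape of the proposed xmlXPathPopNumber, xmlXPathReturnNumber and
xmlXPathSetError, and the error codes are made up for the example.

```c
#include <assert.h>

/* Hypothetical stand-ins for the XPath object and parser context. */
typedef enum { TOY_BOOLEAN, TOY_NUMBER } ToyType;

typedef struct {
    ToyType type;
    int     boolval;
    double  floatval;
} ToyObject;

#define TOY_STACK_MAX 16

enum { TOY_OK = 0, TOY_STACK_ERROR, TOY_ARITY_ERROR };

typedef struct {
    ToyObject stack[TOY_STACK_MAX];
    int       depth;
    int       error;                 /* TOY_OK when nothing went wrong */
} ToyContext;

/* Models the proposed SetError: record the error, the caller then returns. */
void toySetError(ToyContext *ctxt, int err) {
    ctxt->error = err;
}

/* Models ReturnNumber: wrap a double into an object and push it. */
void toyReturnNumber(ToyContext *ctxt, double val) {
    if (ctxt->depth >= TOY_STACK_MAX) {
        toySetError(ctxt, TOY_STACK_ERROR);
        return;
    }
    ctxt->stack[ctxt->depth].type = TOY_NUMBER;
    ctxt->stack[ctxt->depth].boolval = 0;
    ctxt->stack[ctxt->depth].floatval = val;
    ctxt->depth++;
}

/* Models PopNumber: pop the top object and cast it to a double. */
double toyPopNumber(ToyContext *ctxt) {
    if (ctxt->depth <= 0) {
        toySetError(ctxt, TOY_STACK_ERROR);
        return 0.0;
    }
    ToyObject o = ctxt->stack[--ctxt->depth];
    if (o.type == TOY_BOOLEAN)
        return o.boolval ? 1.0 : 0.0;
    return o.floatval;
}

/* Extension function difference(a, b): no object handling in sight. */
void toyDifferenceFunction(ToyContext *ctxt, int nargs) {
    if (nargs != 2) {
        toySetError(ctxt, TOY_ARITY_ERROR);
        return;
    }
    double b = toyPopNumber(ctxt);   /* arguments come off in reverse order */
    double a = toyPopNumber(ctxt);
    if (ctxt->error != TOY_OK)
        return;
    toyReturnNumber(ctxt, a - b);
}
```

The extension body only sees int, double and the context pointer, so the
layout of the object and of the stack can change without touching it.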

For performance issues at xmlXPathObject allocation/deallocation level, a
solution could be to manage a stack of "empty" objects. When starting
evaluation, empty objects are allocated in this stack and xmlXPathNew*
functions just pick up one of them and fill it, xmlXPathFreeObject "purges"
the object and pushes it back to the stack.
Something similar to GMemChunks from the GLib
<http://developer.gnome.org/doc/API/glib/glib-memory-chunks.html>
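
A minimal sketch of such a stack of "empty" objects, written as a free list.
The names (PoolObj, ObjPool, pool_init, pool_new_number, pool_release,
pool_destroy) are hypothetical and the object is a stripped-down stand-in for
xmlXPathObject; falling back to the heap when the pool runs dry is an
assumption of the sketch, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for xmlXPathObject (NOT the real libxml type). */
typedef enum { POOL_UNDEFINED, POOL_NUMBER } PoolObjType;

typedef struct PoolObj {
    PoolObjType     type;
    double          floatval;
    struct PoolObj *next;     /* used only while the object is in the pool */
} PoolObj;

/* The stack of "empty" objects, kept as a singly linked free list. */
typedef struct {
    PoolObj *free_list;
} ObjPool;

/* Release every object still sitting in the pool. */
void pool_destroy(ObjPool *pool) {
    while (pool->free_list != NULL) {
        PoolObj *o = pool->free_list;
        pool->free_list = o->next;
        free(o);
    }
}

/* Pre-allocate n empty objects when evaluation starts.
   Returns 0 on success, -1 (with the pool emptied) on allocation failure. */
int pool_init(ObjPool *pool, size_t n) {
    pool->free_list = NULL;
    for (size_t i = 0; i < n; i++) {
        PoolObj *o = calloc(1, sizeof(*o));
        if (o == NULL) {
            pool_destroy(pool);
            return -1;
        }
        o->next = pool->free_list;
        pool->free_list = o;
    }
    return 0;
}

/* Plays the role of an xmlXPathNew* function: pick an empty object, fill it. */
PoolObj *pool_new_number(ObjPool *pool, double val) {
    PoolObj *o = pool->free_list;
    if (o != NULL) {
        pool->free_list = o->next;
    } else {
        o = calloc(1, sizeof(*o));   /* pool exhausted: assumed heap fallback */
        if (o == NULL)
            return NULL;
    }
    o->type = POOL_NUMBER;
    o->floatval = val;
    o->next = NULL;
    return o;
}

/* Plays the role of xmlXPathFreeObject: purge the object, push it back. */
void pool_release(ObjPool *pool, PoolObj *o) {
    if (o == NULL)
        return;
    o->type = POOL_UNDEFINED;
    o->floatval = 0.0;
    o->next = pool->free_list;
    pool->free_list = o;
}
```

After the initial pool_init, a new/release pair costs two pointer updates
instead of a malloc/free round trip.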

Very soon, I'll finish implementing the second point (in fact this is done,
except for the last "sub-point", which implies modifying the current
behaviour of some public functions) and refactor the module to improve
performance using these new functions.

Tom.



