[libxslt.wiki] Create Writing extensions



commit 2b1233bb5d8cdb730e94d85038d0cee05f2ec1f9
Author: Nick Wellnhofer <wellnhofer aevum de>
Date:   Sat Feb 12 17:33:26 2022 +0000

    Create Writing extensions

 Writing-extensions.md | 319 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 319 insertions(+)
---
diff --git a/Writing-extensions.md b/Writing-extensions.md
new file mode 100644
index 0000000..bac50ee
--- /dev/null
+++ b/Writing-extensions.md
@@ -0,0 +1,319 @@
+### Introduction
+
+This document describes the work needed to write extensions to the standard XSLT library for use with 
[libxslt](http://xmlsoft.org/XSLT/), the [XSLT](http://www.w3.org/TR/xslt) C library developed for the 
[GNOME](http://www.gnome.org/) project.
+
+Before reading this document, it is highly recommended to become familiar with [the libxslt internals](http://xmlsoft.org/XSLT/internals.html).
+
+Note: this documentation is by definition incomplete and I am not good at spelling or grammar, so patches and suggestions are [really welcome](mailto:veillard redhat com).
+
+### Basics
+
+The [XSLT specification](http://www.w3.org/TR/xslt) provides two [ways to extend an XSLT 
engine](http://www.w3.org/TR/xslt):
+
+* providing [new extension functions](http://www.w3.org/TR/xslt) which can be called from XPath expressions
+* providing [new extension elements](http://www.w3.org/TR/xslt) which can be inserted in stylesheets
+
+In both cases the extensions need to be associated with a new namespace, i.e. a URI used as the name for the extension's namespace (there is no need to have a resource at that URI for this to work).
+
+libxslt provides a few extensions itself, either in the libxslt namespace 
"<http://xmlsoft.org/XSLT/namespace>" or in namespaces for other well known extensions provided by other XSLT 
processors like Saxon, Xalan or XT.
+
+### Extension modules
+
+Since extensions are bound to a namespace name, a set of extensions coming from a given source usually shares the same namespace name, which in practice defines a group of extensions providing elements, functions or both. libxslt considers such a group an "extension module", and most of the APIs work at the module level.
+
+Registration of new functions or elements is bound to the activation of the module. This is currently done by declaring the namespace as an extension namespace with the `extension-element-prefixes` attribute on the [`xsl:stylesheet`](http://www.w3.org/TR/xslt) element.
+
+An extension module is defined by 3 objects:
+
+* the namespace name associated
+* an initialization function
+* a shutdown function
+
+### Registering a module
+
+Currently a libxslt module has to be compiled into the application using libxslt. There is no code to dynamically load shared libraries associated with a namespace (this may be added but is likely to become a portability nightmare).
+
+The current way to register a module is to link the code implementing it with the application and to call a 
registration function:
+
+```
+int xsltRegisterExtModule(const xmlChar *URI,
+                          xsltExtInitFunction initFunc,
+                          xsltExtShutdownFunction shutdownFunc);
+```
+
+The associated header is included with:
+
+```
+#include <libxslt/extensions.h>
+```
+
+which also defines the types for the initialization and shutdown functions.
+
+### Loading a module
+
+Once the module URI has been registered, the module is initialized whenever the XSLT processor detects that a given stylesheet needs the functionality of that extension module.
+
+The xsltExtInitFunction type defines the interface for an initialization function:
+
+```
+/**
+ * xsltExtInitFunction:
+ * @ctxt:  an XSLT transformation context
+ * @URI:  the namespace URI for the extension
+ *
+ * A function called at initialization time of an XSLT
+ * extension module
+ *
+ * Returns a pointer to the module specific data for this
+ * transformation
+ */
+typedef void *(*xsltExtInitFunction)(xsltTransformContextPtr ctxt,
+                                     const xmlChar *URI);
+```
+
+There are 3 things to notice:
+
+* The function gets passed the namespace name URI as an argument. This allows a single function to provide 
the initialization for multiple logical modules.
+* It also gets passed a transformation context. The initialization is done at run time before any processing 
occurs on the stylesheet but it will be invoked separately each time for each transformation.
+* It returns a pointer. This can be used to store module specific information which can be retrieved later 
when a function or an element from the extension is used. An obvious example is a connection to a database 
which should be kept and reused along with the transformation. NULL is a perfectly valid return; there is no 
way to indicate a failure at this level.
+
+What this function is expected to do is:
+
+* prepare the context for this module (like opening the database connection)
+* register the extensions specific to this module
+
+### Registering an extension function
+
+There is a single call to do this registration:
+
+```
+int xsltRegisterExtFunction(xsltTransformContextPtr ctxt,
+                            const xmlChar *name,
+                            const xmlChar *URI,
+                            xmlXPathEvalFunc function);
+```
+
+The registration is bound to a single transformation instance referred to by `ctxt`. `name` is the UTF-8 encoded NCName of the function, and `URI` is the namespace name for the extension. No checking is done: a module could register functions or elements from a different namespace, but this is not recommended.
+
+### Implementing an extension function
+
+The implementation of the function must have the signature of a libxml XPath function:
+
+```
+/**
+ * xmlXPathEvalFunc:
+ * @ctxt: an XPath parser context
+ * @nargs: the number of arguments passed to the function
+ *
+ * an XPath evaluation function, the parameters are on the
+ * XPath context stack
+ */
+
+typedef void (*xmlXPathEvalFunc)(xmlXPathParserContextPtr ctxt,
+                                 int nargs);
+```
+
+The context passed to an XPath function is not an XSLT context but an [XPath 
context](http://xmlsoft.org/XSLT/internals.html#XPath1). However it is possible to find one from the other:
+
+* The function xsltXPathGetTransformContext provides this lookup facility:
+
+  ```
+  xsltTransformContextPtr
+           xsltXPathGetTransformContext
+                            (xmlXPathParserContextPtr ctxt);
+  ```
+* The `xmlXPathContextPtr` associated to an `xsltTransformContext` is stored in the `xpathCtxt` field.
+
+The first thing an extension function may want to do is to check the arguments passed on the stack. The `nargs` parameter tells how many of them were provided in the XPath expression, and `valuePop` extracts them from the XPath stack:
+
+```
+#include <libxml/xpath.h>
+#include <libxml/xpathInternals.h>
+
+xmlXPathObjectPtr obj = valuePop(ctxt); 
+```
+
+Note that `ctxt` is the XPath context not the XSLT one. It is then possible to examine the content of the 
value. Check [the description of XPath objects](http://xmlsoft.org/XSLT/internals.html#Descriptio) if 
necessary. The following is a common sequence checking whether the argument passed is a string and converting 
it using the built-in XPath `string()` function if this is not the case:
+
+```
+if (obj->type != XPATH_STRING) {
+    valuePush(ctxt, obj);
+    xmlXPathStringFunction(ctxt, 1);
+    obj = valuePop(ctxt);
+}
+```
+
+Most common XPath functions are available directly at the C level and are exported either in 
`<libxml/xpath.h>` or in `<libxml/xpathInternals.h>`.
+
+The extension function may also need to retrieve the data associated with this module instance (the database connection in the previous example). This can be done using `xsltGetExtData`:
+
+```
+void * xsltGetExtData(xsltTransformContextPtr ctxt,
+                      const xmlChar *URI);
+```
+
+Again the URI to be provided is the one which was used when registering the module.
+
+Once the function finishes, don't forget to:
+
+* push the return value on the stack using `valuePush(ctxt, obj)`
+* deallocate the parameters passed to the function using `xmlXPathFreeObject(obj)`
+
+### Examples for extension functions
+
+The module libxslt/functions.c contains the sources of the XSLT built-in functions, including document(), 
key(), generate-id(), etc. as well as a full example module at the end. Here is the test function 
implementation for the libxslt:test function:
+
+```
+/**
+ * xsltExtFunctionTest:
+ * @ctxt:  the XPath Parser context
+ * @nargs:  the number of arguments
+ *
+ * function libxslt:test() for testing the extensions support.
+ */
+static void
+xsltExtFunctionTest(xmlXPathParserContextPtr ctxt, int nargs)
+{
+    xsltTransformContextPtr tctxt;
+    void *data;
+
+    tctxt = xsltXPathGetTransformContext(ctxt);
+    if (tctxt == NULL) {
+        xsltGenericError(xsltGenericErrorContext,
+            "xsltExtFunctionTest: failed to get the transformation context\n");
+        return;
+    }
+    data = xsltGetExtData(tctxt, (const xmlChar *) XSLT_DEFAULT_URL);
+    if (data == NULL) {
+        xsltGenericError(xsltGenericErrorContext,
+            "xsltExtFunctionTest: failed to get module data\n");
+        return;
+    }
+#ifdef WITH_XSLT_DEBUG_FUNCTION
+    xsltGenericDebug(xsltGenericDebugContext,
+                     "libxslt:test() called with %d args\n", nargs);
+#endif
+}
+```
+
+### Registering an extension element
+
+There is a single call to do this registration:
+
+```
+int xsltRegisterExtElement(xsltTransformContextPtr ctxt,
+                           const xmlChar *name,
+                           const xmlChar *URI,
+                           xsltTransformFunction function);
+```
+
+It is similar to the mechanism used to register an extension function, except that the signature of an 
extension element implementation is different.
+
+The registration is bound to a single transformation instance referred to by `ctxt`. `name` is the UTF-8 encoded NCName of the element, and `URI` is the namespace name for the extension. No checking is done: a module could register elements for a different namespace, but this is not recommended.
+
+### Implementing an extension element
+
+The implementation of the element must have the signature of an XSLT transformation function:
+
+```
+/** 
+ * xsltTransformFunction: 
+ * @ctxt: the XSLT transformation context
+ * @node: the input node
+ * @inst: the stylesheet node 
+ * @comp: the compiled information from the stylesheet 
+ * 
+ * signature of the function associated to elements part of the
+ * stylesheet language like xsl:if or xsl:apply-templates.
+ */ 
+typedef void (*xsltTransformFunction)
+                          (xsltTransformContextPtr ctxt,
+                           xmlNodePtr node,
+                           xmlNodePtr inst,
+                           xsltStylePreCompPtr comp);
+```
+
+The first argument is the XSLT transformation context. The second and third arguments are `xmlNodePtr` values, i.e. the internal memory [representation of XML nodes](http://xmlsoft.org/XSLT/internals.html#libxml). They are respectively `node`, from the input document being transformed by the stylesheet, and `inst`, the extension element in the stylesheet. The last argument is `comp`, a pointer to a precompiled representation of `inst`, but for an extension element this value is usually `NULL` by default (it could be added and associated to the instruction in `inst->_private`).
+
+The same functions are available from a function implementing an extension element as in an extension 
function, including `xsltGetExtData()`.
+
+Since the goal of an extension element is usually to enrich the generated output, it is expected to grow the currently generated output tree. This can be done through `ctxt->insert`, which is the current libxml node being generated. (Note that this can also be the intermediate value tree being built, for example to initialize a variable; the processing should be similar.) The functions for libxml tree manipulation from [<libxml/tree.h>](http://xmlsoft.org/html/libxml-tree.html) can be employed to extend or modify the tree, but it is required to preserve the insertion node and its ancestors, since there are existing pointers to those elements still in use in the XSLT template execution stack.
+
+### Example for extension elements
+
+The module libxslt/transform.c contains the sources of the XSLT built-in elements, including xsl:element, xsl:attribute, xsl:if, etc. There is a small but full example in functions.c providing the implementation for the libxslt:test element, which outputs a comment in the result tree:
+
+```
+/**
+ * xsltExtElementTest:
+ * @ctxt:  an XSLT processing context
+ * @node:  The current node
+ * @inst:  the instruction in the stylesheet
+ * @comp:  precomputed information
+ *
+ * Process a libxslt:test node
+ */
+static void
+xsltExtElementTest(xsltTransformContextPtr ctxt, xmlNodePtr node,
+                   xmlNodePtr inst,
+                   xsltStylePreCompPtr comp)
+{
+    xmlNodePtr comment;
+
+    if (ctxt == NULL) {
+        xsltGenericError(xsltGenericErrorContext,
+                         "xsltExtElementTest: no transformation context\n");
+        return;
+    }
+    if (node == NULL) {
+        xsltGenericError(xsltGenericErrorContext,
+                         "xsltExtElementTest: no current node\n");
+        return;
+    }
+    if (inst == NULL) {
+        xsltGenericError(xsltGenericErrorContext,
+                         "xsltExtElementTest: no instruction\n");
+        return;
+    }
+    if (ctxt->insert == NULL) {
+        xsltGenericError(xsltGenericErrorContext,
+                         "xsltExtElementTest: no insertion point\n");
+        return;
+    }
+    comment =
+        xmlNewComment((const xmlChar *)
+                      "libxslt:test element test worked");
+    xmlAddChild(ctxt->insert, comment);
+}
+```
+
+### The shutdown of a module
+
+When the XSLT processor ends a transformation, the shutdown function (if it exists) for each of the modules 
initialized is called. The xsltExtShutdownFunction type defines the interface for a shutdown function:
+
+```
+/**
+ * xsltExtShutdownFunction:
+ * @ctxt:  an XSLT transformation context
+ * @URI:  the namespace URI for the extension
+ * @data:  the data associated to this module
+ *
+ * A function called at shutdown time of an XSLT extension module
+ */
+typedef void (*xsltExtShutdownFunction) (xsltTransformContextPtr ctxt,
+                                         const xmlChar *URI,
+                                         void *data);
+```
+
+This is very similar to a module initialization function, except that a third argument is passed: the value that was returned by the initialization function. This allows the routine to deallocate resources held by the module, for example closing the connection to the database, to keep the same example.
+
+### Future work
+
+Well, some of the pieces missing:
+
+* a way to load shared libraries to instantiate new modules
+* a better detection of extension functions usage and their registration without having to use the extension 
prefix which ought to be reserved to element extensions.
+* more examples
+* implementations of the [EXSLT](http://www.exslt.org/) common extension libraries; Thomas Broyer nearly finished implementing them.
+
+Daniel Veillard
\ No newline at end of file

