[xml] XPath's context doc


I'm currently evaluating the nature of the context doc
in LibXSLT and Libxml2's xpath.c module. In the specs
(XPath 1.0 and XSLT 1.0) the context doc is not mentioned,
since, I think, it is assumed to be available via the
context node. Libxml2 needs this extra information for
some internal reasons.

I'd be glad for comments/corrections on the following
observations. I'm aiming at getting rid of the need to keep
the context doc up-to-date and changing this to a state where
the context doc is only evaluated when needed.

The XPath machinery tries to keep this context doc up-to-date;
thus e.g. in the predicate-related code, it is updated for
every current context node:

  if ((contextNode->type != XML_NAMESPACE_DECL) &&
    (contextNode->doc != NULL))
    xpctxt->doc = contextNode->doc;

The context doc is used in e.g. some axis-traversal functions
to check whether a node is the document node (which could also
be done by checking the @type field).

The context doc is also used to select the document node:

  xmlXPathRoot(xmlXPathParserContextPtr ctxt) {
    if ((ctxt == NULL) || (ctxt->context == NULL))
      return;
    /* The context doc becomes the new context node. */
    ctxt->context->node = (xmlNodePtr) ctxt->context->doc;
    valuePush(ctxt, xmlXPathCacheNewNodeSet(ctxt->context,
      ctxt->context->node));
  }

In the XPath module, there's a lot of code storing and restoring
the context doc when entering various operations. I wonder if
this is needed.

Some questions:

In some axis-traversal functions (e.g. in xmlXPathNextParent())
there's code like this:

  if (ctxt->context->node->parent == NULL)
    return((xmlNodePtr) ctxt->context->doc);

In which cases is the @parent of a node expected to be NULL
(besides the document node)?

Could code like the following
(e.g. in xmlXPathNextPrecedingSibling())...

  if (cur == (xmlNodePtr) ctxt->context->doc)

... be replaced with:

  if (cur == (xmlNodePtr) cur->doc)

This assumes that a check for XML_NAMESPACE_DECL is done
beforehand, since xmlNs has a different structure
(and no @doc field).
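To make the proposed check concrete, here is a minimal sketch of it. It does not use the real libxml2 structs: Node, NodeType and is_document_node() are hypothetical stand-ins, and the only property borrowed from libxml2 is that a document node's doc field points back at the document itself.

```c
#include <stddef.h>

/* Hypothetical, simplified node model; NOT the libxml2 structs.
 * A document node's doc field points back at itself. */
typedef enum {
    NODE_ELEMENT,
    NODE_DOCUMENT,
    NODE_NAMESPACE_DECL
} NodeType;

typedef struct Node Node;
struct Node {
    NodeType type;
    Node *parent;
    Node *doc;   /* treated as unreadable for NODE_NAMESPACE_DECL */
};

/* Returns 1 if cur is a document node, using only cur itself
 * and no context doc. */
int is_document_node(const Node *cur)
{
    if (cur == NULL)
        return 0;
    /* Guard first: a namespace node has no usable doc field. */
    if (cur->type == NODE_NAMESPACE_DECL)
        return 0;
    return cur == cur->doc;
}
```

The point of the ordering is that the type check runs before anything touches the doc field, which is the precondition mentioned above.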

Could xmlXPathRoot() be changed to select the doc of the
context node (context->node->doc), and only if the context
node is NULL (maybe because it is an initial evaluation), 
use the context->doc?
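A rough sketch of that root selection, again on a hypothetical simplified model rather than the real libxml2 types (Node, XPathContext and select_root() are made-up names). Falling back to the context doc for namespace nodes and for nodes whose doc is NULL is an extra assumption of this sketch, not something the current code does.

```c
#include <stddef.h>

/* Hypothetical, simplified model; NOT the libxml2 structs. */
typedef enum {
    NODE_ELEMENT,
    NODE_DOCUMENT,
    NODE_NAMESPACE_DECL
} NodeType;

typedef struct Node Node;
struct Node {
    NodeType type;
    Node *parent;
    Node *doc;
};

typedef struct {
    Node *node;  /* current context node, may be NULL initially */
    Node *doc;   /* stored context doc, used only as a fallback */
} XPathContext;

/* Picks the node that "/" should select. */
Node *select_root(const XPathContext *ctx)
{
    if (ctx == NULL)
        return NULL;
    /* Prefer the doc of the context node when it can be read. */
    if (ctx->node != NULL &&
        ctx->node->type != NODE_NAMESPACE_DECL &&
        ctx->node->doc != NULL)
        return ctx->node->doc;
    /* Assumption: otherwise fall back to the stored context doc. */
    return ctx->doc;
}
```

With this shape, a stale stored doc no longer matters as long as the context node itself knows its doc.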

A LibXSLT related reason for the context doc is the key()
function and xsl:key declaration. The keys are stored in
xsltDocument, which is linked to the relevant doc, either
stored directly in the doc->_private field, or linked via a map.
In order to compute and lookup the key values, the doc is obviously
needed; again the context doc in the XPath module is used for this.
For this reason also LibXSLT tries to keep the XPath context doc
up-to-date (see xsltForEach(), transform.c).

A scenario for this would be:
  <xsl:key name="my-key" match="foo" use="."/>

  <xsl:template match="/">
    <xsl:for-each select="(document('doc-1.xml')/* |
      <xsl:value-of select="key('my-key', 'bar')/@origin"/>

I changed the LibXSLT code on my side to avoid all of this. The
relevant doc is queried based on the context node on demand in
the key() function. The regression tests run fine with this.
The only concern I have is that a node might have node->doc == NULL
when the doc is about to be accessed.

The current code tries, for a given set of nodes, to adjust the context
doc for each of the nodes; if one of those nodes has a @doc of NULL,
the adjustment is skipped. This means that the context doc is either
not set at all (if all nodes have a doc of NULL), or set to an
incorrect doc: if a node from a different context has a doc of NULL
and a previous node had a doc, the previous node's doc stays in place
(I hope this makes sense).

With the changed code, one can still run into the scenario
where a doc is not available (an error needs to be raised then), but
it eliminates the scenario where an incorrect doc is used for the key

