DV said I should email the list, so here goes.

I'm trying to create and register an XPath function, re_contains, that works the same way as the normal contains function except that it accepts a regular expression as its second argument. I have two problems: one with function arguments, the other with return values. Here's the code:

    #!/usr/bin/python

    import libxml2
    import sys
    import re

    def re_contains(context, s, p):
        print "s:", s, ",", len(s), ", p:", p
        for ss in s:
            print "ss: ", ss
            print dir(ss)
        if re.search(p, s):
            return 1
        return 0

    def find_matches(pattern, files):
        matches = []
        for f in files:
            doc = libxml2.parseFile(f)
            ctxt = doc.xpathNewContext()
            libxml2.registerXPathFunction(ctxt._o, "re_contains", None, re_contains)
            res = ctxt.xpathEval(pattern)
            if res:
                matches.append((f, res))
        return matches

    if __name__ == '__main__':
        pattern = sys.argv[1]
        files = sys.argv[2:]
        matches = find_matches(pattern, files)
        for file, nodes in matches:
            print "---", file
            for node in nodes:
                print node.serialize()
                print "--"

The script works a bit like grep: it takes an XPath expression as its first argument, followed by a list of files, and prints the matching parts of each file.

When I invoke it with an XPath expression like //foo/bar[re_contains(.,'as?df')] to search the contents of element bar, the value assigned to s in re_contains is a PyCObject that looks like a list with one item. The item is another PyCObject; taking dir() of it returns an empty list.

    $ cat test.xml
    <foo><bar>baz</bar></foo>
    $ ./xpathgrep.py "//bar[re_contains(.,'ba')]" test.xml
    s: [<PyCObject object at 0x401a74d0>] , 1 , p: ba
    ss: <PyCObject object at 0x401a74d0>
    []
    /usr/lib/python2.3/site-packages/libxml2.py:511: RuntimeWarning: tp_compare didn't return -1 or -2 for exception
      if type(o) == type([]) or type(o) == type(()):
    Traceback (most recent call last):
    [... snip an exception from re]

Using the XPath function name() instead of . works out better:

    $ ./xpathgrep.py "//foo[re_contains(name(),'ba')]" test.xml
    s: foo , 3 , p: ba
    ss: f
    [... snip iterating f, o and o ]

So should I do something magic when the user has passed in .? Or is this a bug?

Using name() exposes the second problem: what to return? True and False apparently aren't the answer, because the result is the error "Unable to convert Python Object to XPath". The same happens with 1 and 0. I see that contains calls a function named valuePush to store the value, but I don't think that's available in Python. The Python bindings apparently call a function named libxml_xmlXPathObjectPtrConvert to convert the return value into something that can be passed to valuePush, but I can't see anything in it that would indicate it can deal with boolean values.

This is libxml2 2.6.11.

-- 
[ Juri Pakaste | juri iki fi | http://www.iki.fi/juri/ ]
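
The argument problem described above (s arrives as a plain string for name(), but as a list of opaque node handles for .) can be sketched independently of libxml2. This is only a rough illustration, not the bindings' documented behaviour: to_text, re_contains_text and the node_text callable are hypothetical names introduced here, and the suggestion that a handle could be wrapped with libxml2.xmlNode(_obj=...) to read its content is an assumption that has not been checked against 2.6.11. The sketch follows the XPath rule that the string value of a node-set is the string value of its first node.

```python
import re


def to_text(value, node_text):
    """Reduce an XPath-style argument to one plain string.

    value     -- a string, or a list of opaque node handles (the shape
                 observed in the message when "." is passed)
    node_text -- hypothetical callable mapping one node handle to its text;
                 with the libxml2 bindings this might be something like
                 lambda o: libxml2.xmlNode(_obj=o).content (an assumption)
    """
    if isinstance(value, (list, tuple)):
        # XPath's string() of a node-set is the string value of its first
        # node, or the empty string for an empty node-set.
        return node_text(value[0]) if value else ""
    return str(value)


def re_contains_text(value, pattern, node_text):
    """Return True if pattern matches anywhere in the text of value."""
    return re.search(pattern, to_text(value, node_text)) is not None


if __name__ == "__main__":
    # Stand-in for a node handle, used only for this demonstration.
    class FakeNode(object):
        def __init__(self, content):
            self.content = content

    get_text = lambda node: node.content
    print(re_contains_text([FakeNode("baz")], "ba", get_text))
    print(re_contains_text("foo", "ba", get_text))
```

Passing the text extraction in as a callable keeps the regex logic testable without the bindings. What type the registered callback must return for the bindings to accept it is a separate question that this sketch does not address.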