from lxml import etree
from io import StringIO
s = '<aaa><bbb>f<ccc>e</ccc>d</bbb></aaa>'
tree = etree.parse(StringIO(s))
print(tree.xpath('//bbb[.="fed"]'))  # returns nothing (unexpected)
print(tree.xpath('//bbb[contains(.,"fed")]'))  # returns the bbb element
print(tree.xpath('//bbb[normalize-space(.)="fed"]'))  # returns the bbb element
print(tree.xpath('//bbb[string-length(.)=3]'))  # returns the bbb element
The first query does not find the bbb element by its string-value, while the other three, surprisingly, do. I suspect this is a bug.
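As a cross-check that does not involve libxml2 at all, the string-value of the bbb element can be computed by hand. This is only a sketch using Python's standard xml.etree.ElementTree (it was not part of my original test); it relies on the XPath 1.0 definition of an element's string-value as the concatenation of all descendant text nodes in document order:

```python
import xml.etree.ElementTree as ET

s = '<aaa><bbb>f<ccc>e</ccc>d</bbb></aaa>'
root = ET.fromstring(s)
bbb = root.find('bbb')

# itertext() yields the element's own text, then each descendant's text and
# tail in document order, which matches the XPath 1.0 string-value of bbb.
string_value = ''.join(bbb.itertext())
print(string_value)  # fed
```

So the comparison . = "fed" should, as far as I can tell, be true for this element.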
I have already asked about this on the lxml mailing list, and they suggested asking here because the XPath implementation is in libxml2, not in lxml.
The versions used:
lxml.etree: (4, 0, 0, 0)
libxml used: (2, 9, 5)
libxml compiled: (2, 9, 5)
libxslt used: (1, 1, 30)
libxslt compiled: (1, 1, 30)
Reproduction of the bug using xmllint instead of lxml:
$ echo '<aaa><bbb>f<ccc>e</ccc>d</bbb></aaa>' | \
xmllint --xpath '//bbb[. = "fed"]' -
XPath set is empty
Thanks in advance for any help!
Aleksei