Re: [xml] Strange results of xpath element search using string-value



On Sat, Sep 30, 2017 at 08:37:35PM +0300, Алексей Алексей wrote:
   Hi,

   I can't understand the results of the following XPath queries:

   from lxml import etree
   from io import StringIO
   s = '<aaa><bbb>f<ccc>e</ccc>d</bbb></aaa>'
   tree = etree.parse(StringIO(s))
   print(tree.xpath('//bbb[.="fed"]')) #Returns nothing
   print(tree.xpath('//bbb[contains(.,"fed")]')) #Returns bar
   print(tree.xpath('//bbb[normalize-space(.)="fed"]')) #Returns bar
   print(tree.xpath('//bbb[string-length(.)=3]')) #Returns bar

   The first query doesn't find the bar element by its string-value, while
   the other three surprisingly do. I suppose that it is a bug.

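  One thing to note: the first query compares a *node-set* against a
string using equality. The other three convert the node to its string-value
through a function call (contains(), normalize-space(), string-length())
before any comparison takes place.
The definitions are here: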

  https://www.w3.org/TR/xpath/#booleans

 "If one object to be compared is a node-set and the other is a string,
  then the comparison will be true if and only if there is a node in
  the node-set such that the result of performing the comparison on the
  string-value of the node and the other string is true."
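
To make the quoted rule concrete, here is a minimal sketch in plain Python
using the standard library's xml.etree.ElementTree rather than libxml2. The
helper names string_value and nodeset_equals_string are made up for this
illustration; this is only a reading of the spec text, not how libxml2
implements the comparison.

```python
import xml.etree.ElementTree as ET


def string_value(node):
    """String-value of an element: all descendant text, in document order."""
    return "".join(node.itertext())


def nodeset_equals_string(nodes, text):
    """Node-set = string: true iff some node's string-value equals text."""
    return any(string_value(n) == text for n in nodes)


root = ET.fromstring('<aaa><bbb>f<ccc>e</ccc>d</bbb></aaa>')
bbb_nodes = root.findall('.//bbb')  # the node-set selected by //bbb

print(string_value(bbb_nodes[0]))               # text of bbb and its descendants
print(nodeset_equals_string(bbb_nodes, 'fed'))  # the comparison the spec describes
```

Under this reading, the string-value of bbb is "fed", so the comparison
against "fed" is true for the sample document.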


I also assume that when you say "Returns bar" you really mean "Returns bbb".

thinkpad2:~/XML -> cat tst.xml
<aaa><bbb>f<ccc>e</ccc>d</bbb></aaa>
thinkpad2:~/XML -> xmllint --shell tst.xml
/ > xpath //bbb[.="fed"]
Object is a Node Set :
Set contains 0 nodes:
/ > xpath //bbb[contains(.,"fed")]
Object is a Node Set :
Set contains 1 nodes:
1  ELEMENT bbb
/ > xpath //bbb="fed"
Object is a Boolean : false
/ > xpath string(//bbb)
Object is a string : fed
/ > xpath //bbb
Object is a Node Set :
Set contains 1 nodes:
1  ELEMENT bbb
/ >

   Hmm ... XPath computes the string-value correctly, so by the definition
above the first query should match. Something is going on there.
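
As an independent cross-check that does not go through libxml2, the same
predicate can be tried with the limited XPath support in Python's standard
xml.etree.ElementTree. This assumes Python 3.7 or later, where the
[.='text'] predicate compares against the element's complete text content,
descendants included. It is only a sanity check of the expected result, not
a statement about libxml2 internals.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<aaa><bbb>f<ccc>e</ccc>d</bbb></aaa>')

# ElementTree supports only a subset of XPath; [.='text'] is part of it
# from Python 3.7 on and uses the full text content of the element.
matches = root.findall(".//bbb[.='fed']")

print([elem.tag for elem in matches])
```

If this lists bbb while the libxml2-based query returns an empty set, the
difference lies in how the node-set comparison is evaluated, not in the
document.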

Daniel

   I have already asked about the issue on the lxml mailing list, and they
   suggested asking here because the XPath implementation is in libxml2,
   not in lxml.

   The versions used:
   lxml.etree:        (4, 0, 0, 0)
   libxml used:       (2, 9, 5)
   libxml compiled:   (2, 9, 5)
   libxslt used:      (1, 1, 30)
   libxslt compiled:  (1, 1, 30)

   Reproduction of the bug using xmllint instead of lxml:
   $ echo '<aaa><bbb>f<ccc>e</ccc>d</bbb></aaa>' | \
           xmllint --xpath '//bbb[. = "fed"]' -
   XPath set is empty

   Thanks in advance for any help!

   Aleksei

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
https://mail.gnome.org/mailman/listinfo/xml


-- 
Daniel Veillard      | Red Hat Developers Tools http://developer.redhat.com/
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/

