[xml] Xpath issues with libxml2

From: Alex Boese <alexanderashleyboese gmail com>
To: "xml gnome org" <xml gnome org>
Subject: [xml] Xpath issues with libxml2
Date: Thu, 12 Feb 2015 11:36:12 -0500

I kinda solved my own problem. I'm posting an example of what works, since I had difficulty establishing this:

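import libxml2

XML1 = """<f:Foo xmlns:f="http://www.w3.org/f#"><b:Bar
xmlns:b="http://www.w3c.org/b#">foobar</b:Bar></f:Foo>"""

# Combine the parser flags with bitwise OR.
xml_parser_options = libxml2.XML_PARSE_RECOVER | libxml2.XML_PARSE_NONET
# readMemory expects the length of the buffer; sys.getsizeof() would report
# the size of the Python object, which is larger than the string itself.
size = len(XML1)
doc = libxml2.readMemory(XML1,size,None,'UTF-8',xml_parser_options)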

context = doc.xpathNewContext()
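# The URIs must match the document's declarations exactly
# (no leading space, and note w3c.org for the b prefix).
#context.xpathRegisterNs('f','http://www.w3.org/f#')
#context.xpathRegisterNs('b','http://www.w3c.org/b#')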

print "test 1"
res = context.xpathEval('/*[local-name()="Foo"]/*[local-name()="Bar"]')
i = 0
for node in res:
    print i,':',node
    i = i + 1

#print "test 2"
#res = context.xpathEval('/f:Foo/b:Bar')
#i = 0
#for node in res:
#     print i,':',node
#     i = i + 1

Result:
test 1
0 : <b:Bar xmlns:b="http://www.w3c.org/b#">foobar</b:Bar>

Of course, if you uncomment the commented-out portions you get test 2, which
returns exactly the same thing. My point is that registering each namespace might not even be a requirement! It really depends on what kind of XPath you're using: local-name() tests ignore namespaces entirely, while prefixed steps such as f:Foo need the prefix registered first.
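
For readers without the libxml2 bindings at hand, the same contrast can be sketched with Python 3's standard-library xml.etree.ElementTree. This is a stand-in, not the libxml2 API: ElementTree has no local-name() function, so the "{*}" wildcard (Python 3.8 or later) plays that role here, and the prefix dictionary plays the role of the xpathRegisterNs calls.

```python
import xml.etree.ElementTree as ET

XML1 = ('<f:Foo xmlns:f="http://www.w3.org/f#">'
        '<b:Bar xmlns:b="http://www.w3c.org/b#">foobar</b:Bar></f:Foo>')

root = ET.fromstring(XML1)  # root is the f:Foo element

# Namespace-agnostic: "{*}Bar" matches a Bar child in any namespace
# (requires Python 3.8+). Roughly what local-name()="Bar" does in XPath.
by_local_name = root.findall('{*}Bar')

# Namespace-aware: map each prefix to the exact URI declared in the document.
ns = {'f': 'http://www.w3.org/f#', 'b': 'http://www.w3c.org/b#'}
by_prefix = root.findall('b:Bar', ns)

print([node.text for node in by_local_name])
print([node.text for node in by_prefix])
```

Both queries select the same element. The prefix-based form only matches when the URI is spelled exactly as in the document, which is why a stray space or w3.org versus w3c.org silently yields nothing.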

Of course, feel free to bash my answer and tell me I'm wrong.

-A

I'm thinking it should look something like this, assuming Python language:

import xmlsec
import libxml2

...

    def test_get_xml_fragment(self, xpath, ns):
        ret = None
        context = self.doc.xpathNewContext()
        if ns is not None:
            # ns is the prefix used in the xpath; the URI must match the
            # namespace declared in the document exactly.
            context.xpathRegisterNs(ns, 'http://127.0.0.1/#no_place_like_home')
        res = context.xpathEval(xpath)
        i = 0
        for node in res:
            print i, ':', node
            i = i + 1

Please note there is an object in the mix, and I can get the whole thing to work great without namespaces. One good example of how Python should be handling namespaces in this case would be great. Even telling me it's forever broken... this too would be good to know. (Also, this is not an ideal example; I get this. I want something that hobbles before I get something that soars.)

Thanks in advance.

Sent from my Planet

Message: 2
Date: Fri, 30 Jan 2015 09:03:40 -0600
From: Ross Reedstrom <reedstrm rice edu>
To: xml gnome org
Subject: Re: [xml] Xpath issues with libxml2
Message-ID: <20150130150340 GA28262 rice edu>
Content-Type: text/plain; charset=us-ascii

Alex -
Without examples of what you've tried, it's hard to diagnose the problem.
However, seeing 'namespaces' and 'never returns anything' makes me think you're
having issues with the default namespace concept. While XML documents have a
default namespace, XPaths do not. Once you use namespaces in an xml document,
all your xpaths will need to use namespace declarations for all the path parts,
even for tags that are defaulted in the document. So, you'll need to declare
a namespace prefix that matches the default namespace in the doc. Give us
a small example that you've tried that doesn't work, and we'll fix it.

Ross
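
The default-namespace point above can be illustrated with a small sketch. It uses Python 3's standard-library xml.etree.ElementTree rather than the libxml2 bindings, so treat it as an analogy: the document, the namespace URI http://example.com/ns# and the prefix "d" are all made up for illustration. With libxml2 the corresponding step would presumably be a context.xpathRegisterNs call with the same prefix and URI, followed by a path such as /d:Foo/d:Bar.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element sits in a default (unprefixed) namespace.
XML = '<Foo xmlns="http://example.com/ns#"><Bar>foobar</Bar></Foo>'
root = ET.fromstring(XML)

# An unprefixed path step means "no namespace", so this matches nothing,
# even though the document itself writes the tag without a prefix.
unprefixed = root.findall('Bar')

# Bind a prefix of our own choosing to the document's default namespace URI
# and use that prefix on every step of the path.
ns = {'d': 'http://example.com/ns#'}
prefixed = root.findall('d:Bar', ns)

print(len(unprefixed), len(prefixed))
```

The prefix in the path does not have to match anything in the document; only the URI it is bound to matters.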

On Fri, Jan 30, 2015 at 08:44:15AM -0500, Alex Boese wrote:
Forgive me if this is a deprecated approach (as I am not fully aware), but I was utilizing "default" libxml2 bindings (not lxml) in Python to retrieve xml fragments via xpath functions. Normally this seems to work fine, but with namespace declarations this seems especially problematic as nothing ever returns and nothing errors. Would it be possible to confirm what correct functions and order of operations would be for this? Even if the example is C, I can translate that to Python.

--
Ross Reedstrom, Ph.D.                                 reedstrm rice edu
Systems Engineer & Admin, Research Scientist        phone: 713-348-6166
Connexions                  http://cnx.org            fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE

------------------------------

Subject: Digest Footer

_______________________________________________
xml mailing list
xml gnome org
https://mail.gnome.org/mailman/listinfo/xml

------------------------------

End of xml Digest, Vol 128, Issue 2
***********************************

Follow-Ups:
- Re: [xml] Xpath issues with libxml2
  - From: Daniel Veillard

References:
- Re: [xml] xml Digest, Vol 129, Issue 1
  - From: Alex Boese
