[xml] newTextReader's GetParserLineNumber is too keen??



I'm following the example at http://xmlsoft.org/xmlreader.html, and
have modified only processNode() to print reader.GetParserLineNumber()
so that I have

---------------------------
import libxml2

def processNode(reader):
   # Print the parser's current line number next to the node name.
   print("%d %s" % (reader.GetParserLineNumber(), reader.Name()))

def streamFile(filename):
   try:
       reader = libxml2.newTextReaderFilename(filename)
   except:
       print("unable to open %s" % (filename))
       return

   # Read() returns 1 while nodes remain, 0 at the end, and -1 on error.
   ret = reader.Read()
   while ret == 1:
       processNode(reader)
       ret = reader.Read()

   if ret != 0:
       print("%s : failed to parse" % (filename))
---------------------------

and my .xml file is

---------------------------
<doc><a/><b>some text</b>
<c/></doc>
---------------------------

But the print-out is
---------------------------
3 doc
3 a
3 b
3 #text
3 b
3 #text
3 c
3 doc
---------------------------

The problem is that every node is reported to be on line 3, even
though some nodes are clearly on line 1, and others on line 2 (and
there is no line 3).
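
For comparison, here is a minimal sketch of the per-node line numbers
I would expect for the document above. It uses the expat bindings from
the Python standard library rather than libxml2, so it says nothing
about how libxml2 itself counts lines. The helper name start_lines()
is made up for this illustration.

```python
import xml.parsers.expat

def start_lines(data):
    """Return (line, name) for every start tag, using expat (not libxml2)."""
    parser = xml.parsers.expat.ParserCreate()
    found = []

    def on_start(name, attrs):
        # CurrentLineNumber is the 1-based line of the event being handled.
        found.append((parser.CurrentLineNumber, name))

    parser.StartElementHandler = on_start
    parser.Parse(data, True)  # True marks this as the final piece of input
    return found

# Same two-line document as in the post.
doc = "<doc><a/><b>some text</b>\n<c/></doc>\n"
for line, name in start_lines(doc):
    print("%d %s" % (line, name))
```

Here doc, a and b are reported on line 1 and c on line 2, which is the
kind of result I was hoping to get from the reader.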

With a slightly longer .xml file, it looks as though the parser reads
ahead in chunks: for a 72-line .xml file, reader.GetParserLineNumber()
returns 25 for the first run of nodes, then 49, and finally 73.
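
The stepped values fit a simple model, sketched here under the
assumption that the reported line is the parser's input position after
it has consumed a whole chunk, not the position of the current node.
The 512-character chunk size and the helper name are assumptions made
for illustration only.

```python
def line_after_each_chunk(data, chunk_size=512):
    """Return the 1-based line of the input position after each chunk.

    chunk_size=512 is an assumed value, chosen only for illustration.
    """
    line = 1  # line counting starts at 1
    positions = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        line += chunk.count("\n")  # every newline consumed moves one line down
        positions.append(line)
    return positions

# The two-line document from the post fits in a single chunk.
doc = "<doc><a/><b>some text</b>\n<c/></doc>\n"
print(line_after_each_chunk(doc))  # prints [3]
```

Under this model every node in the small file would be reported on
line 3, and a longer file would show one plateau per chunk.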

Is this a bug, or am I doing something wrong?  If it is a bug, can I
get some hints as to which parts of the code I should look at to patch
it?


Thanks,
Nigel.


