[xml] Get element tail
- From: Bogdan Cristea <cristeab gmail com>
- To: libXml <xml gnome org>
- Subject: [xml] Get element tail
- Date: Wed, 23 Oct 2013 12:12:11 +0200
Hi
I am trying to reproduce a feature of Python's lxml, which lets you get the
text that follows the end of an element but precedes the next element
(i.e. the next sibling of the current element). I can do this with
xmlTextReader by following a pointer from the current node (when the
node type is ELEMENT) to its next sibling. However, this approach does
not work every time:
<h1>Text before <strong>bold 1 <underline>undelined text</underline> 
after bold 1</strong>in between <strong>bold 2</strong>text after 
<strong>bold 3</strong>.</h1>
<h1><strong>bold 1</strong> no text before <strong>bold 2</strong> text 
after <strong>bold 3</strong>.</h1>
The first <h1> element is parsed correctly, but the second one is not:
the text node " no text before " is not detected as the tail of the
preceding <strong> element. lxml, however, handles both cases correctly,
which is how I am validating my XML parser. I am a little puzzled by this
result, since lxml is an API for libxml2; however, I am not sure whether
the lxml implementation uses just the xmlTextReader parser or builds the
entire DOM tree. Is there a way to get the tail of an element with
xmlTextReader?
thanks
Bogdan
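
For reference, the tail semantics described above can be illustrated with a
small Python sketch. It is only an illustration under stated assumptions: it
uses the standard library's xml.etree.ElementTree as a stand-in for lxml
(both expose a .tail attribute on elements), it builds the whole tree rather
than streaming like xmlTextReader, and the names doc1, doc2 and tails() are
made up for this example. The line breaks in the two sample documents are
joined into single lines.

```python
import xml.etree.ElementTree as ET

# The two sample documents from the message, joined onto single lines.
doc1 = ("<h1>Text before <strong>bold 1 <underline>undelined text</underline> "
        "after bold 1</strong>in between <strong>bold 2</strong>text after "
        "<strong>bold 3</strong>.</h1>")
doc2 = ("<h1><strong>bold 1</strong> no text before <strong>bold 2</strong> text "
        "after <strong>bold 3</strong>.</h1>")


def tails(xml_text):
    """Return (tag, tail) pairs for every element below the root.

    The tail is the text between an element's end tag and the next
    tag of any kind (or None when there is no such text).
    """
    root = ET.fromstring(xml_text)
    return [(el.tag, el.tail) for el in root.iter() if el is not root]


if __name__ == "__main__":
    for name, doc in (("doc1", doc1), ("doc2", doc2)):
        print(name)
        for tag, tail in tails(doc):
            print("  <%s> tail=%r" % (tag, tail))
```

In the second document the first <strong> element reports the tail
" no text before ", which is the value the reader-based approach is expected
to produce as well.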