[xml] Get element tail
- From: Bogdan Cristea <cristeab gmail com>
- To: libXml <xml gnome org>
- Subject: [xml] Get element tail
- Date: Wed, 23 Oct 2013 12:12:11 +0200
Hi,
I am trying to reproduce a feature of lxml (Python): its tail attribute
gives the text that follows the end of an element but precedes the next
element (i.e. the next sibling of the current element). I can do this
with xmlTextReader by taking a pointer from the current node (when the
node type is ELEMENT) to its next sibling. However, this approach does
not work in all cases:
<h1>Text before <strong>bold 1 <underline>undelined text</underline>
after bold 1</strong>in between <strong>bold 2</strong>text after
<strong>bold 3</strong>.</h1>
<h1><strong>bold 1</strong> no text before <strong>bold 2</strong> text
after <strong>bold 3</strong>.</h1>
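For reference, the tail values expected for the second sample can be
checked against the text/tail model that lxml.etree shares with
ElementTree. The following is a minimal sketch that uses the
standard-library xml.etree.ElementTree instead of lxml itself (an
assumption made so that it runs without extra packages). The helper name
tails and the constant SAMPLE are hypothetical, and the sample is joined
onto one line with the line break replaced by a single space.

```python
import xml.etree.ElementTree as ET

# Second sample from the message, joined onto one line (assumption:
# the line break in the mail stands for a single space).
SAMPLE = (
    "<h1><strong>bold 1</strong> no text before "
    "<strong>bold 2</strong> text after <strong>bold 3</strong>.</h1>"
)


def tails(xml_text):
    """Return (tag, text, tail) for every element in document order."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.text, el.tail) for el in root.iter()]


if __name__ == "__main__":
    for row in tails(SAMPLE):
        print(row)
```

In this model the text " no text before " is reported as the tail of the
first <strong>, while <h1> has no text of its own because its first
child starts immediately.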
The first <h1> element is parsed correctly, but the second one is not:
the text node " no text before " is not detected as the tail of the
first <strong> element. lxml, however, handles both cases correctly, and
I am using it as the reference to validate my XML parser. I am a bit
puzzled by this result, since lxml is a binding for libxml2; however, I
am not sure whether lxml uses just the xmlTextReader parser or builds
the entire DOM tree. Is there a way to get the tail of an element with
xmlTextReader?
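One possible way to compute tails from a forward-only event stream,
without following next-sibling pointers, is to remember the element that
was closed most recently and to attach any text that arrives before the
next start tag to it. The sketch below illustrates that idea with
Python's standard-library expat bindings, not with xmlTextReader; the
function name stream_tails is hypothetical. With xmlTextReader the
corresponding node types would presumably be XML_READER_TYPE_ELEMENT,
XML_READER_TYPE_TEXT and XML_READER_TYPE_END_ELEMENT, but that mapping
is an assumption and has not been tested against libxml2.

```python
import xml.parsers.expat

# Second sample from the message, joined onto one line (assumption:
# the line break in the mail stands for a single space).
SAMPLE = (
    "<h1><strong>bold 1</strong> no text before "
    "<strong>bold 2</strong> text after <strong>bold 3</strong>.</h1>"
)


def stream_tails(xml_text):
    """Collect (tag, tail) pairs using only start/end/text events."""
    results = []        # one [tag, tail] entry per element, in closing order
    last_closed = None  # entry of the element whose end tag came last

    def on_start(name, attrs):
        nonlocal last_closed
        # A new element begins: later text is no longer a tail.
        last_closed = None

    def on_end(name):
        nonlocal last_closed
        last_closed = [name, ""]
        results.append(last_closed)

    def on_text(data):
        # Text right after an end tag is the tail of the element just
        # closed; expat may deliver it in several chunks, so append.
        if last_closed is not None:
            last_closed[1] += data

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(xml_text, True)
    return [tuple(entry) for entry in results]


if __name__ == "__main__":
    for tag, tail in stream_tails(SAMPLE):
        print(tag, repr(tail))
```

Unlike lxml, this sketch reports a missing tail as an empty string
rather than None, and elements appear in the order in which they are
closed rather than the order in which they are opened.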
Thanks,
Bogdan