[xml] Get element tail

From: Bogdan Cristea <cristeab gmail com>
To: libXml <xml gnome org>
Subject: [xml] Get element tail
Date: Wed, 23 Oct 2013 12:12:11 +0200

Hi

I am trying to mimic lxml from Python, which lets you get the text after the end of an element but before the next element begins (i.e. before the next sibling of the current element). I am able to do this with xmlTextReader by obtaining a pointer from the current node (when the node type is ELEMENT) to its next sibling. However, this approach does not always work:

<h1>Text before <strong>bold 1 <underline>undelined text</underline>after bold 1</strong>in between <strong>bold 2</strong>text after<strong>bold 3</strong>.</h1>
<h1><strong>bold 1</strong> no text before <strong>bold 2</strong> text after <strong>bold 3</strong>.</h1>

The first <h1> element is parsed correctly, but the second one is not: the text node " no text before " is not detected as the tail of the <strong> element. lxml, however, handles it correctly; it is what I am using to validate my XML parser. I am a little puzzled by this result, since lxml is an API for libxml2; however, I am not sure whether the lxml implementation uses just the xmlTextReader parser or builds the entire DOM tree. Is there a way to get the tail of an element with xmlTextReader?
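
To make the expected result concrete: on a forward-only stream, the tail of an element can be seen as the text that arrives after its end tag and before the next start or end tag. Below is a minimal sketch of that idea, not the libxml2 API. It uses Python's standard-library expat bindings as a stand-in for a streaming reader, and xml.etree.ElementTree (whose text/tail model lxml also follows) as a cross-check. The helper name stream_tails and the SAMPLE string (the second <h1>, assuming the space in "text after" was lost to line wrapping) are introduced here for illustration only.

```python
import xml.parsers.expat
import xml.etree.ElementTree as ET

# Second <h1> from the message; assumes "textafter" was originally "text after".
SAMPLE = (
    "<h1><strong>bold 1</strong> no text before "
    "<strong>bold 2</strong> text after <strong>bold 3</strong>.</h1>"
)


def stream_tails(xml_text):
    """Return (tag, tail) pairs in document order using only stream events."""
    records = []        # [tag, tail] per element, in the order elements start
    open_stack = []     # indices into records for elements that are still open
    last_closed = None  # index of the element whose tail is being collected

    def on_start(tag, attrs):
        nonlocal last_closed
        last_closed = None  # a new element begins, so any pending tail is over
        records.append([tag, ""])
        open_stack.append(len(records) - 1)

    def on_end(tag):
        nonlocal last_closed
        # Text seen from now on belongs to the tail of the element just closed.
        last_closed = open_stack.pop()

    def on_text(data):
        # Text seen while no element has just closed is the .text of the
        # innermost open element, which this sketch ignores.
        if last_closed is not None:
            records[last_closed][1] += data

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(xml_text, True)
    return [(tag, tail) for tag, tail in records]


def tree_tails(xml_text):
    """Same pairs, read from the .tail attribute of a fully built tree."""
    return [(e.tag, e.tail or "") for e in ET.fromstring(xml_text).iter()]


if __name__ == "__main__":
    for tag, tail in stream_tails(SAMPLE):
        print(tag, repr(tail))
```

The point of the sketch is that the tail is attached when the end tag is seen, not when the start tag is seen. With xmlTextReader the matching events would presumably be the ELEMENT, TEXT and END_ELEMENT node types; note that an empty element such as <br/> does not produce an END_ELEMENT node there, so that case would need separate handling.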


thanks
Bogdan

Follow-Ups:
- Re: [xml] Get element tail
  - From: Csaba Raduly
