Re: [xml] XMLReader and distinguish between <a /> and <a></a>



On Sun, Aug 21, 2011 at 05:14:56PM +0200, Csaba Raduly wrote:
On Sun, Aug 21, 2011 at 10:58 AM, Tomáš Pospíšil wrote:
Hello LibXML hackers,

I'm using xmlReader for a recursive pre-order traversal through an XML tree. Everything works well, but 
during testing I ran into a problem distinguishing between

<doc>
 <e1  />
 <e1  ></e1>
</doc>

Both e1 elements are on the same level, so after calling xmlTextReaderRead() I can't tell whether the XML 
is not well formed or whether I have a new node on the same level.

If xmlTextReaderRead() returns 1, the node was read successfully, so the XML is well formed up to that point.


In short, I have to know which form an element uses (short version <a />, or long <a> </a>). How can I 
accomplish that?

  Actually you should not. From an XML point of view the two are
strictly equivalent. When I implemented the reader I found out that
they distinguished the two and it made my life difficult, a parser
is not supposed to expose the difference.
  If you have to rely on this, there is something broken.

For the short version, you'll get an XML_READER_TYPE_ELEMENT. If you
call xmlTextReaderIsEmptyElement(), it will return 1. There will be no
XML_READER_TYPE_END_ELEMENT.
For the long version, you'll get an XML_READER_TYPE_ELEMENT
(xmlTextReaderIsEmptyElement() will return 0) and later an
XML_READER_TYPE_END_ELEMENT.

  Unless you use the reader on an existing preparsed tree, in which
case libxml2 has already discarded the information and the difference
can't be provided.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel veillard com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/


