Re: [xml] XmlTextReader vs. <elem />



Hi Josef,

2010/12/22 Josef Kokeš <j kokes apatykaservis cz>:
Hi!

I am having a problem with XmlTextReader when my XML file has elements in the
form of <name atr1="value1" atr2="value2" ... /> or even just <name />: I
just can't detect the end of such an element.

It doesn't have one; rather, there is no corresponding
XML_READER_TYPE_END_ELEMENT.

I am reading the XML file in a
loop, in which I call xmlTextReaderRead followed by xmlTextReaderNodeType,
and after that I process the node according to its type. My problem is that
I have two situations which are completely different semantically but get
the same result from XmlTextReader:

Situation A:
<root>
 <elem1 />
 <elem2>...</elem2>
</root>

Situation B:
<root>
 <elem1>
  <elem2>...</elem2>
 </elem1>
</root>

The output of xmlTextReaderNodeType for situation A:
- XML_READER_TYPE_ELEMENT (<root>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_ELEMENT (<elem1>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_ELEMENT (<elem2>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_END_ELEMENT (end of <elem2>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_END_ELEMENT (end of <root>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)

The output of xmlTextReaderNodeType for situation B:
- XML_READER_TYPE_ELEMENT (<root>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_ELEMENT (<elem1>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_ELEMENT (<elem2>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_END_ELEMENT (end of <elem2>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_END_ELEMENT (end of <elem1>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)
- XML_READER_TYPE_END_ELEMENT (end of <root>)
- XML_READER_TYPE_SIGNIFICANT_WHITESPACE (end of line, indent)

These two outputs are not the same. There is one more
XML_READER_TYPE_END_ELEMENT in the second case.


My problem is that in the first case I can't tell whether the last
XML_READER_TYPE_END_ELEMENT applies to <root> or to <elem1>.

xmlTextReaderConstName would tell you that, perhaps together with
xmlTextReaderDepth. But you probably need to keep track of what is
happening anyway, because of XML like this:

<root>
 <elem1>foo</elem1>
 <elem1>bar</elem1>
</root>

or

<root>
  <elem1>
    <elem1>   ... </elem1>
  </elem1>
</root>

(In the first case the second endElement belongs to the second
startElement, whereas in the second case the first endElement belongs
to the second startElement.)
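
A minimal sketch of that bookkeeping, in C. To keep it compilable without
libxml2, the events are modelled by a hypothetical ReaderEvent struct; in a
real loop its fields would presumably be filled from xmlTextReaderNodeType,
xmlTextReaderConstName and xmlTextReaderIsEmptyElement after each
xmlTextReaderRead. The struct, the EV_* names, MAX_OPEN and
match_end_elements are illustrative assumptions, not part of the library;
only the numeric node type values mirror libxml2's.

```c
#include <stddef.h>

/* Hypothetical stand-in for what a reader loop sees at each step.
   The values mirror XML_READER_TYPE_ELEMENT (1) and
   XML_READER_TYPE_END_ELEMENT (15). */
enum { EV_ELEMENT = 1, EV_END_ELEMENT = 15 };

typedef struct {
    int type;          /* would come from xmlTextReaderNodeType */
    const char *name;  /* would come from xmlTextReaderConstName */
    int is_empty;      /* would come from xmlTextReaderIsEmptyElement */
} ReaderEvent;

#define MAX_OPEN 64    /* arbitrary nesting limit for this sketch */

/* For every END_ELEMENT event, store in match[i] the index of the ELEMENT
   event that opened it; every other entry is set to -1.
   Returns 0 on success, -1 on unbalanced input or nesting beyond MAX_OPEN. */
int match_end_elements(const ReaderEvent *ev, size_t n, int *match)
{
    int open[MAX_OPEN];   /* indices of elements that are still open */
    int top = 0;

    for (size_t i = 0; i < n; i++) {
        match[i] = -1;
        if (ev[i].type == EV_ELEMENT) {
            if (ev[i].is_empty)
                continue;             /* <elem />: no end event will follow */
            if (top == MAX_OPEN)
                return -1;
            open[top++] = (int)i;
        } else if (ev[i].type == EV_END_ELEMENT) {
            if (top == 0)
                return -1;            /* end tag without a start tag */
            match[i] = open[--top];   /* innermost open element closes first */
        }
    }
    return top == 0 ? 0 : -1;
}
```

Because the stack holds positions rather than names, it tells the two
<elem1> examples above apart even though every element involved has the
same name.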

I tried experimenting with xmlTextReaderHasValue and xmlTextReaderIsEmptyElement,

You need to use xmlTextReaderIsEmptyElement when the type is
XML_READER_TYPE_ELEMENT.
If you are seeing an XML_READER_TYPE_END_ELEMENT, then that is the end
tag of a non-empty element, so xmlTextReaderIsEmptyElement would always
be false.
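
One way to get a uniform start/end pair for every element is to treat an
empty XML_READER_TYPE_ELEMENT as its own end. The sketch below shows the
idea in C on a hypothetical NodeInfo struct, so that it builds without
libxml2; in a real loop is_empty would presumably be the result of
xmlTextReaderIsEmptyElement, queried while the reader is still positioned
on the element itself (I believe it reports 0 once you have moved to an
attribute, but treat that as an assumption to verify). NodeInfo, the NODE_*
names, emit and trace_elements are made up for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Values mirror XML_READER_TYPE_ELEMENT (1) and
   XML_READER_TYPE_END_ELEMENT (15); the names are hypothetical. */
enum { NODE_ELEMENT = 1, NODE_END_ELEMENT = 15 };

typedef struct {
    int type;          /* would come from xmlTextReaderNodeType */
    const char *name;  /* would come from xmlTextReaderConstName */
    int is_empty;      /* would come from xmlTextReaderIsEmptyElement */
} NodeInfo;

/* Append "+name " (start) or "-name " (end) to out; returns the new length.
   Stops appending once the buffer is full. */
static size_t emit(char *out, size_t cap, size_t len, char sign, const char *name)
{
    if (len < cap) {
        int w = snprintf(out + len, cap - len, "%c%s ", sign, name);
        if (w > 0)
            len += (size_t)w;
    }
    return len;
}

/* Walk the nodes and record one start and one end per element.
   An empty element gets its end synthesized right after its start;
   node types other than the two handled here are skipped. */
size_t trace_elements(const NodeInfo *nodes, size_t n, char *out, size_t cap)
{
    size_t len = 0;

    if (cap > 0)
        out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].type == NODE_ELEMENT) {
            len = emit(out, cap, len, '+', nodes[i].name);
            if (nodes[i].is_empty)                 /* <elem />: close it now */
                len = emit(out, cap, len, '-', nodes[i].name);
        } else if (nodes[i].type == NODE_END_ELEMENT) {
            len = emit(out, cap, len, '-', nodes[i].name);
        }
    }
    return len;
}
```

Fed the element events of situation A (with elem1 marked empty) this
records elem1 as opened and closed before elem2 starts, while situation B
records elem1 as closed only after elem2. In real code the emit calls would
be replaced by whatever start and end handling the application needs.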

but neither seems to produce usable results in these situations. The only
solution seems to be using xmlTextReaderDepth, but that does seem like a
rather dirty way of detecting the end of an element. Is there any particular
reason why libxml2 won't send me XML_READER_TYPE_END_ELEMENT when it
encounters the /> part of <elem />?

Because there is no end tag beginning with "</"; the whole of <elem /> is
reported as a single XML_READER_TYPE_ELEMENT.


Hope this helps,
Csaba
-- 
GCS a+ e++ d- C++ ULS$ L+$ !E- W++ P+++$ w++$ tv+ b++ DI D++ 5++
Life is complex, with real and imaginary parts.
"Ok, it boots. Which means it must be bug-free and perfect. " -- Linus Torvalds
"People disagree with me. I just ignore them." -- Linus Torvalds
