Re: [xml] xmllint - Newbie THINKS there may be a whitespace error in 2.6.23
- From: "Buchcik, Kasimier" <k buchcik 4commerce de>
- To: "John Navratil" <jnavratil houston rr com>
- Cc: xml gnome org
- Subject: Re: [xml] xmllint - Newbie THINKS there may be a whitespace error in 2.6.23
- Date: Wed, 26 Apr 2006 11:20:34 +0200
Hi,
-----Original Message-----
From: xml-bounces gnome org [mailto:xml-bounces gnome org] On
Behalf Of John Navratil
Sent: Tuesday, 25 April 2006 21:50
To: xml gnome org
Subject: [xml] xmllint - Newbie THINKS there may be a
whitespace error in 2.6.23
Greetings,
Using xmllint to validate a document thusly:
xmllint --schema test.xsd test.xml
with schema (test.xsd):
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="A">
<xs:annotation>
<xs:documentation>asdf</xs:documentation>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element name="B">
<xs:complexType>
<xs:attribute name="ID" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
and document (test.xml):
<A>
<B ID="1">
</B>
</A>
I get the error:
test.xml:2: element B: Schemas validity error : Element 'B': Character
content is not allowed, because the content type is empty.
I thought that --noblanks would strip the whitespace and eliminate the
error, but find instead that I must modify the document to:
<A>
<B ID="1" />
</A>
Is this behavior correct? I observe it in 2.6.22 and 2.6.23
on Fedora Core 4 and 5.
Yes, this behaviour is correct: there must not be any character
content inside the element "B" and, as Daniel said, the --noblanks
option won't remove such whitespace-only text-nodes. --noblanks will
remove whitespace-only text-nodes when you have mixed content;
i.e., when an element has character content *and* element content.
That's why the whitespace after "<A>" and before "</A>" is removed
in Daniel's example:
"
paphio:~/XML -> xmllint --noblanks test.xml
<?xml version="1.0"?>
<A><B>
</B></A>
"
When there's no mixed content, any whitespace is considered
significant by the --noblanks option; I think this is based on the
assumption that no one writes...
<B>
</B>
... if he doesn't want those space characters. You can write instead:
<B/> or
<B></B> or
<B><!-- No.1 the larch --></B> or
<B><?slide No.1 the larch ?></B>
All four cases of the element "B" have no content from
the viewpoint of W3C XML Schema.
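To make the difference visible, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree (not libxml2 or xmllint, so it only illustrates the data model, not the validator). The helper name character_content is made up for this example. It shows that the line break between the start and end tag is real character data, while the other forms carry none:

```python
import xml.etree.ElementTree as ET


def character_content(xml_text):
    """Return the character data found directly inside the root element.

    Hypothetical helper for illustration only. ElementTree's default
    parser skips comments and processing instructions, so they do not
    show up as character data.
    """
    root = ET.fromstring(xml_text)
    return root.text or ""


samples = [
    '<B ID="1">\n</B>',                         # line break inside B
    '<B ID="1"/>',                              # empty-element tag
    '<B ID="1"></B>',                           # start and end tag, nothing between
    '<B ID="1"><!-- No.1 the larch --></B>',    # comment only
    '<B ID="1"><?slide No.1 the larch ?></B>',  # processing instruction only
]

for sample in samples:
    print(repr(sample), "->", repr(character_content(sample)))
```

Only the first sample reports a non-empty string (the line break); that whitespace-only text is the character content the schema validator complains about.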
For easier reading of the XML document by humans, people start a new
line for every new tag and indent subsequent tags. So the reason, I
think, why there's such a thing as a --noblanks option at all, is
to accommodate this pretty-printing issue by removing such
whitespace-only text nodes, since they are most likely not intended
to be part of the data.
So this:
<A>
<B/>
</A>
will be stripped to:
<A><B/></A>
However, we also have the xml:space mechanism, which could be
used to define exactly what is to be stripped and what is not.
So if we had an option like --noblanksall, which would remove
*all* whitespace-only text-nodes, then you could use xml:space
to specify where whitespace should be preserved.
Example:
<A>
<B> </B>
<C xml:space="preserve"> <D> </D> </C>
</A>
this would be whitespace-stripped with a
--noblanksall option (this option does not exist) to:
<A><B/><C xml:space="preserve"> <D> </D> </C></A>
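Since no such option exists in xmllint, the idea can only be sketched outside of it. The following is a rough Python illustration using the standard library's xml.etree.ElementTree; the function name strip_blank_text is hypothetical, and this is not how libxml2 implements --noblanks. It removes every whitespace-only text node unless xml:space="preserve" is in effect for the containing element:

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the reserved XML namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
XML_WHITESPACE = " \t\r\n"


def is_blank(text):
    """True if text exists and consists only of XML whitespace."""
    return text is not None and text.strip(XML_WHITESPACE) == ""


def strip_blank_text(elem, preserve=False):
    """Drop whitespace-only text inside elem, honouring xml:space.

    Hypothetical sketch of a '--noblanksall'-like behaviour; 'preserve'
    is the xml:space value inherited from the ancestors.
    """
    value = elem.get(XML_SPACE)
    if value == "preserve":
        preserve = True
    elif value == "default":
        preserve = False

    # Text between the start tag and the first child (or the end tag).
    if not preserve and is_blank(elem.text):
        elem.text = None

    for child in elem:
        strip_blank_text(child, preserve)
        # The tail follows the child but belongs to elem's content,
        # so elem's xml:space setting decides whether it is kept.
        if not preserve and is_blank(child.tail):
            child.tail = None


doc = '<A>\n<B> </B>\n<C xml:space="preserve"> <D> </D> </C>\n</A>'
root = ET.fromstring(doc)
strip_blank_text(root)
print(ET.tostring(root, encoding="unicode"))
```

The whitespace inside "C" and "D" survives because of xml:space="preserve", while the blanks directly inside "A" and "B" are dropped.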
If I remove the required attribute ("ID") from the schema
and the document, this behavior is not observed.
Check again please; I cannot reproduce this here.
Regards,
Kasimier