Re: [xml] libxml .xsd validation problem



Hi Patrick,

On Tue, Sep 7, 2010 at 10:35 PM, Patrick McClory wrote:
Hello,

I'm working on a project which requires validation of xml documents against .xsd schemas.  We both create 
xml documents from scratch, and create xml docs from char * buffers read from a socket.  I've run into 
trouble validating the docs created from buffers, even when the buffers are generated from a document that 
already validated successfully.

For example I have the following schema in a file called "example.xsd":

<?xml version="1.0" encoding="UTF-8"?>
<!--
 A simple test schema
-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
   targetNamespace="http://localhost/test_namespace"
   xmlns="http://localhost/test_namespace">

   <xs:complexType name="testType">
     <xs:sequence minOccurs="1" maxOccurs="1">
       <xs:element name="element1" type="xs:string"/>
       <xs:element name="element2" type="xs:int"/>
     </xs:sequence>
   </xs:complexType>

   <xs:element name="testInstance" type="testType"/>
</xs:schema>

The following code creates an xml doc from scratch, which validates.  It then dumps that doc into a buffer, 
reads that buffer into a new doc, and tries to validate that, but the second validation fails:

(snip C code)
The output generated when this runs is:

generated doc is valid
element element1: Schemas validity error : Element '{http://localhost/test_namespace}element1': This 
element is not expected. Expected is ( element1 ).
doc from buffer is invalid

I dump both docs to output files (generated.xml and buffer.xml), and I confirmed that they're the same on 
disk using diff.

The problem seems to be that when libxml reads from the buffer it attaches the parent namespace to the 
children (if it isn't specified), which later causes validation to fail.  Is this a common problem?  Is 
there a standard workaround for this?


If you run "xmllint --schema example.xsd generated.xml", you'll get
the same error. Same with buffer.xml too.

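In generated.xml, the default namespace of
"http://localhost/test_namespace" applies to all elements, including
<element1> and <element2>. However, in the xsd the targetNamespace
applies only to the top-level <element name="testInstance"> (I think):
elementFormDefault is not set, so it defaults to "unqualified" and the
locally declared element1 and element2 are expected in no namespace.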

So either the in-memory validation is incorrect or the writing and
re-reading of the XML document is not a null operation.
xmlDocDumpFormatMemory may be at fault. To validate, element1 and
element2 would each need an xmlns="".

The following validates with example.xsd:

<testInstance xmlns="http://localhost/test_namespace">
  <element1 xmlns="">foo</element1>
  <element2 xmlns="">1</element2>
</testInstance>

Running under the debugger:
(gdb) p *root_node
$1 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0xdca3e8
"testInstance", children = 0xdca468, last = 0xdca508, parent =
0xdca330, next = 0x0, prev = 0x0, doc = 0xdca330, ns = 0xdca400, ...}
(gdb) p *root_node->children
$2 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0xdca4a8
"element1", children = 0xdca4b8, last = 0xdca4b8, parent = 0xdca3a8,
next = 0xdca508, prev = 0x0, doc = 0xdca330, ns = 0xdca448,  ...}

Note the different ns for the root and the first child.

(gdb) p *new_doc ->children
$5 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0xdcaa2b
"testInstance", children = 0xdcaef0, last = 0xdcaf88, parent =
0xdcadf0, next = 0x0, prev = 0x0, doc = 0xdcadf0, ns = 0xdcaea8, ...}
(gdb) p *new_doc ->children->children
$6 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0xdcaa58
"element1", children = 0xdcaf30, last = 0xdcaf30, parent = 0xdcc9b8,
next = 0xdcaf88, prev = 0x0, doc = 0xdcadf0, ns = 0xdcaea8,  ...}

This time, the ns member has the same value for the root element and its child.


-- 
Life is complex, with real and imaginary parts.
"Ok, it boots. Which means it must be bug-free and perfect. " -- Linus Torvalds
"People disagree with me. I just ignore them." -- Linus Torvalds


