[xml] help with xmlTextReaderRead (with DTD validation)



Although I have contributed to a few (free) scientific applications (in C and F77) for quite some time, I'm new to XML and libxml2, which may explain the naiveté of my question.
 
I'm writing a C parser for an XML input (input only, no file editing or writing out), so the xmlTextReader API (http://xmlsoft.org/xmlreader.html#L1142) seems very appropriate. I have gone through the tutorial and am now integrating DTD validation. Validation correctly flags errors, but I can't find a way to halt the scan as soon as the first validity error occurs: the code below, lightly adapted from the tutorial, simply keeps going and parses the entire file despite the validity errors reported at the very start.
 
--- C code ---
1.  #include <stdlib.h>
2.  #include <stdio.h>
3.  #include <libxml/xmlreader.h>
4.
5.  static void processNode( xmlTextReaderPtr reader )
6.  {
7.     xmlChar *name, *value;
8.     name = xmlTextReaderName( reader );
9.     if( name != NULL ) printf( "element '%s'\n", name );
10.    xmlFree( name );
11. }
12.
13. void streamXMLFile(char *filename)
14. {
15.    xmlTextReaderPtr reader;
16.    int ret = 0;
17.
18.    reader = xmlNewTextReaderFilename( filename );
19.
20.    if( reader != NULL )
21.    {
22.       ret = xmlTextReaderSetParserProp( reader, XML_PARSER_VALIDATE, 1 );
23.       printf( "ret = %d\n", ret );
24.       ret = xmlTextReaderRead( reader );
25.       printf( "ret = %d\n", ret );
26.       while( ret == 1 )
27.       {
28.          processNode( reader );
29.          ret = xmlTextReaderRead( reader );
30.          printf( "ret = %d\n", ret );
31.       }
32.       xmlFreeTextReader(reader);
33.    }
34.    else printf("Unable to open %s\n", filename);
35. }
--- end of C code ---
 
 
--- example XML file (with DTD error: lines 7. versus 12.) ---
1.  <?xml version="1.0"?>
2.  <!DOCTYPE a [
3.    <!ELEMENT a (b*)>
4.    <!ELEMENT b (c,d)>
5.    <!ATTLIST b r ID #REQUIRED>
6.    <!ELEMENT c (#PCDATA)>
7.    <!ELEMENT d (#PCDATA)>
8.  ]>

9.  <a>
10. <b r="one">
11. <c>un</c>
12. <duh>0.10</duh>
13. </b>

14. <b r="two">
15. <c>deux</c>
16. <d>0.20</d>
17. </b>

18. </a>
--- end of example XML file ---
 
--- screen output (using the DTD-invalid XML file above) ---
ret = 0
test.xml:15: element duh: validity error : No declaration for element duh
<duh>0.10</duh>
               ^
test.xml:16: element b: validity error : Element b content does not follow the DTD, expecting (c , d), got (c duh )
</b>
    ^
ret = 1
element 'a'
element 'a'
element '#text'
element 'b'
element '#text'
element 'c'
element '#text'
element 'c'
element '#text'
test.xml:24: element b: validity error : Element b content does not follow the DTD, Misplaced duh
^
test.xml:24: element duh: validity error : No declaration for element duh
^
element 'duh'
element '#text'
element 'duh'
element '#text'
element 'b'
element '#text'
element 'b'
element '#text'
element 'c'
element '#text'
element 'c'
element '#text'
element 'd'
element '#text'
element 'd'
element '#text'
element 'b'
element '#text'
element 'a'
--- end of screen output ---

Question 1: How does one halt the entire process and alert the user as soon as the first validity error occurs (e.g. by testing the variable ret, or perhaps by redirecting the libxml2 error messages from the screen into a string buffer)?
 
Question 2: The screen output shows that the DTD error (lines 7. versus 12. of the XML file) is already flagged during the first xmlTextReaderRead call (line 24. of the C code). If the xmlTextReader only scans one node at a time, how can it flag an error that only occurs further down the file? Is it building an entire DOM tree to do this?
 
Thank you in advance for your help!



