Although I have contributed to a few (free) scientific applications (in C and F77) for quite some time, I'm new to XML and libxml2 (this may explain the naiveté of my question).

I'm writing a C parser for an XML input (input only, no file editing or writing out), so the xmlTextReader API (http://xmlsoft.org/xmlreader.html#L1142) seems very appropriate. I have gone through the tutorial and am now in the process of integrating DTD validation. The DTD validation correctly flags errors, but I can't work out what code to add to halt the scanning process as soon as the first error occurs. The code below, lightly adapted from the tutorial, simply goes on and on, parsing the entire file despite the initial DTD errors.

--- C code ---
 1. #include <stdlib.h>
 2. #include <stdio.h>
 3. #include <libxml/xmlreader.h>
 4.
 5. static void processNode( xmlTextReaderPtr reader )
 6. {
 7.     xmlChar *name;
 8.     name = xmlTextReaderName( reader );
 9.     if( name != NULL ) printf( "element '%s'\n", name );
10.     xmlFree( name );
11. }
12.
13. void streamXMLFile(char *filename)
14. {
15.     xmlTextReaderPtr reader;
16.     int ret = 0;
17.
18.     reader = xmlNewTextReaderFilename( filename );
19.
20.     if( reader != NULL )
21.     {
22.         ret = xmlTextReaderSetParserProp( reader, XML_PARSER_VALIDATE, 1 );
23.         printf( "ret = %d\n", ret );
24.         ret = xmlTextReaderRead( reader );
25.         printf( "ret = %d\n", ret );
26.         while( ret == 1 )
27.         {
28.             processNode( reader );
29.             ret = xmlTextReaderRead( reader );
30.             printf( "ret = %d\n", ret );
31.         }
32.         xmlFreeTextReader(reader);
33.     }
34.     else printf("Unable to open %s\n", filename);
35. }
--- end of C code ---

--- example XML file (with DTD error: lines 7. versus 12.) ---
 1. <?xml version="1.0"?>
 2. <!DOCTYPE a [
 3. <!ELEMENT a (b*)>
 4. <!ELEMENT b (c,d)>
 5. <!ATTLIST b r ID #REQUIRED>
 6. <!ELEMENT c (#PCDATA)>
 7. <!ELEMENT d (#PCDATA)>
 8. ]>
 9. <a>
10.   <b r="one">
11.     <c>un</c>
12.     <duh>0.10</duh>
13.   </b>
14.   <b r="two">
15.     <c>deux</c>
16.     <d>0.20</d>
17.   </b>
18. </a>
--- end of example XML file ---

--- screen output (using the DTD incorrect XML file above) ---
ret = 0
test.xml:15: element duh: validity error : No declaration for element duh
<duh>0.10</duh>
^
test.xml:16: element b: validity error : Element b content does not follow the DTD, expecting (c , d), got (c d uh )
</b>
^
ret = 1
element 'a'
element 'a'
element '#text'
element 'b'
element '#text'
element 'c'
element '#text'
element 'c'
element '#text'
test.xml:24: element b: validity error : Element b content does not follow the DTD, Misplaced duh
^
test.xml:24: element duh: validity error : No declaration for element duh
^
element 'duh'
element '#text'
element 'duh'
element '#text'
element 'b'
element '#text'
element 'b'
element '#text'
element 'c'
element '#text'
element 'c'
element '#text'
element 'd'
element '#text'
element 'd'
element '#text'
element 'b'
element '#text'
element 'a'
--- end of screen output ---

Question 1: How does one halt the entire process and alert the user once the first error occurs (e.g. by using the variable ret, or perhaps by redirecting the error messages printed to the screen into a string buffer)?

Question 2: The screen output indicates that the DTD error (lines 7. versus 12.) is flagged as soon as the first xmlTextReaderRead call is made (line 24.). If the xmlTextReader only scans one element node at a time, how can it flag this error up front, when the offending element only occurs later? Is it building an entire DOM tree to do this?

Thank you in advance for your help!