Thanks very much for that - I've spent the afternoon looking at the source and examples and your mail beautifully fills in the gaps.
My code is SAX, but it's going to be easier to turn it around and make it use the Reader API than to try to make SAX and entities work. I thought I couldn't use the reader because there was no way to give it an external entity loading callback (no access to the xmlParserCtxt or the SAX callbacks), but I just found the global xmlSetExternalEntityLoader() function, which I expect will let me hook in to provide the parser with the DTD. I'm going to try that and see how I get on.
Thanks again.
On Fri, Nov 16, 2012 at 12:36:14AM +0800, Roland King wrote:
I feel sure this is simple but after a day of reading code and googling I'm not getting it.
No, it is not simple. SAX and entity handling are not simple in libxml2. See the warning in red at the bottom of http://xmlsoft.org/entities.html
I want to implement a resolveEntity callback on a simple SAX parser in libxml2 so I can supply a DTD. I have the following
xmlSAXHandlerPtr hdlrPtr = calloc( 1, sizeof( xmlSAXHandler ) );
hdlrPtr->startDocument = &startDocument;
hdlrPtr->endDocument = &endDocument;
hdlrPtr->resolveEntity = &resolveEntity;
int result = xmlSAXUserParseMemory( hdlrPtr, NULL, buf, size );
My resolveEntity code is never called, even when fed a document which has a SYSTEM identifier in the DOCTYPE. The startDocument and endDocument callbacks are called.
<!DOCTYPE testxml SYSTEM "TestXML.dtd"> ...
Is this the wrong callback? I see the xmlParserOption enum, which includes DTD loading and validation, but nothing in the simple SAX interface seems to use it; it appears to be a level further up. I'm just trying to use xmlSAXUserParseMemory() and calls at that level. Since the handler contains a resolveEntity function, I would have expected it to be called.
What basic misunderstanding do I have here?
By default the libxml2 XML parser does not fetch any of the external subset. You really need to give it the DTD loading or validation option, and you can't do that with xmlSAXUserParseMemory(). Create a parser context, set your SAX callback block, and then call xmlCtxtReadMemory(). There are no convenient APIs for doing this; SAX processing and entity handling are difficult to handle properly.
Unless you really have existing code relying on SAX, I suggest looking at the Reader API instead.
Daniel
--
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/