Re: [xml] resolveEntity and SAX



Thanks very much for that - I've spent the afternoon looking at the source and examples and your mail beautifully fills in the gaps.  

My code is SAX, but it's going to be easier to turn it around and use the Reader API than to try to make SAX and entities work. I thought I couldn't use the reader because there was no way to give it an external entity loading callback (no access to the xmlParserCtxt or the SAX callbacks), but I just found the global xmlSetExternalEntityLoader() function, which I expect will let me hook in and provide the parser with the DTD. I'm going to try that and see how I get on.

Thanks again. 

On 16 Nov, 2012, at 2:53 PM, Daniel Veillard <veillard redhat com> wrote:

On Fri, Nov 16, 2012 at 12:36:14AM +0800, Roland King wrote:
I feel sure this is simple but after a day of reading code and googling I'm not getting it.

 No, it is not simple. SAX and entity handling are not simple in
 libxml2. See the warning in red at the bottom of
  http://xmlsoft.org/entities.html

I want to implement a resolveEntity callback on a simple SAX parser in libxml2 so I can supply a DTD. I have the following:

xmlSAXHandlerPtr hdlrPtr = calloc( 1, sizeof( xmlSAXHandler ) );

hdlrPtr->startDocument = &startDocument;
hdlrPtr->endDocument = &endDocument;
hdlrPtr->resolveEntity  = &resolveEntity;

int result = xmlSAXUserParseMemory( hdlrPtr, NULL, buf, size );

My resolveEntity callback is never called, even when the parser is fed a document which has a SYSTEM identifier in the DOCTYPE. The startDocument and endDocument callbacks are called.

<!DOCTYPE testxml SYSTEM "TestXML.dtd">
...

Is this the wrong callback? I see that the xmlParserOption enum includes DTD loading and validation, but nothing in the simple SAX interface seems to use it; it appears to live a level further up. I'm just trying to use xmlSAXUserParseMemory() and calls at that level. Since the handler contains a resolveEntity function, I would have expected it to be called.

What basic misunderstanding do I have here?

 By default the libxml2 XML parser does not fetch any of the external
 subset. You really need to give it the DTD loading or validation option.
You can't do that with xmlSAXUserParseMemory().
Create a parser context, then set your SAX callback block and then
call xmlCtxtReadMemory(). There are no convenient APIs for doing this;
SAX processing and entity handling are difficult to handle properly.

 Unless you really have existing code relying on SAX, I really suggest
looking at the Reader API instead.

Daniel

--
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/


