[xml] SAX interface and xmlSubstituteEntitiesDefault()



The swish-e search engine uses libxml2's SAX interface for parsing
HTML and XML.  A swish-e user asked how to get external entities
included in the output from the SAX parser.  For example:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <!DOCTYPE article [
        <!ENTITY xmlfrag SYSTEM "other.data" >
    ]>
    <article>
      &xmlfrag;
    </article>

We call xmlSubstituteEntitiesDefault(1) in our code, but the entity
is not replaced in the output as I had hoped.

I'm not that experienced with libxml2, so I'm not clear whether this
should work.  I have seen the warning at the bottom of
http://www.xmlsoft.org/entities.html, but I can't tell whether it
explains what I'm seeing.  SAX is the preferred interface for our
application.

Can someone help clear up my confusion on this issue?  And is there a
way to get these entities expanded in the SAX parser output?

BTW -- for the above I see this error:

    include.xml:6: error: Entity 'xmlfrag' not defined
      &xmlfrag;
               ^

Is xmlfrag not defined in the DOCTYPE above?


On a related note, I'm using the SAX interface example by James
Henstridge linked from http://www.xmlsoft.org/interface.html.  However,
the API menu now lists a SAX2 interface, and the one I'm using is
marked deprecated.  I briefly looked through the changes file and tried
searching for info on the change to SAX2 (but "sax" is a rather common
search term).  Is there a guide or list discussion someone can point
me to that covers this change and what is required to move to the new
interface, and if/when the old SAX interface might no longer be
supported?

Thanks very much,


-- 
Bill Moseley
moseley hank org



