Re: [xml] Parsing XML in embedded environment
- From: David Kubicek <foceni gmail com>
- To: liam w3 org
- Cc: xml gnome org
- Subject: Re: [xml] Parsing XML in embedded environment
- Date: Sat, 11 Jun 2011 10:56:57 +0200
On 06/11/2011 04:01 AM, Liam R E Quin wrote:
> On Sat, 2011-06-11 at 01:02 +0200, David Kubicek wrote:
>> The problem is that by default, libxml knows only the basic 5 XML
>> entities.
> Why is this a problem?
That is not a problem per se. It's correct, I'm just pointing it out to
provide full background.
> XML documents must either stick to those entities or define the ones
> they want to use, so you should not predefine others.
I don't want to predefine others. I have an XML document (with a correct
DOCTYPE) and a DTD that contains all the entity definitions. I just can't
get libxml to load ("to know") this associated DTD before I start
parsing the XML. Unfortunately, there isn't much actual documentation;
it's just a list of prototypes.
>> I just can't find (or even think
>> how to add it by modifying the library) any way of supplying libxml with
>> the extra entities.
> One way might be to override the entity resolver.
Yes, I looked at it some time ago, but it doesn't seem to work and
doesn't have *any* documentation. Apparently, one needs to return
"xmlParserInputPtr", which is only created by "xmlNewInputFromFile".
There is no "xmlNewInputFromMem". Plus, "xmlNewInputFromFile" isn't
documented. Nor could I find how to create a usable "xmlParserInputPtr"
from a memory buffer.
So I dug into the libxml source. I wrote my own "xmlNewInputFromMem" by
mimicking "xmlNewInputFromFile". That is, internally calling something
like this:
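/* mem, size and ctxt are placeholders: the buffer, its length, the parser context */
buf = xmlParserInputBufferCreateMem(mem, size, XML_CHAR_ENCODING_NONE);
if (buf == NULL) return NULL;
input = xmlNewInputStream(ctxt);
if (input == NULL) { xmlFreeParserInputBuffer(buf); return NULL; }
input->buf = buf;
/* base/cur/end pointer setup omitted here */
return input;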
With proper setup and initialization, just like "xmlNewInputFromFile"
does. Then, in main(), before calling "xmlReadMemory", I registered my
own external entity loader like this:
xmlSetExternalEntityLoader(myLoader);
The function "myLoader" prints the URL and ID it is passed, so that I can
see what libxml is trying to load, e.g. whether it is loading the DTD
specified in our XML documents. But after all this setup (extending the
libxml source with "xmlNewInputFromMem" and registering the handler),
nothing happens: myLoader() is never called. The subsequent
xmlReadMemory() call prints the same errors about undefined entities.
> Why not just define the entities in the document?
We receive the XML/DTD from a third party. We cannot hack around this;
the solution should be fast & clean.
Thank you for your quick reply. If you could help me with the external
loader, that would be great.
--
Dave Lister