[xml] Parsing XML in embedded environment


I've been struggling with this for two whole days. I have read all of the libxml docs and the library source, but I'm no further along.

We're writing an app for a small, resource-limited device. It parses XML files. We don't need validation or any extras; we just need to get at the contents. The problem is that by default, libxml knows only the five predefined XML entities. We cannot use the library's default file system DTD "catalog", because the device basically has no file system of that kind.

So I'm looking for a way to do this dynamically. Everything has to happen in memory, with no file or even network access. We have to keep it as simple and fast as possible, so we're not using callbacks such as those of the SAX interface. We simply call "*xmlReadMemory*", then recursively walk the returned "*xmlDoc*" and get what we need. The problem is that libxml doesn't know our custom entities, so the parsed text is not correct. I've been hitting a wall ever since. I just can't find any way of supplying libxml with the extra entities, and I can't even think of how to add one by modifying the library. It could be done programmatically, via a function that defines new entities, or by letting libxml preload our DTD that contains all the definitions (that would be the preferred way).

I've tried "xmlAddDocEntity(xmlDocPtr,...)", but that's obviously available only *after* the document is created/parsed, and that's too late: we get "undefined entity" errors *during* parsing. Somehow, I need to set up libxml with a DTD, then call xmlReadDoc/xmlReadMemory, and finally walk the returned tree and extract the tags we need. I'm sure I must be overlooking something, because this is a reasonable requirement: many apps can't access the file system and/or the network to give libxml on-the-fly DTD access based on PUBLIC IDs and external catalogs.
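
One workaround that might fit these constraints, sketched here under assumptions rather than confirmed against libxml: splice a DOCTYPE with an internal subset that declares the custom entities into the in-memory buffer before it is parsed. The helper name prepend_doctype, the constant k_entities, and the entity names in it are hypothetical. The sketch assumes the input has no DOCTYPE of its own, no byte-order mark, and an ASCII-compatible encoding.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entity set; the DOCTYPE name and the entities are placeholders
 * for whatever the real documents use. */
static const char k_entities[] =
    "<!DOCTYPE doc ["
    "<!ENTITY copy \"&#169;\">"
    "<!ENTITY nbsp \"&#160;\">"
    "]>";

/* Hypothetical helper (not part of libxml): returns a newly malloc'ed,
 * NUL-terminated copy of `xml` with `doctype` spliced in ahead of the root
 * element. If the buffer starts with "<?xml ... ?>", the DOCTYPE goes right
 * after it, because an XML declaration has to stay first.
 * Returns NULL on allocation failure; the caller frees the result. */
char *prepend_doctype(const char *xml, size_t xml_len,
                      const char *doctype, size_t *out_len)
{
    size_t split = 0;   /* offset in `xml` where the DOCTYPE is inserted */
    size_t dt_len;
    size_t i;
    char *out;

    assert(xml != NULL && doctype != NULL);

    if (xml_len >= 5 && memcmp(xml, "<?xml", 5) == 0) {
        /* Skip past the closing "?>" of the leading declaration. */
        for (i = 5; i + 1 < xml_len; i++) {
            if (xml[i] == '?' && xml[i + 1] == '>') {
                split = i + 2;
                break;
            }
        }
    }

    dt_len = strlen(doctype);
    out = malloc(xml_len + dt_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, xml, split);                                /* declaration, if any */
    memcpy(out + split, doctype, dt_len);                   /* injected DOCTYPE */
    memcpy(out + split + dt_len, xml + split, xml_len - split); /* rest of document */
    out[xml_len + dt_len] = '\0';

    if (out_len != NULL)
        *out_len = xml_len + dt_len;
    return out;
}
```

The returned buffer and length would then go to xmlReadMemory in place of the original document, for example with k_entities as the doctype argument. Passing the XML_PARSE_NOENT option should make libxml substitute the declared entities in the resulting tree, but that part is untested here. The cost is one extra copy of the document in memory, which may matter on a small device.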

Can somebody here help us solve this?

Thank you very much,

Dave Lister
