Re: [xml] Substitution of nested entity references



On Mon, Apr 08, 2002 at 01:25:44PM +0200, Henke, Markus wrote:
Hello Daniel,

sorry to broach this topic again, but I had hoped
that the approach mentioned below might be worth
a comment...  8)

  Okay, let's see ...

Kidding aside, I really don't want to press for something
or get on anyone's nerves. It's just that my
(libxml2 based) application (an API for COBOL) is
close to a first release version and I have to decide
whether to handle the substitution of nested
entity references at application level or to rely
on libxml.
If the substitution is not intended to happen in
libxml, it may be a good idea to point this out in
the docs/reference for xmlNodeGetContent(),
xmlGetProp() and the like!?


TIA, Markus

-----Original Message-----
From: Henke, Markus 
Sent: Wednesday, March 20, 2002 4:08 PM
To: 'xml gnome org'
Subject: RE: [xml] xmlNodeGetContent() for XML_ENTITY_REF_NODE


Hello,

please excuse the long response time, but there
were other things to take care of (among them some holidays...8).

In the meantime I've written a routine that does the (recursive)
substitution for (nested) entity references, and it works fine for
me so far (code included below).
Anyhow, there are some points to clarify before it may be useful
for libxml2:

- Is it possible that someone can get an xmlDocPtr to a
  non-well-formed (possibly due to cyclic entity references)
  document? This could cause serious problems in this
  routine... 8)

  Not possible, at least not for a document which is the result of the
parsing process.
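
For trees that do not come straight from the parser (built or modified by hand, for instance), a substitution routine can still protect itself against cycles with a nesting limit. The following is a toy model in plain C, not libxml2 code: the struct toy_entity table, toy_expand(), toy_expand_str(), the sample tables and the limit of 16 are all made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy entity table; this is not a libxml2 data structure. */
struct toy_entity {
    const char *name;
    const char *value;
};

/* Arbitrary nesting limit chosen for this sketch. */
#define TOY_MAX_DEPTH 16

/* Illustrative tables: one with ordinary nesting, one with a cycle. */
const struct toy_entity TOY_NESTED[] = {
    { "inner", "x" },
    { "outer", "a &inner; b" },
};
const struct toy_entity TOY_CYCLIC[] = {
    { "a", "&b;" },
    { "b", "&a;" },
};

/* Find the replacement text for a name of the given length, or NULL. */
static const char *toy_lookup(const struct toy_entity *tab, size_t n,
                              const char *name, size_t len)
{
    for (size_t i = 0; i < n; i++)
        if (strlen(tab[i].name) == len && strncmp(tab[i].name, name, len) == 0)
            return tab[i].value;
    return NULL;
}

/*
 * Append the expansion of `text` to the NUL-terminated string in `out`
 * (capacity `cap`). A '&' with no following ';' is copied literally.
 * Returns 0 on success, -1 on an unknown entity, on nesting deeper than
 * TOY_MAX_DEPTH (which is how a cycle shows up), or on overflow.
 */
int toy_expand(const struct toy_entity *tab, size_t n, const char *text,
               char *out, size_t cap, int depth)
{
    if (depth > TOY_MAX_DEPTH)
        return -1;
    size_t used = strlen(out);
    while (*text != '\0') {
        const char *semi = (*text == '&') ? strchr(text, ';') : NULL;
        if (semi != NULL) {
            const char *value =
                toy_lookup(tab, n, text + 1, (size_t)(semi - text - 1));
            if (value == NULL ||
                toy_expand(tab, n, value, out, cap, depth + 1) != 0)
                return -1;
            used = strlen(out);   /* the recursive call appended to out */
            text = semi + 1;
        } else {
            if (used + 1 >= cap)  /* need room for the char and the NUL */
                return -1;
            out[used++] = *text++;
            out[used] = '\0';
        }
    }
    return 0;
}

/* Convenience wrapper using a static buffer; returns NULL on failure. */
const char *toy_expand_str(const struct toy_entity *tab, size_t n,
                           const char *text)
{
    static char buf[256];
    buf[0] = '\0';
    return toy_expand(tab, n, text, buf, sizeof buf, 0) == 0 ? buf : NULL;
}
```

With these tables, toy_expand_str(TOY_NESTED, 2, "&outer;") yields "a x b", while toy_expand_str(TOY_CYCLIC, 2, "&a;") gives up and returns NULL instead of recursing forever.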

- Is xmlGetDocEntity() the proper way to look up the entity
  declaration in that case? There is also xmlGetDtdEntity(),
  I'm not sure about the difference. Both take an xmlDocPtr
  as parameter (and I must confess that I haven't had the
  time so far to examine the source for that...)

  Use xmlGetDocEntity(); it will look in both the internal subset and
the external subset for the entity definitions.

- I've assumed that an xmlEntity has only child nodes of
  type XML_ENTITY_REF_NODE or XML_TEXT_NODE. Is this correct,
  and if not, which additional cases have to be taken into
  consideration?

  Well, an entity's content can be any "well balanced chunk", i.e.
the result of the content [43] production:

    http://www.w3.org/TR/REC-xml#NT-content

I would be glad to improve this routine if there is any
chance that it could be used in libxml2 to resolve nested
entity references. I guess at least xmlNodeGetContent() and
xmlGetProp() [xmlGetNsProp()] should do so, since the
reference says, e.g.:
... Entity references are substituted...
... This does the entity substitution...


  Hmm, the best approach would be to test xmlResolveEntityRef()
against the result of running the parser with entity substitution
turned on. If the results were the same for the set of files in the
test suite, that would be an indication that the code is good
enough for inclusion directly in libxml2.

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


