Re: [xml] Substitution of nested entity references



On Wed, Apr 10, 2002 at 07:16:38PM +0200, Henke, Markus wrote:
Now I see! We have been talking at cross purposes, i.e. we have
different ideas of "nested entity references".
Try your check with a document like

=============================================================
<?xml version="1.0" encoding="iso-8859-1" standalone="yes" ?>

<!DOCTYPE aElement [
  <!ELEMENT aElement ANY>
  <!ATTLIST aElement aAttr CDATA #REQUIRED>
  <!ENTITY aNestedEnt "i_am_a_nested_entity">
  <!ENTITY aEnt "&aNestedEnt;">
]>

<aElement aAttr="Text &aEnt; MoreText">
  Content &aEnt; MoreContent
</aElement>
=============================================================

You'll get "Text &aNestedEnt; MoreText" !!!
*That's* what I've been talking about all the time... 8)

  Argh, apologies, 

import libxml2
doc = libxml2.parseFile("tst.xml")   # tst.xml holds the document quoted above
root = doc.getRootElement()
print(root.prop("aAttr"))
# output: Text &aNestedEnt; MoreText


  Okay, so you're just gathering the content of the subtree. It seems to
me that xmlNodeGetContent() does just this. The only change which may
be needed is in the handling of XML_ENTITY_REF_NODE nodes: recurse
into node->children when the node found is such an entity ref.

Yep, at the moment it just gathers "ent->content", where "ent" is
the corresponding entity declaration; it doesn't care about what
child nodes (e.g. nested entity refs) the declaration includes.
Therefore xmlNodeGetContent() behaves similarly to xmlGetProp().
With an xml document like the one above you'll get

  xmlNodeGetContent(xmlDocGetRootElement(docPtr))
  == "Content &aNestedEnt; MoreContent";

which IMHO isn't correct (or at least not what I would've expected).

  Exactly,

Maybe you could revise my approach under this new perspective
(provided that I'm right and we've talked at cross purposes
before)?

  Yes, simply fix xmlNodeGetContent() and xmlNodeListGetString(,,1) to
behave correctly in this case. I actually think it is a libxml2 bug, and
I would appreciate it if you could take care of it, since you have
explored much of it. Simply recurse on the entity refs' content to
complete the work currently handled only at level 1 of entity
indirection. As explained before, this cannot loop, since a loop in
entity references would be a well-formedness error.
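
  To make the difference concrete, here is a minimal sketch in plain
Python. It is a toy model only: the Node class, the ENTITIES table and
both functions are hypothetical names made up for illustration, not the
libxml2 API and not the actual patch. It contrasts the current one-level
substitution with the recursive one, using the entities from the test
document above.

```python
# Toy model only: hypothetical names, not the libxml2 API.
from dataclasses import dataclass
from typing import Dict, List

TEXT, ENTITY_REF = "text", "entity_ref"


@dataclass
class Node:
    kind: str   # TEXT or ENTITY_REF
    value: str  # text content, or the entity name for a reference


# Each entity declaration maps to the node list of its replacement text.
ENTITIES: Dict[str, List[Node]] = {
    "aNestedEnt": [Node(TEXT, "i_am_a_nested_entity")],
    "aEnt": [Node(ENTITY_REF, "aNestedEnt")],
}


def content_one_level(nodes: List[Node]) -> str:
    """Mimic the reported behaviour: substitute only the first level."""
    out = []
    for n in nodes:
        if n.kind == TEXT:
            out.append(n.value)
        else:
            # Take the declaration's content as-is; nested refs stay "&name;".
            for child in ENTITIES[n.value]:
                if child.kind == TEXT:
                    out.append(child.value)
                else:
                    out.append("&%s;" % child.value)
    return "".join(out)


def content_recursive(nodes: List[Node]) -> str:
    """Recurse into the children of every entity reference."""
    out = []
    for n in nodes:
        if n.kind == TEXT:
            out.append(n.value)
        else:
            # Terminates because well-formed entities cannot reference
            # themselves, directly or indirectly.
            out.append(content_recursive(ENTITIES[n.value]))
    return "".join(out)


# Node list standing in for aAttr="Text &aEnt; MoreText"
attr = [Node(TEXT, "Text "), Node(ENTITY_REF, "aEnt"), Node(TEXT, " MoreText")]
print(content_one_level(attr))   # Text &aNestedEnt; MoreText
print(content_recursive(attr))   # Text i_am_a_nested_entity MoreText
```

  The recursive variant relies on the parser having already rejected
entity loops; a function working on unchecked input would need its own
guard against them.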

  thanks !

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


