RE: [xml] Substitution of nested entity references




-----Original Message-----
From: Daniel Veillard [mailto:veillard redhat com]
Sent: Wednesday, April 10, 2002 6:03 PM
To: Henke, Markus
Cc: 'xml gnome org'
Subject: Re: [xml] Substitution of nested entity references


On Wed, Apr 10, 2002 at 05:24:25PM +0200, Henke, Markus wrote:
Maybe I've expressed myself unclearly. The already built tree contains
XML_ENTITY_REF_NODEs as part of the content of element nodes or attribute
values. As mentioned above, e.g. a call to xmlGetProp() on such an attribute
(e.g. <aElement aAttr="Text &i_contain_an_entity_ref; MoreText" />)
results in something like
value = "Text &i_am_a_nested_entity_ref; MoreText"
which isn't the expected result.

<cut />
 
  So it should *not* return a value = "Text &i_am_a_nested_entity_ref; MoreText"!
And a simple script check confirms this:

paphio:~/XML -> python
Python 1.5.2 (#1, Jul  5 2001, 03:02:19)  [GCC 2.96 20000731 (Red Hat Linux 7.1 2 on linux-i386
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import libxml2
>>> doc = libxml2.parseFile("tst.xml")
>>> root = doc.getRootElement()
>>> print root.prop("aAttr")
Text content MoreText
 
Now I see! We have been talking at cross purposes, or rather we have
different ideas of "nested entity references".
Try your check with a document like

=============================================================
<?xml version="1.0" encoding="iso-8859-1" standalone="yes" ?>

<!DOCTYPE aElement [
  <!ELEMENT aElement ANY>
  <!ATTLIST aElement aAttr CDATA #REQUIRED>
  <!ENTITY aNestedEnt "i_am_a_nested_entity">
  <!ENTITY aEnt "&aNestedEnt;">
]>

<aElement aAttr="Text &aEnt; MoreText">
  Content &aEnt; MoreContent
</aElement>
=============================================================

You'll get "Text &aNestedEnt; MoreText" !!!
*That's* what I have been talking about all along... 8)
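
For comparison, here is a small sketch of the same check using Python's
standard-library ElementTree (which sits on expat) instead of the libxml2
bindings. It is only meant to illustrate the result I would expect from a
complete substitution, not how libxml2 works internally. The document is
the one above, embedded as a string without its XML declaration; the
variable names are my own.

```python
import xml.etree.ElementTree as ET

# The test document from this mail, embedded as a string.
# The XML declaration is left out so the text can be passed as a str.
DOC = """<!DOCTYPE aElement [
  <!ELEMENT aElement ANY>
  <!ATTLIST aElement aAttr CDATA #REQUIRED>
  <!ENTITY aNestedEnt "i_am_a_nested_entity">
  <!ENTITY aEnt "&aNestedEnt;">
]>
<aElement aAttr="Text &aEnt; MoreText">
  Content &aEnt; MoreContent
</aElement>
"""

root = ET.fromstring(DOC)

# Attribute value and element text as delivered by the parser.
attr_value = root.get("aAttr")
text_value = root.text.strip()

print(attr_value)
print(text_value)
```

Here both values should come back with &aEnt; and the nested &aNestedEnt;
already replaced, i.e. without any "&...;" left in the string.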


So why are you trying to remove them? No, really, I can't understand.

I wouldn't either, but I'm not trying to do so.
I want to accumulate the (useful) content of the different node
types (as child nodes of an entity declaration), taking into account the
case where a child node is itself an entity reference (and do a
recursive call to xmlResolveEntityRef() in that case).
The whole point is to enable routines like xmlGetProp(),
xmlNodeGetContent(), ... to return a *complete* substitution
of an entity reference.

  Okay, so you're just gathering content of the subtree. It seems to
me that xmlNodeGetContent() does just this. The only change which may
be needed is to change the handling of XML_ENTITY_REF_NODE nodes to
recurse in the node->children when the node found is such an entity ref.

Yep, at the moment it just gathers "ent->content", where "ent" is
the corresponding entity declaration. It doesn't care about which
child nodes (e.g. nested entity refs) the declaration includes.
Therefore xmlNodeGetContent() behaves similarly to xmlGetProp().
With an xml document like the one above you'll get

  xmlNodeGetContent(xmlDocGetRootElement(docPtr))
  == "Content &aNestedEnt; MoreContent";

which IMHO isn't correct (or at least not what I would have expected).
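
To make the difference concrete, here is a toy model in Python of the two
ways of gathering content. It is a sketch under my own assumptions and
does not use libxml2: the Node class, the ENTITIES table and both function
names are hypothetical stand-ins for the tree nodes and entity
declarations. content_shallow() copies the raw content of the declaration,
as described above; content_deep() recurses into the children of the
declaration whenever it meets an entity reference.

```python
# Hypothetical stand-ins for the node types; this is not the libxml2 API.
TEXT = "text"
ENTITY_REF = "entity_ref"


class Node:
    def __init__(self, kind, value):
        self.kind = kind    # TEXT or ENTITY_REF
        self.value = value  # the text, or the name of the referenced entity


# Each declaration keeps its raw content and the child nodes parsed from it.
ENTITIES = {
    "aNestedEnt": {
        "content": "i_am_a_nested_entity",
        "children": [Node(TEXT, "i_am_a_nested_entity")],
    },
    "aEnt": {
        "content": "&aNestedEnt;",
        "children": [Node(ENTITY_REF, "aNestedEnt")],
    },
}


def content_shallow(nodes):
    """Gather content, using only the raw content of each declaration."""
    parts = []
    for node in nodes:
        if node.kind == ENTITY_REF:
            parts.append(ENTITIES[node.value]["content"])
        else:
            parts.append(node.value)
    return "".join(parts)


def content_deep(nodes, active=()):
    """Gather content, recursing into the children of each declaration."""
    parts = []
    for node in nodes:
        if node.kind == ENTITY_REF:
            # Guard against an entity that (indirectly) refers to itself.
            if node.value in active:
                raise ValueError("recursive entity reference: " + node.value)
            children = ENTITIES[node.value]["children"]
            parts.append(content_deep(children, active + (node.value,)))
        else:
            parts.append(node.value)
    return "".join(parts)


# The attribute value "Text &aEnt; MoreText" as a list of child nodes.
attr = [Node(TEXT, "Text "), Node(ENTITY_REF, "aEnt"), Node(TEXT, " MoreText")]

shallow = content_shallow(attr)
deep = content_deep(attr)
print(shallow)
print(deep)
```

In this model the shallow variant leaves the nested reference in the
result, while the recursive variant substitutes it completely. The
"active" tuple is just a simple guard so that a self-referencing entity
raises an error instead of recursing forever.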

Maybe you could reconsider my approach from this new perspective
(provided that I'm right and we've been talking at cross purposes
before)?
 
Daniel

Thanx again, Markus



