Re: [xml] Behaviour of xmlNodeAddContent() vs. xmlNodeSetContent()



Hi Daniel,

OK, I've updated the function comment of xmlNodeSetContent()
with a note similar to xmlNewDocNode(), which works on the
same level AFAICS, something along the lines of:

* NOTE: @content is supposed to be a piece of XML CDATA, so it allows entity
*       references, but XML special chars need to be escaped first by using
*       xmlEncodeEntitiesReentrant() resp. xmlEncodeSpecialChars().

Additionally, xmlNodeAddContent() has a note on the different behaviour
WRT xmlNodeSetContent() and an explicit hint *not* to call
xmlEncodeEntitiesReentrant().

I would've sent the patch, but while going through the source
of these calls, I noticed another, maybe more serious issue.
The note above (which appears in the docs for several API calls,
e.g. xmlNewDocNode() and xmlNewChild()) is not correct IMHO,
these calls provide no entity support, at least not for arbitrary
input.
As correctly documented, you have to call xmlEncodeEntitiesReentrant(),
resp. xmlEncodeSpecialChars(), since special XML characters have
to be replaced on that level. But a call to these functions will
also replace the ampersand of a possible entity reference in the
content buffer.
I've tested it, e.g. calling
xmlNodeSetContent(node, BAD_CAST "&myEnt;")
with a declared entity "myEnt" will create an XML_ENTITY_REF_NODE
child node with name "myEnt" and the declared content. I'd say that's
what's meant by entity support.
But for arbitrary content, we have to call xmlEncodeEntitiesReentrant()
first, and calling
xmlNodeSetContent(node,
 xmlEncodeEntitiesReentrant(node->doc, BAD_CAST "&myEnt;"))
will result in an XML_TEXT_NODE child node with content "&myEnt;",
which will be serialized to "&amp;myEnt;" if we dump the node to disk!

I first thought that I must be wrong, because this would be quite
a concern, but I've tested it and get the above mentioned behaviour.

Any thoughts?


Ciao, Markus



-----Original Message-----
From: Daniel Veillard [mailto:veillard redhat com]
Sent: Wednesday, 25 October 2006 14:04
To: Keim, Markus
Cc: xml gnome org
Subject: Re: [xml] Behaviour of xmlNodeAddContent() vs. xmlNodeSetContent()


On Wed, Oct 25, 2006 at 02:51:32PM +0200, Keim, Markus wrote:
Hi Daniel,

well, comparing the documentation of both calls
(I've actually done so before asking) doesn't make
that different behaviour *that* obvious.

  Agreed, xmlNodeAddContent is lower level, working at the text node
string level while xmlNodeSetContent is more at the 
serialization level.
A lot of this is just history, not always very rational, and nearly
impossible to change. Though if you want to submit a patch to make 
the function comments more explicit, sure !

However, a statement from you that both calls behave as intended
is absolutely adequate (and appreciated) for my issue; I'd have
to adjust the application, then.

So, if I got this right, that would mean:
- always pass user input through xmlEncodeEntitiesReentrant()
  (resp. xmlEncodeSpecialChars()) before calling
  xmlNodeSetContent()

- never do so before calling xmlNodeAddContent()
?

  Yes that's my understanding :-)

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/



