Re: [xml] Useless function calls in xmlSetProp()?
- From: Daniel Veillard <veillard redhat com>
- To: Julien Charbon <jch 4js com>
- Cc: xml gnome org
- Subject: Re: [xml] Useless function calls in xmlSetProp()?
- Date: Fri, 25 Jan 2008 08:47:40 -0500
On Fri, Jan 25, 2008 at 02:39:22PM +0100, Julien Charbon wrote:
Daniel Veillard wrote:
On Fri, Jan 25, 2008 at 11:33:05AM +0100, Julien Charbon wrote:
Hi all,
It seems that the function calls:
buffer = xmlEncodeEntitiesReentrant(doc, value);
list = xmlStringGetNodeList(doc, buffer);
can be exactly replaced by a simple:
list = xmlNewDocText(doc, value);
You will find these calls in tree.c, more precisely in
xmlNewPropInternal() and in xmlSetNsProp(), both called by xmlSetProp().
In fact, everything that xmlEncodeEntitiesReentrant() does is exactly
undone by xmlStringGetNodeList(). Are there any
technical/practical/historical reasons to keep these calls in tree.c?
Below is a patch that does this replacement on current trunk [just to
illustrate my concern]. Our application and libxml2's "make tests" are
happy with this change.
I don't believe the patch is right, because an attribute's
list of children can be a list of text nodes and entity references,
and your patch reduces it to just the case where you don't have an
entity reference in the attribute value. Even if broken parser APIs
like SAX let people believe that attribute values can only be made
of one text node, this is not true from the spec POV, and libxml2, which
was designed as an editing toolkit, allows maintaining entity references
in attribute values.
Thanks for your fast and clear answer. I totally agree with it,
but... with the current implementation, and in this case:
(1) buffer = xmlEncodeEntitiesReentrant(doc, value);
(2) list = xmlStringGetNodeList(doc, buffer);
xmlStringGetNodeList() will always return a list with only one
XML_TEXT_NODE element, because xmlEncodeEntitiesReentrant() escapes every
'&' as '&amp;'. To be clear, if value is "&myent;", after (1) buffer will
be set to "&amp;myent;" and after (2) list will contain only one
XML_TEXT_NODE element with its content set to "&myent;".
Thus:
"&myent;" -> (1) -> "&amp;myent;" -> (2) -> "&myent;"
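The round trip above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not libxml2 code: escape_ampersands() and count_entity_refs() are hypothetical helpers that imitate step (1) and the reference detection of step (2) with plain string handling, and the model handles only '&', whereas the real xmlEncodeEntitiesReentrant() escapes other characters as well.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of step (1): return a malloc'd copy of `in` in which
 * every '&' is replaced by "&amp;". The caller frees the result. */
char *escape_ampersands(const char *in)
{
    assert(in != NULL);

    /* First pass: compute the size of the escaped string. */
    size_t n = 0;
    for (const char *p = in; *p; p++)
        n += (*p == '&') ? 5 : 1;

    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy, expanding each '&'. */
    char *q = out;
    for (const char *p = in; *p; p++) {
        if (*p == '&') {
            memcpy(q, "&amp;", 5);
            q += 5;
        } else {
            *q++ = *p;
        }
    }
    *q = '\0';
    return out;
}

/* Hypothetical model of the detection in step (2): count "&name;" sequences
 * (alphanumeric names only) other than "&amp;", i.e. the sequences that
 * would have to become entity reference nodes instead of plain text. */
int count_entity_refs(const char *s)
{
    assert(s != NULL);

    int count = 0;
    for (const char *p = s; *p; p++) {
        if (*p != '&')
            continue;
        const char *q = p + 1;
        while (isalnum((unsigned char)*q))
            q++;
        if (*q == ';' && q > p + 1) {
            size_t len = (size_t)(q - (p + 1));
            if (!(len == 3 && strncmp(p + 1, "amp", 3) == 0))
                count++;
            p = q; /* resume scanning after the ';' */
        }
    }
    return count;
}
```

In this model count_entity_refs("&myent;") is 1, but on the escaped string "&amp;myent;" it is 0: once every '&' has been escaped, no entity reference is left for step (2) to find, so only text remains.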
Argh, right... I'm afraid the escaping was added as an afterthought;
it was not supposed to be that way. Oh well, one can still build
complex attribute values 'by hand' with the help of the API, but I think
we somehow defeated the initial purpose of the xmlStringGetNodeList() call.
With current libxml2 trunk, it gives me:
$ gcc test-xml-tiny.c -o test-xml-tiny $(xml2-config --cflags) \
$(xml2-config --libs)
$ ./test-xml-tiny
Only one element in return of xmlStringGetNodeList
&foo; &bar; & <tag> &myent </tag> &&
No change. Maybe, historically, this was not always the case...
Hum, yes. The only other thing that your suggested change would lose
is the error messages resulting from the validation occurring in
xmlEncodeEntitiesReentrant(); problems reported there would otherwise
go unnoticed. Whether that is still worth the extra complexity, I'm not sure.
Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/