Re: [xml] ampersand conversion
- From: Igor Zlatkovic <igor zlatkovic com>
- To: oliverst online de
- Cc: xml gnome org
- Subject: Re: [xml] ampersand conversion
- Date: Tue, 17 May 2005 21:22:11 +0200
On 17.05.2005 15:20, oliverst online de wrote:
> Hi,
>
> this code is giving me "error : unterminated entity reference" with
> libxml2-2.6.16:
>
>     xmlNodePtr node = xmlNewNode(NULL, BAD_CAST "test");
>     if (node) {
>         const char* str = "<>\"'&";
>         xmlNodeSetContent(node, BAD_CAST str);
>     }
>
> I know that some specific characters have to be encoded/quoted, but all
> the others are already being encoded internally, and I would just have to
> replace the ampersand myself. Is this intended behavior? It seems a bit
> inconsistent to me to encode all the others but not the ampersand.

I haven't checked the code, so this is without warranty, but this is how
I see it. xmlNodeSetContent creates a child node of type text and puts
the supplied content in it. It does not parse the supplied content as
XML and then inject the result into the tree. For the latter to work in
any sane way, the supplied content would have to be at least a valid
document fragment.
I cannot use a '>' or a '<' within a text node, so it is reasonable that
these get quoted. But I can very well use entity references or character
references and expect them to be handled accordingly. Leaving ampersands
alone when quoting things in a text node is therefore the only sane
choice: otherwise I would have no way to insert an entity reference or a
character reference into the newly created text child.
For consistency, if at all, libxml could require you to use

  const char* str = "&lt;&gt;\"'&amp;";

instead of what you have used. I feel this would be a lot safer than
automagically supplying an &amp; for every ampersand that triggers the
unterminated entity reference error.
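
If the input really is raw text and every ampersand in it is meant
literally, replacing the ampersands by hand could look roughly like the
sketch below. escape_ampersands() is a hypothetical helper of my own, not
a libxml2 function, and it uses only the C standard library so that it
compiles on its own. Treat it as an illustration under those assumptions,
not as the recommended way.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libxml2): returns a newly allocated
 * copy of `in` with every '&' replaced by "&amp;". All other characters
 * are copied unchanged. The caller frees the result; NULL means the
 * allocation failed.
 */
char *escape_ampersands(const char *in)
{
    assert(in != NULL);

    /* First pass: count ampersands so the output buffer can be sized. */
    size_t len = 0, amps = 0;
    for (const char *p = in; *p != '\0'; ++p) {
        ++len;
        if (*p == '&')
            ++amps;
    }

    /* "&amp;" is 5 bytes, so each '&' adds 4 bytes; +1 for the NUL. */
    char *out = malloc(len + 4 * amps + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the input, expanding each '&' on the way. */
    char *q = out;
    for (const char *p = in; *p != '\0'; ++p) {
        if (*p == '&') {
            memcpy(q, "&amp;", 5);
            q += 5;
        } else {
            *q++ = *p;
        }
    }
    *q = '\0';
    return out;
}
```

The result would then be handed to xmlNodeSetContent(node, BAD_CAST escaped)
and freed afterwards. Note that the helper escapes blindly: an input that
already contains "&lt;" comes out as "&amp;lt;", which is exactly why doing
this automatically inside the library would take away the ability to insert
references on purpose.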
Ciao,
Igor