Re: [xml] why does doc.serialize() not escape & --> &amp; ?



On Mon, Nov 03, 2003 at 12:13:10AM +0200, Hannu Krosing wrote:
This example uses the libxml2 Python bindings:

>>> import libxml2
>>> libxml2.debugMemory(1)
0
>>> doc = libxml2.newDoc("1.0")
>>> html_content = "<html>&uuml;</html>"
>>> html_content = "<html>&gt;&uuml;</html>"
>>> root = doc.newChild(None, "doc", html_content)
>>> print(doc.serialize(None, 1))
<?xml version="1.0"?>
<doc>&lt;html&gt;&gt;&uuml;&lt;/html&gt;</doc>
>>> doc.freeDoc()

Why are < and > escaped but & is not?

  You used the wrong API:

/**
 * xmlNewChild:
 * @parent:  the parent node
 * @ns:  a namespace if any
 * @name:  the name of the child
 * @content:  the XML content of the child if any.
 *
 * Creation of a new child element, added at the end of @parent children list.
 * @ns and @content parameters are optional (NULL). If content is non NULL,
 * a child list containing the TEXTs and ENTITY_REFs node will be created.
 * NOTE: @content is supposed to be a piece of XML CDATA, so it allows entity
 *       references, but XML special chars need to be escaped first by using
 *       xmlEncodeEntitiesReentrant(). Use xmlNewTextChild() if entities
 *       support is not needed.
 *
 * Returns a pointer to the new node object.
 */

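  See the note: newChild() treats @content as XML CDATA and keeps entity
  references such as &gt; and &uuml; as they are. For plain text, use
  xmlNewTextChild() (newTextChild() in the Python bindings) instead: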

>>> import libxml2
>>> doc = libxml2.newDoc("1.0")
>>> html_content = "<html>&uuml;</html>"
>>> html_content = "<html>&gt;&uuml;</html>"
>>> root = doc.newTextChild(None, "doc", html_content)
>>> print(doc.serialize(None, 1))
<?xml version="1.0"?>
<doc>&lt;html&gt;&amp;gt;&amp;uuml;&lt;/html&gt;</doc>
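
  For comparison, here is a small sketch of what "treat the content as plain
  text" means for escaping. It does not use libxml2 at all: it relies on
  xml.sax.saxutils.escape from the Python standard library as a stand-in
  (an assumption made for illustration, not what libxml2 calls internally),
  and the helper name escape_text is hypothetical.

```python
from xml.sax.saxutils import escape


def escape_text(content):
    # Treat content as literal text: every &, < and > becomes an entity
    # reference, so an existing "&gt;" turns into "&amp;gt;".
    return escape(content)


html_content = "<html>&gt;&uuml;</html>"
escaped = escape_text(html_content)

# Wrap the escaped text in the same <doc> element used in the session above.
print("<doc>%s</doc>" % escaped)
```

  The printed line should match the <doc> element in the newTextChild()
  output above, where & is escaped along with < and >.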
 


Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


