[xml] Predefined Entities Reference problem in SAX parsing


When I run a SAX parse on the XML file below:

   <?xml version="1.0"?>

       <object DN="&amp;&lt;;DN;&gt;@&amp;#&lt;;Attribute;&gt;" operation="modify">


In the startElementNs callback for this element, the value of the DN attribute was &#38;<;DN;> @&#38;#<;Attribute;>

Every other predefined entity reference is replaced by its corresponding character, but the amp reference is replaced by a character reference (&#38;) instead.

According to the XML 1.0 specification, "if a general entity reference appears in the value of an attribute its replacement text MUST be processed in place of the reference itself."
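To make the expectation concrete, here is a minimal C sketch of what "processed in place" would give for the DN attribute above. The helper name expand_predefined is hypothetical and is not part of libxml2. It only handles the five predefined entities and is meant as an illustration, not as a conforming attribute-value parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libxml2: expands only the five
 * predefined entity references (&amp; &lt; &gt; &apos; &quot;) found in
 * `in` and writes the result to `out` (capacity `cap`, including the
 * terminating NUL). Any other '&' is copied through unchanged.
 * Returns the length of the output string. */
static size_t expand_predefined(const char *in, char *out, size_t cap)
{
    static const struct { const char *ref; char ch; } table[] = {
        { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
        { "&apos;", '\'' }, { "&quot;", '"' },
    };
    const size_t n = sizeof table / sizeof table[0];
    size_t len = 0;

    assert(in != NULL && out != NULL && cap > 0);
    while (*in != '\0' && len + 1 < cap) {
        size_t i;
        for (i = 0; i < n; i++) {
            size_t rlen = strlen(table[i].ref);
            if (strncmp(in, table[i].ref, rlen) == 0) {
                out[len++] = table[i].ch;   /* replacement text in place */
                in += rlen;
                break;
            }
        }
        if (i == n)                         /* no predefined reference here */
            out[len++] = *in++;
    }
    out[len] = '\0';
    return len;
}
```

Called on the raw attribute text from the sample, this sketch yields &<;DN;>@&#<;Attribute;>, which is the value I expected the callback to report.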

All five predefined entities in XML are parsed general entities. When I checked the code, I found that the following piece of code in xmlParseAttValueComplex() makes the difference:

ent = xmlParseEntityRef (ctxt);
if ((ent != NULL) &&
    (ent->etype == XML_INTERNAL_PREDEFINED_ENTITY)) {
        if (len > buf_size - 10) {
                growBuffer (buf, regmemhdl);
        }
        if ((ctxt->replaceEntities == 0) &&
            (ent->content[0] == '&')) {
                /* amp is written back as the character reference &#38; */
                buf[len++] = '&';
                buf[len++] = '#';
                buf[len++] = '3';
                buf[len++] = '8';
                buf[len++] = ';';
        } else {
                /* lt, gt, apos, quot: the character itself is stored */
                buf[len++] = ent->content[0];
        }
}


For all predefined entities other than amp, the reference is replaced by the corresponding character regardless of ctxt->replaceEntities. Similarly, internal general entity references occurring in character content are replaced by their replacement text regardless of the value of ctxt->replaceEntities. Can anybody please explain why an amp entity reference is handled this way in SAX?
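The asymmetry can be reproduced outside the parser with a small stand-alone sketch of the branch quoted above. The function append_predefined is a hypothetical stand-in, not libxml2 code: its replace_entities parameter plays the role of ctxt->replaceEntities and ch plays the role of ent->content[0]. Buffer growth is left out, so the caller is assumed to provide at least five free bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the quoted branch, not libxml2 code.
 * Appends the expansion of one predefined entity to `buf` at offset
 * `len` and returns the new length. The caller must guarantee that at
 * least 5 bytes are free after `len`. */
static size_t append_predefined(char *buf, size_t len, char ch,
                                int replace_entities)
{
    assert(buf != NULL);
    if (replace_entities == 0 && ch == '&') {
        /* amp with replacement disabled: keep it as "&#38;" */
        memcpy(buf + len, "&#38;", 5);
        len += 5;
    } else {
        /* every other case: store the character itself */
        buf[len++] = ch;
    }
    return len;
}
```

With replace_entities set to 0, passing '&' appends the five characters &#38;, while '<', '>', '\'' and '"' each append a single character. With replace_entities set to 1, '&' also appends a single character. That difference is exactly what I am asking about.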

Thanks and Regards

