[xml] Entity Reference replacement problem in Canonical module



Hello Sir,
 
I think I did not communicate the problem properly. The problem is this:
 
The application developer may or may not set these variables before parsing:
xmlLoadExtDtdDefaultValue = XML_DETECT_IDS | XML_COMPLETE_ATTRS;
xmlSubstituteEntitiesDefault(1);
Only if they are set will our parser replace entity references and add the default attributes declared in the DTD. My doubt is: if the developer does not set them before parsing, our Canonical module will not replace the entity references and DTD attributes either.
What might be the solution?
See the libxml code:
        case XML_ENTITY_REF_NODE:
            xmlC14NErrInvalidNode("XML_ENTITY_REF_NODE", "processing node");
            return (-1);
 
It has no code for replacing entity references, but the specification says that the Canonical module will replace entity references.
 
/**
 * xmlC14NProcessNode:
 * @ctx:   the pointer to C14N context object
 * @cur:  the node to process
 *    
 * Processes the given node
 *
 * Returns non-negative value on success or negative value on fail
 */
static int
xmlC14NProcessNode(xmlC14NCtxPtr ctx, xmlNodePtr cur)
{
    int ret = 0;
    int visible;
    if ((ctx == NULL) || (cur == NULL)) {
        xmlC14NErrParam("processing node");
        return (-1);
    }
    visible = xmlC14NIsVisible(ctx, cur, cur->parent);
    switch (cur->type) {
        case XML_ELEMENT_NODE:
            ret = xmlC14NProcessElementNode(ctx, cur, visible);
            break;
        case XML_CDATA_SECTION_NODE:
        case XML_TEXT_NODE:
            /*
             * Text Nodes
             * the string value, except all ampersands are replaced
             * by &amp;, all open angle brackets (<) are replaced by &lt;, all closing
             * angle brackets (>) are replaced by &gt;, and all #xD characters are
             * replaced by &#xD;.
             */
            /* cdata sections are processed as text nodes */
            /* todo: verify that cdata sections are included in XPath nodes set */
            if ((visible) && (cur->content != NULL)) {
                xmlChar *buffer;
                buffer = xmlC11NNormalizeText(cur->content);
                if (buffer != NULL) {
                    xmlOutputBufferWriteString(ctx->buf,
                                               (const char *) buffer);
                    xmlFree(buffer);
                } else {
                    xmlC14NErrInternal("normalizing text node");
                    return (-1);
                }
            }
            break;
        case XML_PI_NODE:
            /*
             * Processing Instruction (PI) Nodes-
             * The opening PI symbol (<?), the PI target name of the node,
             * a leading space and the string value if it is not empty, and
             * the closing PI symbol (?>). If the string value is empty,
             * then the leading space is not added. Also, a trailing #xA is
             * rendered after the closing PI symbol for PI children of the
             * root node with a lesser document order than the document
             * element, and a leading #xA is rendered before the opening PI
             * symbol of PI children of the root node with a greater document
             * order than the document element.
             */
            if (visible) {
                if (ctx->pos == XMLC14N_AFTER_DOCUMENT_ELEMENT) {
                    xmlOutputBufferWriteString(ctx->buf, "\x0A<?");
                } else {
                    xmlOutputBufferWriteString(ctx->buf, "<?");
                }
                xmlOutputBufferWriteString(ctx->buf,
                                           (const char *) cur->name);
                if ((cur->content != NULL) && (*(cur->content) != '\0')) {
                    xmlChar *buffer;
                    xmlOutputBufferWriteString(ctx->buf, " ");
                    /* todo: do we need to normalize pi? */
                    buffer = xmlC11NNormalizePI(cur->content);
                    if (buffer != NULL) {
                        xmlOutputBufferWriteString(ctx->buf,
                                                   (const char *) buffer);
                        xmlFree(buffer);
                    } else {
                        xmlC14NErrInternal("normalizing pi node");
                        return (-1);
                    }
                }
                if (ctx->pos == XMLC14N_BEFORE_DOCUMENT_ELEMENT) {
                    xmlOutputBufferWriteString(ctx->buf, "?>\x0A");
                } else {
                    xmlOutputBufferWriteString(ctx->buf, "?>");
                }
            }
            break;
        case XML_COMMENT_NODE:
            /*
             * Comment Nodes
             * Nothing if generating canonical XML without comments. For
             * canonical XML with comments, generate the opening comment
             * symbol (<!--), the string value of the node, and the
             * closing comment symbol (-->). Also, a trailing #xA is rendered
             * after the closing comment symbol for comment children of the
             * root node with a lesser document order than the document
             * element, and a leading #xA is rendered before the opening
             * comment symbol of comment children of the root node with a
             * greater document order than the document element. (Comment
             * children of the root node represent comments outside of the
             * top-level document element and outside of the document type
             * declaration).
             */
            if (visible && ctx->with_comments) {
                if (ctx->pos == XMLC14N_AFTER_DOCUMENT_ELEMENT) {
                    xmlOutputBufferWriteString(ctx->buf, "\x0A<!--");
                } else {
                    xmlOutputBufferWriteString(ctx->buf, "<!--");
                }
                if (cur->content != NULL) {
                    xmlChar *buffer;
                    /* todo: do we need to normalize comment? */
                    buffer = xmlC11NNormalizeComment(cur->content);
                    if (buffer != NULL) {
                        xmlOutputBufferWriteString(ctx->buf,
                                                   (const char *) buffer);
                        xmlFree(buffer);
                    } else {
                        xmlC14NErrInternal("normalizing comment node");
                        return (-1);
                    }
                }
                if (ctx->pos == XMLC14N_BEFORE_DOCUMENT_ELEMENT) {
                    xmlOutputBufferWriteString(ctx->buf, "-->\x0A");
                } else {
                    xmlOutputBufferWriteString(ctx->buf, "-->");
                }
            }
            break;
        case XML_DOCUMENT_NODE:
        case XML_DOCUMENT_FRAG_NODE:    /* should be processed as document? */
#ifdef LIBXML_DOCB_ENABLED
        case XML_DOCB_DOCUMENT_NODE:   /* should be processed as document? */
#endif
#ifdef LIBXML_HTML_ENABLED
        case XML_HTML_DOCUMENT_NODE:   /* should be processed as document? */
#endif
            if (cur->children != NULL) {
                ctx->pos = XMLC14N_BEFORE_DOCUMENT_ELEMENT;
                ctx->parent_is_doc = 1;
                ret = xmlC14NProcessNodeList(ctx, cur->children);
            }
            break;
        case XML_ATTRIBUTE_NODE:
            xmlC14NErrInvalidNode("XML_ATTRIBUTE_NODE", "processing node");
            return (-1);
        case XML_NAMESPACE_DECL:
            xmlC14NErrInvalidNode("XML_NAMESPACE_DECL", "processing node");
            return (-1);
        case XML_ENTITY_REF_NODE:
            xmlC14NErrInvalidNode("XML_ENTITY_REF_NODE", "processing node");
            return (-1);
        case XML_ENTITY_NODE:
            xmlC14NErrInvalidNode("XML_ENTITY_NODE", "processing node");
            return (-1);
        case XML_DOCUMENT_TYPE_NODE:
        case XML_NOTATION_NODE:
        case XML_DTD_NODE:
        case XML_ELEMENT_DECL:
        case XML_ATTRIBUTE_DECL:
        case XML_ENTITY_DECL:
#ifdef LIBXML_XINCLUDE_ENABLED
        case XML_XINCLUDE_START:
        case XML_XINCLUDE_END:
#endif
            /*
             * should be ignored according to "W3C Canonical XML"
             */
            break;
        default:
            xmlC14NErrUnknownNode(cur->type, "processing node");
            return (-1);
    }
    return (ret);
}
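
To make the text-node rule quoted in the comments above concrete, here is a small stand-alone sketch of that escaping. It is an illustration only, not libxml2's own normalization code, and the function name c14n_escape_text is hypothetical:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative stand-in, NOT libxml2's implementation: escape a text
 * node's string value as the Canonical XML rule for text nodes describes.
 * Returns a malloc()ed string the caller must free(), or NULL if the
 * allocation fails.
 */
char *
c14n_escape_text(const char *text)
{
    size_t len;
    char *out, *p;

    assert(text != NULL);

    /* Worst case: every input byte expands to a 5-byte replacement. */
    len = strlen(text);
    out = malloc(len * 5 + 1);
    if (out == NULL)
        return NULL;

    p = out;
    for (; *text != '\0'; text++) {
        const char *rep = NULL;

        switch (*text) {
            case '&':  rep = "&amp;"; break;
            case '<':  rep = "&lt;";  break;
            case '>':  rep = "&gt;";  break;
            case '\r': rep = "&#xD;"; break;   /* #xD */
            default:   break;
        }
        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(p, rep, n);
            p += n;
        } else {
            *p++ = *text;
        }
    }
    *p = '\0';
    return out;
}
```

Note that this step only escapes characters that are already in the text node. It cannot expand an entity reference, which is why an unsubstituted XML_ENTITY_REF_NODE ends up in the error branch instead.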

 
Regards
Gopa

xml-request gnome org wrote:
Message: 1
Date: Thu, 30 Jun 2005 04:45:14 -0700 (PDT)
From: gopabandhu patra
Subject: [xml] Entity Reference replacement problem in Canonical module
To: xml gnome org
Message-ID: <20050630114514 95760 qmail web30508 mail mud yahoo com>
Content-Type: text/plain; charset="iso-8859-1"

Hello Sir,

I have a doubt.
We know that the Canonical module takes a document pointer as input, which is obtained by parsing the XML document. My doubt is what happens if the user has not set the values below before parsing:
xmlLoadExtDtdDefaultValue = XML_DETECT_IDS | XML_COMPLETE_ATTRS;
xmlSubstituteEntitiesDefault(1);

In that case the parser will not replace the entity references, and the Canonical module returns an error.
But as per the specification, character and parsed entity references are replaced with the literal characters (except for special characters).
Can you tell me how I can overcome this problem?

Example:

]>&ent1;, &ent2;!



In the above case, if I set these two things

xmlLoadExtDtdDefaultValue = XML_DETECT_IDS | XML_COMPLETE_ATTRS;
xmlSubstituteEntitiesDefault(1);

then the Canonical module works fine; otherwise it returns an error.

Can you tell me what the solution might be?



Regards

Gopa




End of xml Digest, Vol 14, Issue 33