[xml] Does libxml2 always do character entity substitution for attribute values?



Hello,

I want to read the content of an HTML file exactly as it is, i.e. libxml2 should not substitute entity references in attribute values.

I'm using the xmlGetProp() API to read the content of an attribute, but I'm not getting the original value of the attribute.
I also tried to retrieve the content of the attribute using "xmlAttrPtr->children->content", but it still returns entity-substituted data.


For example, I have the following HTML data to parse:
char * data = "<html><body onload='self.focus(); var =&quot;myvar&quot;;' class=\"123\" damn=\"345\"></body></html>";

Now if I parse the above data, I get the following attribute values:

attribute name: onload
attribute value: self.focus(); var ="myvar";    => Here libxml2 should return the attribute value: self.focus(); var =&quot;myvar&quot;;

attribute name: class
attribute value: 123

attribute name: damn
attribute value: 345


Do you know how to disable entity substitution when reading an attribute value?
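
If the parser only exposes the decoded value, one possible workaround is to re-escape the value after reading it. The following is a minimal sketch under that assumption, not a libxml2 feature: escape_attr_value() is a hypothetical helper written with the C standard library only, and the set of characters it escapes (", &, <, >) is my own choice. It cannot tell whether the source originally wrote &quot; or a numeric reference such as &#34;, so it only approximates the original text.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (NOT part of libxml2): re-escape a decoded attribute
 * value so that '"', '&', '<' and '>' appear as entity references again.
 * Returns a malloc()ed string that the caller must free(), or NULL if the
 * input is NULL or the allocation fails. */
char *escape_attr_value(const char *decoded)
{
    const char *p;
    char *out, *q;
    size_t len = 0;

    if (decoded == NULL)
        return NULL;

    /* First pass: compute the length of the escaped string. */
    for (p = decoded; *p != '\0'; p++) {
        switch (*p) {
        case '"': len += 6; break;   /* &quot; */
        case '&': len += 5; break;   /* &amp;  */
        case '<':                    /* &lt;   */
        case '>': len += 4; break;   /* &gt;   */
        default:  len += 1; break;
        }
    }

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy, replacing the special characters. */
    q = out;
    for (p = decoded; *p != '\0'; p++) {
        const char *rep = NULL;

        switch (*p) {
        case '"': rep = "&quot;"; break;
        case '&': rep = "&amp;";  break;
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        default:  break;
        }
        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(q, rep, n);
            q += n;
        } else {
            *q++ = *p;
        }
    }
    *q = '\0';
    return out;
}
```

The intended use would be to pass the string returned by xmlGetProp() (cast to const char *) to this helper, then release the xmlGetProp() result with xmlFree() and the helper's result with free().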








Thanks and Regards,
Bala



