The XML spec (section 3.3.3, Attribute-Value Normalization) gives the following entity declarations as an example:

<!ENTITY d "&#xD;">
<!ENTITY a "&#xA;">
<!ENTITY da "&#xD;&#xA;">

If attribute a is declared with type NMTOKENS, the attribute specification a="&d;&d;A&a;&#x20;&a;B&da;" should normalize to a = A #x20 B.
The above example is taken directly from the spec. However, libxml normalizes the attribute to the following: a = #x0D #x0D A #x0A #x20 #x0A B.
This seems to be a bug.
Also, suppose I have a document with an attribute value consisting of an entity reference defined in a DTD. In that case, normalization does not happen at all for the replacement text of the entity reference.
For instance, I have an XML document with the following element, where attr1 is of a type other than CDATA:

<element attr1="&normalize;  &normalize;  &normalize;">   (two spaces between the entity references)

where

<!ENTITY normalize " test&norm; ">   (single space before and after the entity text)
<!ENTITY norm "hi">

So the normalized value for the above should be attr1="testhi testhi testhi", with only a single space between the replacement texts.

libxml, however, returns attr1=" testhi   testhi   testhi " (a single space after the opening quote and before the closing quote, and three spaces between the entity replacement texts). That is, the value is not getting normalized properly, so this too seems to be a bug. The only doubt I have is whether entity replacement text is exempt from normalization. The spec, however, does not say so anywhere.
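
To make the expected values concrete, here is a minimal Python sketch of the normalization steps as I read them in section 3.3.3 of the spec. It is not libxml's code and not a real parser: the names normalize_attribute, _expand, SPEC_ENTITIES and POST_ENTITIES are hypothetical, the entity tables hold the replacement text (character references already expanded at declaration time), and undefined entities or malformed references are not handled.

```python
import re

# Hypothetical entity tables: name -> replacement text.
SPEC_ENTITIES = {"d": "\r", "a": "\n", "da": "\r\n"}
POST_ENTITIES = {"normalize": " test&norm; ", "norm": "hi"}

# Hex char ref, decimal char ref, entity ref, or any single character.
_TOKEN = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([^;&\s]+);|(.)", re.DOTALL)


def _expand(text, entities, out):
    """Apply the per-token step of the algorithm, recursing into entities."""
    for match in _TOKEN.finditer(text):
        hex_ref, dec_ref, name, char = match.groups()
        if hex_ref is not None:
            out.append(chr(int(hex_ref, 16)))    # char ref: append as-is
        elif dec_ref is not None:
            out.append(chr(int(dec_ref)))        # char ref: append as-is
        elif name is not None:
            _expand(entities[name], entities, out)  # entity ref: recurse
        elif char in " \t\r\n":
            out.append(" ")                      # literal white space -> #x20
        else:
            out.append(char)


def normalize_attribute(raw, entities, cdata=True):
    """Return the normalized value of a raw attribute specification."""
    out = []
    _expand(raw, entities, out)
    value = "".join(out)
    if not cdata:
        # Non-CDATA types: collapse runs of #x20 and trim both ends.
        value = re.sub(" +", " ", value).strip(" ")
    return value


spec_raw = "&d;&d;A&a;&#x20;&a;B&da;"
print(repr(normalize_attribute(spec_raw, SPEC_ENTITIES, cdata=False)))  # 'A B'
print(repr(normalize_attribute(spec_raw, SPEC_ENTITIES, cdata=True)))   # '  A   B  '

post_raw = "&normalize;  &normalize;  &normalize;"
print(repr(normalize_attribute(post_raw, POST_ENTITIES, cdata=False)))
# 'testhi testhi testhi'
```

Under this reading, white space that comes from entity replacement text is turned into #x20 and then collapsed for non-CDATA types, just like white space written directly in the attribute value. Only characters written as character references in the attribute value itself are kept as they are.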
Regards,
Ashwin Sinha
Attachment:
AttrValNorm.txt
Description: Text document