[xml] Attr Value Normalization Problem

The XML spec gives the following entity declarations:

<!ENTITY d "&#xD;">

<!ENTITY a "&#xA;">

<!ENTITY da "&#xD;&#xA;">


For an attribute a of type NMTOKENS whose value references these entities, the normalized value should be

a = A #x20 B.


The above example is taken directly from the spec. However, libxml normalizes the attribute to the following:

a = #x0D #x0D A #x0A #x20 #x0A B.


This seems to be a bug.
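
To make the expected result easier to check, here is a minimal Python sketch of the normalization steps as I read them in section 3.3.3 of the XML 1.0 spec. The function names (replacement_text, normalize_attribute) are my own, and the attribute value used, a="&d;&d;A&a;&#x20;&a;B&da;", is my assumption about which row of the spec's example table is meant. Predefined entities and error handling are left out.

```python
import re

# A character reference (&#10; or &#xA;) or a general entity reference (&name;).
_REF = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([A-Za-z_:][\w:.\-]*);")
_WHITESPACE = "\x20\x0d\x0a\x09"


def replacement_text(literal):
    """Expand character references in a literal entity value (spec section 4.5).

    General entity references are left untouched at this stage.
    """
    def expand(m):
        if m.group(1) is not None:
            return chr(int(m.group(1), 16))
        if m.group(2) is not None:
            return chr(int(m.group(2)))
        return m.group(0)
    return _REF.sub(expand, literal)


def normalize_attribute(raw, entities, is_cdata):
    """Normalize a raw attribute value; `entities` maps names to literal values."""
    out = []

    def plain(chunk):
        # Literal white space becomes a single #x20; other characters are copied.
        for ch in chunk:
            out.append(" " if ch in _WHITESPACE else ch)

    def walk(text):
        pos = 0
        for m in _REF.finditer(text):
            plain(text[pos:m.start()])
            hex_ref, dec_ref, name = m.groups()
            if hex_ref is not None:
                out.append(chr(int(hex_ref, 16)))  # char ref: appended unchanged
            elif dec_ref is not None:
                out.append(chr(int(dec_ref)))
            else:
                # Entity ref: recurse into the replacement text.
                walk(replacement_text(entities[name]))
            pos = m.end()
        plain(text[pos:])

    walk(raw)
    value = "".join(out)
    if not is_cdata:
        # Non-CDATA types: drop leading/trailing spaces, collapse runs of spaces.
        value = re.sub(" +", " ", value.strip(" "))
    return value


entities = {"d": "&#xD;", "a": "&#xA;", "da": "&#xD;&#xA;"}
raw = "&d;&d;A&a;&#x20;&a;B&da;"
print(repr(normalize_attribute(raw, entities, is_cdata=False)))
print(repr(normalize_attribute(raw, entities, is_cdata=True)))
```

The point of the sketch is that #xD and #xA reached through an entity reference are ordinary white space in the replacement text and become #x20, whereas the same characters written as direct character references in the attribute value are appended unchanged.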


Also, suppose I have a document with an attribute value consisting of entity references defined in a DTD. Normalization does not happen at all for the replacement text of the entity references.


For instance, I have an XML document containing the following element, where attr1 is of a type other than CDATA (note the two spaces between the entity references):

<element attr1="&normalize;  &normalize;  &normalize;">


The entities are declared as follows, with a single space before and after the entity text of normalize:

<!ENTITY normalize " test&norm; ">

<!ENTITY norm     "hi">

So the normalized value for the above should be

attr1="testhi testhi testhi"

with only a single space between each replacement text.

libxml, however, returns

attr1=" testhi   testhi   testhi "

with a single space after the opening quote and before the closing quote, and three spaces between each replacement text.

That is, the value is not being normalized properly, which also seems to be a bug. My only doubt is whether entity replacement text is exempt from normalization; the spec, however, does not say so anywhere.
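
For reference, the expected value can be reproduced with a short Python sketch of the two relevant steps: recursive inclusion of entity replacement text, followed by the extra trimming and collapsing that the spec prescribes for non-CDATA types. This is only an illustration under my reading of the spec. The helper names are hypothetical, and character references are deliberately not handled because this example contains none.

```python
import re

# Matches a general entity reference such as &norm;
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.\-]*);")


def expand(text, entities):
    """Map literal white space to #x20, then include each entity recursively."""
    text = re.sub(r"[\r\n\t]", " ", text)
    return ENTITY_REF.sub(lambda m: expand(entities[m.group(1)], entities), text)


def normalize_non_cdata(raw, entities):
    """Expand entity references, strip outer spaces, collapse runs of spaces."""
    expanded = expand(raw, entities)
    return re.sub(" +", " ", expanded.strip(" "))


entities = {"normalize": " test&norm; ", "norm": "hi"}
raw = "&normalize;  &normalize;  &normalize;"
print(repr(normalize_non_cdata(raw, entities)))
```

Under these assumptions the spaces that come from the replacement text are treated exactly like spaces typed directly into the attribute value, so they are trimmed and collapsed along with the rest.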



Ashwin Sinha


Attachment: AttrValNorm.txt
Description: Text document
