> For the following attribute what will be the normalized value, the
> attribute is of type NMTOKENS
>
> <doc a=" x  y "></doc>
>
> Will it be
>
> A=x y
>
> Or
>
> A= x y

> The answer is there
> http://www.w3.org/TR/REC-xml/#AVNormalize
I have gone through this, but the confusion still persists. To illustrate, I will present some examples.

Case 1.

    <!DOCTYPE doc [
    <!ATTLIST doc a1 NMTOKENS "1 2">
    <!ELEMENT doc (#PCDATA)>
    ]>
    <doc></doc>

According to my understanding, a1 in the above should be normalized to a1="1 2". libxml returns a1="1 2", i.e. with an extra space.
Case 2.

    <!DOCTYPE doc [
    <!ELEMENT doc (#PCDATA)>
    <!ATTLIST doc a NMTOKENS #IMPLIED>
    ]>
    <doc a=" x  y "></doc>
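To make my reading concrete, here is a minimal Python sketch of the second normalization step that the spec describes for attributes whose type is not CDATA: leading and trailing #x20 characters are discarded, and runs of #x20 are replaced by a single #x20. The function name `collapse_spaces` is my own and does not come from libxml or any other parser, and the sketch assumes the first step (expanding references and turning literal whitespace into #x20) has already been done.

```python
import re


def collapse_spaces(value: str) -> str:
    """Sketch of the extra step for non-CDATA attributes.

    Only the space character (#x20) is stripped and collapsed; #xD, #xA
    and #x9 that are still present at this point are left untouched.
    """
    return re.sub(" +", " ", value.strip(" "))


# The attribute value from case 2.
print(repr(collapse_spaces(" x  y ")))          # 'x y'

# #xD and #xA that came from character references are not affected.
print(repr(collapse_spaces("\r\rA\n\nB\r\n")))  # '\r\rA\n\nB\r\n'
```

If this reading is right, the value in case 2 would come out as "x y", with no leading, trailing, or doubled spaces.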
Here the spec gives a clear example, where

    a="&#xd;&#xd;A&#xa;&#xa;B&#xd;&#xa;"

and, if a is of type NMTOKENS, the normalized value is

    #xD #xD A #xA #xA B #xD #xA

This is similar to case 2 in all respects, except that the characters referenced here are #xD and #xA, which need not be normalized; only #x20 needs to be normalized. So I guess a="x y".
Case 3:

    <?xml version='1.0' standalone='yes'?>
    <!DOCTYPE attributes [
    <!ELEMENT attributes EMPTY>
    <!ATTLIST attributes
        nmtoken  NMTOKEN  #IMPLIED
        nmtokens NMTOKENS #IMPLIED >
    <!ENTITY ent " entity&recursive; ">
    <!ENTITY recursive "reference">
    ]>
    <attributes
        nmtoken = " &ent; &ent; &ent; "
        nmtokens = " Test&#xd;&#xa; this  normalization " />
Here, according to the spec, the normalized value of nmtoken should be obtained in two steps. First the unnormalized value is processed, and for each entity reference the algorithm in section 3.3.3 is applied recursively to the replacement text. Once that is done, the result is normalized again, since the type is not CDATA.
So:

    nmtoken="entityreference entityreference entityreference"
    nmtokens="Test0xd0xa this normalization"
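The two steps as I understand them can be sketched in Python as follows. This is only my interpretation of section 3.3.3, not how libxml or any Java parser is implemented. The names `normalize_attribute`, `_expand` and `CASE3_ENTITIES` are hypothetical. The sketch assumes that line ends have already been normalized, that the nmtokens value in case 3 contains the character references `&#xd;&#xa;` after "Test", and that every referenced entity is in the dictionary (predefined entities such as `&amp;` are not handled).

```python
import re

# A character reference (&#10; or &#xA;) or a general entity reference (&name;).
_REF = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([^;&\s]+);")


def _expand(text: str, entities: dict) -> str:
    """Step 1: expand references and map literal whitespace to #x20."""
    out = []
    pos = 0
    while pos < len(text):
        m = _REF.match(text, pos)
        if m:
            if m.group(1) is not None:        # hexadecimal character reference
                out.append(chr(int(m.group(1), 16)))
            elif m.group(2) is not None:      # decimal character reference
                out.append(chr(int(m.group(2))))
            else:                             # entity reference: recurse on its text
                out.append(_expand(entities[m.group(3)], entities))
            pos = m.end()
        elif text[pos] in "\x20\x0d\x0a\x09":  # literal whitespace becomes #x20
            out.append(" ")
            pos += 1
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)


def normalize_attribute(raw: str, entities: dict, is_cdata: bool) -> str:
    """Apply step 1, then the extra #x20 trimming if the type is not CDATA."""
    value = _expand(raw, entities)
    if not is_cdata:
        # Step 2: strip and collapse #x20 only.
        value = re.sub(" +", " ", value.strip(" "))
    return value


# Entities declared in case 3 (hypothetical dictionary form).
CASE3_ENTITIES = {"ent": " entity&recursive; ", "recursive": "reference"}

print(repr(normalize_attribute(" &ent; &ent; &ent; ", CASE3_ENTITIES, is_cdata=False)))
print(repr(normalize_attribute(" Test&#xd;&#xa; this  normalization ",
                               CASE3_ENTITIES, is_cdata=False)))
```

Under these assumptions the sketch produces the two values I listed above: the #xD and #xA from the character references survive, while the spaces around and between the tokens are trimmed and collapsed.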
libxml gives:

    nmtoken=" entityreference entityreference entityreference "
    nmtokens="Test0xd0xa this normalization"

with an extra space between "this" and "normalization".
The confusion is exacerbated by the fact that Java-based parsers perform the normalization and return the values I have given above, which are contrary to what libxml returns.
I do not know whether I am interpreting the spec wrongly, so any clarifications regarding the same would be extremely welcome.
Thanks!!!
Regards Ashwin