Re: [xml] Normalization Query

From: Ashwin <ashwins huawei com>
To: veillard redhat com
Cc: xml gnome org, ranjit huawei com
Subject: Re: [xml] Normalization Query
Date: Wed, 19 Mar 2008 10:15:15 +0530

> For the following attribute what will be the normalized value, the

> attribute is of type NMTOKENS

> <doc a=" x  y "></doc>

> Will it be

> A=x y

> Or

> A= x y

> The answer is there

> http://www.w3.org/TR/REC-xml/#AVNormalize

I have gone through this, but the confusion still persists. To illustrate, I will present some examples.

Case 1.

<!DOCTYPE doc [

<!ATTLIST doc a1 NMTOKENS "1 2">

<!ELEMENT doc (#PCDATA)>

]>

In the above, according to my understanding, a1 should be normalized to

A1="1 2"

libxml returns

A1="1 2", i.e. with an extra space

Case 2.

<!DOCTYPE doc [

<!ELEMENT doc (#PCDATA)>

<!ATTLIST doc a NMTOKENS #IMPLIED>

]>

Here the spec gives a clear example where

A = "A

B
"

And if a is of type NMTOKENS, the normalized value is

A = #xD #xD A #xA #xA B #xD #xA

This is similar to case 2 in all respects, except that the characters referenced here are 0xd and 0xa, which need not be normalized; only 0x20 needs to be normalized. So I guess

A="x y"
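To make my reading of the last step concrete, here is a minimal Python sketch (my own illustration, not libxml code). It assumes the first pass of the algorithm has already run, so the input is the value with references expanded, and it only applies the extra rule for non-CDATA types. The function name collapse_spaces is hypothetical.

```python
import re

def collapse_spaces(value: str) -> str:
    """Extra step for non-CDATA attribute types (XML 1.0, section 3.3.3).

    Only the space character #x20 is touched: leading and trailing
    spaces are dropped and runs of spaces become a single space.
    #xD and #xA that arrived through character references are kept.
    """
    return re.sub(" +", " ", value.strip(" "))

# Case 2: the literal value " x  y " of an NMTOKENS attribute.
case2 = collapse_spaces(" x  y ")

# The spec example built from character references only.
spec_example = collapse_spaces("\r\rA\n\nB\r\n")

print(repr(case2))
print(repr(spec_example))
```

Under this reading the Case 2 value comes out as "x y", while the value made of #xD and #xA characters is left unchanged.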

Case 3:

<?xml version='1.0' standalone='yes'?>

<!DOCTYPE attributes [

<!ELEMENT attributes EMPTY>

<!ATTLIST attributes

nmtoken NMTOKEN #IMPLIED

nmtokens NMTOKENS #IMPLIED>

<!ENTITY ent " entity&recursive; ">

<!ENTITY recursive "reference">

]>

<attributes

nmtoken = " &ent; &ent; &ent; "

nmtokens = " Test
 this  normalization "

/>

Here, according to the spec, the normalized value of nmtoken should be obtained by first processing the unnormalized value: for each entity reference, the algorithm in section 3.3.3 is applied recursively to the replacement text. Once that is done, the value is normalized again, since the type is not CDATA.

So,

Nmtoken="entityreference entityreference entityreference"

Nmtokens="Test0xd0xa this normalization"
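The way I arrive at the nmtoken value can be sketched roughly as follows. This is only my own Python illustration of the algorithm in section 3.3.3, not what libxml or any other parser actually runs; the ENTITIES table and the function names expand and normalize are hypothetical, and the table simply mirrors the two declarations in Case 3.

```python
import re

# Hypothetical entity table mirroring the declarations in Case 3.
ENTITIES = {"ent": " entity&recursive; ", "recursive": "reference"}

# One token per match: hex char ref, decimal char ref, entity ref, or any char.
TOKEN = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([^;&\s]+);|([\s\S])")

def expand(text: str) -> str:
    """First pass: resolve references and map literal white space to #x20."""
    out = []
    for m in TOKEN.finditer(text):
        hex_ref, dec_ref, name, ch = m.groups()
        if hex_ref is not None:
            out.append(chr(int(hex_ref, 16)))   # referenced character, kept as is
        elif dec_ref is not None:
            out.append(chr(int(dec_ref)))       # referenced character, kept as is
        elif name is not None:
            out.append(expand(ENTITIES[name]))  # recurse into the replacement text
        elif ch in "\x20\x0d\x0a\x09":
            out.append(" ")                     # literal white space becomes #x20
        else:
            out.append(ch)
    return "".join(out)

def normalize(value: str, cdata: bool = False) -> str:
    """Full normalization; non-CDATA types get the extra space handling."""
    result = expand(value)
    if not cdata:
        result = re.sub(" +", " ", result.strip(" "))
    return result

nmtoken = normalize(" &ent; &ent; &ent; ")
print(repr(nmtoken))
```

With the type treated as non-CDATA, this yields the nmtoken value without leading, trailing or repeated spaces; calling normalize with cdata=True keeps all of them.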

Libxml gives

Nmtoken=" entityreference entityreference entityreference "

Nmtokens="Test0xd0xa this normalization" // extra space between "this" and "normalization"

The confusion is exacerbated by the fact that Java-based parsers perform normalization that returns the values I have mentioned, which are contrary to what libxml returns.

I do not know whether I am interpreting the spec wrongly, so any clarifications regarding the same would be extremely welcome.

Thanks!!!

Regards

Ashwin

Follow-Ups:
- Re: [xml] Normalization Query
  - From: Daniel Veillard

References:
- Re: [xml] Normalization Query
  - From: Daniel Veillard
