Re: [xml] Improper Type Returned For Repeated Attribute Declaration









On Thu, Jan 10, 2008 at 01:49:42AM +0530, ashwin sinha wrote:


>    On Jan 8, 2008 8:26 PM, Daniel Veillard <veillard redhat com> wrote:


>    On Tue, Jan 08, 2008 at 08:02:29PM +0530, Ashwin wrote:
>    >
>    >    Hi,
>    >
>    >    In the attached file the attribute a1 is declared twice in the
>    >    internal subset. The XML 1.0 specification says that this is OK,
>    >    but it is binding that the type of the attribute is taken from
>    >    the first declaration. In this case CDATA occurs before
>    >    NMTOKENS, so the type of a1 should be CDATA. However, this is
>    >    not the case: while parsing, the type associated with a1 is
>    >    NMTOKENS, and the attribute value is normalized according to
>    >    the rules in the XML specification. Ideally there should be no
>    >    such normalization in this case and a1 should be treated as
>    >    CDATA.


>    > Hum, yes there is a problem in this edge case:
>    >   - You will notice that when saving the document xmllint will not
>    >     output the superfluous NMTOKENS definition in the internal
>    >     subset, so it seems libxml2 does not store it, which looks good
>    >   - but as you noted the attribute value is normalized and that's
>    >     not okay. If one comments out the NMTOKENS definition in the
>    >     internal subset then the value is not normalized, which is the
>    >     expected behaviour.
>    > So for some reason the NMTOKENS information is not completely
>    > dropped at parse time; it is retained and influences the parsing
>    > of the attribute. I guess using a debugger and following the
>    > attribute processing in the start tag SAX handler should allow
>    > finding this out quickly. I'm on the road at the moment, not sure
>    > I will be able to spend time on it quickly, but if you can look
>    > into this it would be nice to get it fixed. Thanks for the report;
>    > ideally the document should be stored in the regression suite
>    > once the bug is fixed.



>    Hi,
>
>    The above problem occurs because, on encountering the second a1
>    declaration, the parser still adds it to the hash table using
>    xmlAddSpecialAttr even though it is a duplicate. When the entry is
>    later retrieved by its hash key, the lookup returns the most recent
>    a1 entry, which happens to be the second (NMTOKENS) one.
>
>    I am hoping to fix the problem by adding a global flag which I will
>    set when a duplicate attribute declaration is encountered (the check
>    for duplicates is done in valid.c), and by testing this flag in the
>    if check before calling xmlAddSpecialAttr. Is there any potential
>    problem with this fix?


> Yes, global state is really a problem in a library and should be
> avoided. I don't think this scheme can work well; instead the
> duplication error should be caught and propagated back to the place
> where xmlAddSpecialAttr is called.
> I will look at this, probably tomorrow morning.
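
To make the suggestion quoted above concrete (report the duplicate to the caller instead of using a global flag), here is a rough, self-contained sketch. It is not libxml2 code: parse_ctx, record_decl, handle_attlist_decl and needs_normalization are hypothetical names invented for illustration, and the fixed-size arrays only stand in for the real tables. All state lives in a per-parse context, and the caller decides from the return value whether to mark the attribute as special.

```c
#include <stddef.h>
#include <string.h>

#define CAP 8  /* arbitrary capacity for this toy example */

typedef enum { KIND_CDATA, KIND_NMTOKENS } attr_kind;
typedef enum { DECL_ADDED, DECL_DUPLICATE, DECL_FULL } decl_status;

/* Per-parse state (hypothetical); nothing here is global. */
typedef struct {
    const char *declared[CAP];  /* attribute names already declared */
    size_t      ndeclared;
    const char *special[CAP];   /* names whose values get tokenized normalization */
    size_t      nspecial;
} parse_ctx;

/* Stand-in for the duplicate check: records a declaration and
 * reports a duplicate through the return value. */
decl_status record_decl(parse_ctx *ctx, const char *name)
{
    for (size_t i = 0; i < ctx->ndeclared; i++) {
        if (strcmp(ctx->declared[i], name) == 0)
            return DECL_DUPLICATE;
    }
    if (ctx->ndeclared == CAP)
        return DECL_FULL;
    ctx->declared[ctx->ndeclared++] = name;
    return DECL_ADDED;
}

/* Stand-in for the parser-side caller: the attribute is marked special
 * only when its declaration was accepted and its type is not CDATA. */
decl_status handle_attlist_decl(parse_ctx *ctx, const char *name, attr_kind kind)
{
    decl_status st = record_decl(ctx, name);
    if (st == DECL_ADDED && kind != KIND_CDATA && ctx->nspecial < CAP)
        ctx->special[ctx->nspecial++] = name;
    return st;
}

/* Returns 1 if the attribute was marked special, 0 otherwise. */
int needs_normalization(const parse_ctx *ctx, const char *name)
{
    for (size_t i = 0; i < ctx->nspecial; i++) {
        if (strcmp(ctx->special[i], name) == 0)
            return 1;
    }
    return 0;
}
```

With a zero-initialized parse_ctx, declaring a1 as CDATA and then again as NMTOKENS leaves a1 unmarked, because the second call returns DECL_DUPLICATE before the special list is touched.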

How about adding this check in the function xmlHashAddEntry3:

    if (insert != NULL)
        entry->payload = userdata;

This check would ensure that in case of duplicate entries the type remains the original one instead of the latest one.
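
The behaviour this check is aiming for, where the first declaration wins and later duplicates never overwrite it, can be sketched with a toy table. This is only an illustration under my own assumptions: decl_table, decl_lookup and decl_add_first_wins are hypothetical names, and the linear array is not how libxml2's hash module is implemented.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DECLS 16  /* arbitrary capacity for this toy example */

/* Toy attribute types standing in for the real attribute type enum. */
typedef enum { ATTR_CDATA = 1, ATTR_NMTOKENS } attr_type;

typedef struct {
    const char *elem;  /* element name */
    const char *attr;  /* attribute name */
    attr_type   type;  /* payload: the declared type */
} attr_decl;

typedef struct {
    attr_decl decls[MAX_DECLS];
    size_t    count;
} decl_table;

/* Return the stored declaration for elem/attr, or NULL if there is none. */
const attr_decl *decl_lookup(const decl_table *t, const char *elem,
                             const char *attr)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->decls[i].elem, elem) == 0 &&
            strcmp(t->decls[i].attr, attr) == 0)
            return &t->decls[i];
    }
    return NULL;
}

/* Add a declaration. Returns 0 when stored, -1 when an entry with the
 * same key already exists (its payload is left untouched) or when the
 * table is full. */
int decl_add_first_wins(decl_table *t, const char *elem, const char *attr,
                        attr_type type)
{
    if (decl_lookup(t, elem, attr) != NULL)
        return -1;               /* duplicate: keep the original type */
    if (t->count == MAX_DECLS)
        return -1;
    t->decls[t->count].elem = elem;
    t->decls[t->count].attr = attr;
    t->decls[t->count].type = type;
    t->count++;
    return 0;
}
```

Adding ("doc", "a1", ATTR_CDATA) and then ("doc", "a1", ATTR_NMTOKENS) to an empty table makes the second call fail, and a later lookup still reports ATTR_CDATA.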




