Re: [xml] Improper Type Returned For Repeated Attribute Declaration

On Jan 8, 2008 8:26 PM, Daniel Veillard <veillard redhat com> wrote:
On Tue, Jan 08, 2008 at 08:02:29PM +0530, Ashwin wrote:
>    Hi,
>       In the attached file the attribute a1 is declared twice in the
>    internal subset. The draft (XML 1.0) says that this is ok, however it
>    is binding that the type of the attribute value should be taken as the
>    first one which occurs in the definition, in this case CDATA occurs
>    before Nmtokens therefore the type of a1 should be CDATA. However,
>    this is not the case, while parsing the type associated with a1 is
>    NMTOKENS and normalization is performed on the attribute value
>    according to the rules specified in the XML draft. However, ideally in
>    this case there should be no normalization and a1 should be treated as
>    CDATA.

> Hum, yes there is a problem in this edge case:
>  - You will notice that when saving the document xmllint will not
>    output the superfluous NMTOKENS definition in the internal subset,
>    so it seems libxml2 does not store it, which looks good
>  - but as you noted the attribute value is normalized and that's not
>    okay. If one comment out the NMTOKENS definition in the internal subset
>    then the value is not normalized, which is the expected behaviour.
>
> So for some reason the NMTOKENS information is not completely dropped
> at parse time and retained and influence the parsing of the attribute.
> I guess using a debugger and following the attribute processing in the
> start tag SAX handler should allow to find this out quickly, I'm on
> the road at the moment, not sure I will be able to spend time on it quickly
> but if you can look for this it would be nice to get this fixed,
>
> thanks for the report, ideally the document should be stored in the regression
> suite once the bug is fixed,

The problem occurs because, on encountering the second a1 declaration, the parser still adds it to the hash list using xmlAddSpecialAttr even though it is a duplicate. When the entry is later retrieved by its hash key, the most recent a1 entry is fetched, which happens to be the second (NMTOKENS) one.

I am hoping to fix the problem by adding a global flag that is set when a duplicate attribute declaration is encountered (the duplicate check is done in valid.c), and by testing this flag in the if check before calling xmlAddSpecialAttr. Is there any potential problem with this fix?
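
To make the intended behaviour concrete, here is a minimal sketch using a toy lookup table. It is not libxml2 code: the struct names, the functions table_find, table_add_last_wins and table_add_first_wins, and the fixed-size array are all assumptions made for illustration, and the real parser uses its own hash. It contrasts the "last declaration wins" behaviour described above with the "first declaration wins" behaviour that XML 1.0 requires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the parser's special-attribute table (hypothetical,
 * not the libxml2 data structure). */
enum attr_type { ATTR_CDATA = 1, ATTR_NMTOKENS = 2 };

#define MAX_DECLS 16

struct decl {
    const char *elem;      /* element name */
    const char *attr;      /* attribute name */
    enum attr_type type;   /* declared type */
};

struct decl_table {
    struct decl items[MAX_DECLS];
    size_t count;
};

/* Return the entry for (elem, attr), or NULL if none is stored. */
static struct decl *table_find(struct decl_table *t,
                               const char *elem, const char *attr)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->items[i].elem, elem) == 0 &&
            strcmp(t->items[i].attr, attr) == 0)
            return &t->items[i];
    }
    return NULL;
}

/* Append a new entry; the caller has already checked for duplicates. */
static void table_append(struct decl_table *t, const char *elem,
                         const char *attr, enum attr_type type)
{
    assert(t->count < MAX_DECLS);
    t->items[t->count].elem = elem;
    t->items[t->count].attr = attr;
    t->items[t->count].type = type;
    t->count++;
}

/* Behaviour as described in the report: a later declaration replaces
 * the type recorded by an earlier one. */
static void table_add_last_wins(struct decl_table *t, const char *elem,
                                const char *attr, enum attr_type type)
{
    struct decl *d = table_find(t, elem, attr);
    if (d != NULL)
        d->type = type;
    else
        table_append(t, elem, attr, type);
}

/* Intended behaviour: the first declaration is binding and later ones
 * are ignored. Returns 1 if stored, 0 if it was an ignored duplicate. */
static int table_add_first_wins(struct decl_table *t, const char *elem,
                                const char *attr, enum attr_type type)
{
    if (table_find(t, elem, attr) != NULL)
        return 0;
    table_append(t, elem, attr, type);
    return 1;
}
```

In this sketch the duplicate check sits at the insertion point, so no global flag is needed. Whether the same can be done around xmlAddSpecialAttr depends on what information is available at that call site, which I have not verified.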

I will be applying this fix, and if it works out I will send the patch.



