Subject: Re: [xml] Improper Type Returned For Repeated Attribute Declaration
> On Thu, Jan 10, 2008 at 01:49:42AM +0530, ashwin sinha wrote:
> > On Jan 8, 2008 8:26 PM, Daniel Veillard <veillard redhat com> wrote:
> > > On Tue, Jan 08, 2008 at 08:02:29PM +0530, Ashwin wrote:
> > > > Hi,
> > > >
> > > > In the attached file the attribute a1 is declared twice in the
> > > > internal subset. The draft (XML 1.0) says that this is ok; however,
> > > > it is binding that the type of the attribute value should be taken
> > > > as the first one which occurs in the definition. In this case CDATA
> > > > occurs before NMTOKENS, therefore the type of a1 should be CDATA.
> > > > However, this is not the case: while parsing, the type associated
> > > > with a1 is NMTOKENS and normalization is performed on the attribute
> > > > value according to the rules specified in the XML draft. Ideally,
> > > > in this case there should be no normalization and a1 should be
> > > > treated as CDATA.
> > >
> > > Hum, yes there is a problem in this edge case:
> > >  - You will notice that when saving the document xmllint will not
> > >    output the superfluous NMTOKENS definition in the internal
> > >    subset, so it seems libxml2 does not store it, which looks good
> > >  - but as you noted the attribute value is normalized and that's
> > >    not okay. If one comments out the NMTOKENS definition in the
> > >    internal subset then the value is not normalized, which is the
> > >    expected behaviour.
> > > So for some reason the NMTOKENS information is not completely
> > > dropped at parse time; it is retained and influences the parsing
> > > of the attribute.
> > > I guess using a debugger and following the attribute processing in
> > > the start tag SAX handler should allow to find this out quickly.
> > > I'm on the road at the moment, not sure I will be able to spend
> > > time on it quickly, but if you can look into this it would be nice
> > > to get this fixed.
> > > Thanks for the report; ideally the document should be stored in the
> > > regression suite once the bug is fixed.
> >
> > Hi,
> > The above problem occurs because, on encountering the second a1
> > attribute, even though it's a duplicate, the parser still adds it to
> > the hash list using xmlAddSpecialAttr. Then, while retrieving using
> > the hash key, it fetches the most recent entry of a1 from the hash
> > list (which happens to be the second, NMTOKENS, a1 entry).
> >
> > I am hoping to fix the problem by adding a global flag which I will
> > set in case a duplicate attr value is encountered (the check for a
> > duplicate attr value is being done in valid.c), and adding this flag
> > to the if check before calling the xmlAddSpecialAttr function. Is
> > there any potential problem with this fix?
>
> yes global state is really a problem in a library, this should really
> be avoided. I don't think this scheme can work well; instead the error
> of duplication should be caught and propagated back to the place where
> xmlAddSpecialAttr is called.
> I will look at this probably tomorrow morning,

How about adding this check in function xmlHashAddEntry3:

    if (insert != NULL)
        entry->payload = userdata;

This check will ensure that in case of duplicate entries the type remains the original one instead of the latest one.
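
To make the intended behaviour concrete, here is a minimal standalone
sketch of a "first declaration wins" table. It is not libxml2 code and
does not use the libxml2 hash API: the names attr_table, attr_table_add
and attr_table_lookup, the attr_type enum and the element name "doc" are
all hypothetical, chosen only for illustration. It also reports the
duplicate back to the caller through a return value rather than a
global flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical attribute types for this sketch (not libxml2's enum). */
typedef enum { ATTR_NONE = 0, ATTR_CDATA, ATTR_NMTOKENS } attr_type;

/* One recorded declaration, keyed by element name and attribute name. */
typedef struct {
    const char *elem;
    const char *attr;
    attr_type type;
} attr_decl;

#define MAX_DECLS 16

/* Small fixed-size table standing in for the parser's hash table. */
typedef struct {
    attr_decl decls[MAX_DECLS];
    size_t count;
} attr_table;

/* Return the declared type for elem/attr, or ATTR_NONE if undeclared. */
attr_type attr_table_lookup(const attr_table *t, const char *elem,
                            const char *attr)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->decls[i].elem, elem) == 0 &&
            strcmp(t->decls[i].attr, attr) == 0)
            return t->decls[i].type;
    }
    return ATTR_NONE;
}

/*
 * Record a declaration. Returns 0 if stored, 1 if ignored because the
 * attribute was already declared (the first declaration is kept), and
 * -1 if the table is full. The caller sees the duplicate through the
 * return value, so no global state is needed.
 */
int attr_table_add(attr_table *t, const char *elem, const char *attr,
                   attr_type type)
{
    if (attr_table_lookup(t, elem, attr) != ATTR_NONE)
        return 1;
    if (t->count == MAX_DECLS)
        return -1;
    t->decls[t->count++] = (attr_decl){ elem, attr, type };
    return 0;
}

/* Replays the reported scenario: a1 declared as CDATA, then NMTOKENS. */
int demo_first_declaration_wins(void)
{
    attr_table t = { .count = 0 };
    attr_table_add(&t, "doc", "a1", ATTR_CDATA);
    attr_table_add(&t, "doc", "a1", ATTR_NMTOKENS); /* ignored duplicate */
    assert(attr_table_lookup(&t, "doc", "a1") == ATTR_CDATA);
    return 0;
}
```

Calling demo_first_declaration_wins() from a main() exercises the
scenario; the second declaration is rejected and the lookup still
yields the CDATA type.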

Regards,
Ashwin