
Re: [xml] Changes which might be required "Not converting names to lower-case in HTML parsing"



On Wed, Sep 28, 2005 at 07:25:52PM +0530, GPN wrote:
> Hello,
> I am working off release 2.6.22, and I am proposing the
> following changes to the code.
> - A new function xmlStrcaseEqual() might be required in
>   xmlstring.c, which can check if the current character
>   being parsed is between 'A' and 'Z', and if so compares
>   using casemap array as is done in xmlStrcasecmp().

  Not needed; just use !xmlStrcasecmp(). The API is too large already.
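
  A minimal sketch of that point, assuming the caller only wants a yes/no
answer: equality is the negated result of a case-insensitive compare.
sketch_strcasecmp() is a local stand-in for xmlStrcasecmp() so the example
builds without libxml2, and name_equal_nocase() is a hypothetical helper
name, not a proposed API.

```c
#include <ctype.h>
#include <stddef.h>

/* Stand-in for xmlStrcasecmp() (assumption: ASCII-only case folding).
 * Returns 0 when the two strings match ignoring case. */
static int sketch_strcasecmp(const unsigned char *a, const unsigned char *b) {
    if (a == b) return 0;
    if (a == NULL) return -1;
    if (b == NULL) return 1;
    for (;; a++, b++) {
        int d = tolower(*a) - tolower(*b);
        /* Stop on the first difference or at the shared terminator. */
        if (d != 0 || *a == 0) return d;
    }
}

/* Hypothetical local helper: "equal" is just the negated comparison,
 * so no separate library entry point is needed. */
static int name_equal_nocase(const unsigned char *a, const unsigned char *b) {
    return !sketch_strcasecmp(a, b);
}
```

  With libxml2 itself the call site would simply read
!xmlStrcasecmp(name, BAD_CAST "table").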

> - In htmlParseName(), the condition which checks if the
>   current character is upper-case, and which transforms
>   it needs to be removed. Name can be stored as it is.

  No. That would have to be made conditional on a special parsing
option. There are also a number of tables indexed by the lowercase
name, and those will need to be preserved.
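
  Roughly, the shape would be something like the sketch below. Everything
in it is illustrative: SKETCH_KEEP_NAME_CASE is a hypothetical option bit
(not an existing parser option), and the small array stands in for the
real tables keyed by lowercase name. The name is stored as written only
when the option is set, while table lookups keep using a lowercase key.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option bit; not an existing libxml2 parser option. */
#define SKETCH_KEEP_NAME_CASE 1

/* Tiny stand-in for a parser table indexed by lowercase element name. */
static const char *const sketch_known_tags[] = { "body", "p", "table", "tr" };

/* Copy a parsed name into out, lowercasing it unless the option is set.
 * Names longer than the buffer are truncated. */
static void sketch_store_name(const char *in, char *out, size_t outlen,
                              int options) {
    size_t i;
    if (outlen == 0) return;
    for (i = 0; in[i] != '\0' && i + 1 < outlen; i++) {
        out[i] = (options & SKETCH_KEEP_NAME_CASE)
                     ? in[i]
                     : (char)tolower((unsigned char)in[i]);
    }
    out[i] = '\0';
}

/* Table lookup always builds a lowercase key, whatever case was stored. */
static int sketch_is_known_tag(const char *name) {
    char key[32];
    size_t i;
    sketch_store_name(name, key, sizeof(key), 0);
    for (i = 0; i < sizeof(sketch_known_tags) / sizeof(sketch_known_tags[0]); i++) {
        if (strcmp(key, sketch_known_tags[i]) == 0) return 1;
    }
    return 0;
}
```

  The default path (options == 0) behaves as today; only callers that ask
for it get the original case back.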

> - In other parts of the code (only in HTMLparser.c), the
>   comparisons using xmlStrEqual() for names, need to be
>   replaced by xmlStrcaseEqual().

  I.e. that makes a lot of costly calls instead of one costly call and
a number of cheap ones. I disagree with this approach.
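
  To make the cost argument concrete, here is a toy sketch that counts
per-character case folds under both approaches. The function names and
the counter are made up for illustration, the candidate names are assumed
to be lowercase already, and fold counts are only a rough proxy for the
real cost.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static unsigned long sketch_folds = 0; /* counts per-character case folds */

static int sketch_fold(int c) { sketch_folds++; return tolower(c); }

/* Approach A: fold the name once, then use plain strcmp() for every test.
 * Returns the index of the matching candidate, or -1. */
static int sketch_match_folded_once(const char *name,
                                    const char *const *cands, int n) {
    char low[32];
    size_t i;
    int k;
    for (i = 0; name[i] != '\0' && i + 1 < sizeof(low); i++)
        low[i] = (char)sketch_fold((unsigned char)name[i]);
    low[i] = '\0';
    for (k = 0; k < n; k++)
        if (strcmp(low, cands[k]) == 0) return k;
    return -1;
}

/* Approach B: keep the name as is and fold inside every comparison. */
static int sketch_match_fold_each_time(const char *name,
                                       const char *const *cands, int n) {
    int k;
    for (k = 0; k < n; k++) {
        const char *a = name, *b = cands[k];
        while (*a != '\0' && sketch_fold((unsigned char)*a) == *b) { a++; b++; }
        if (*a == '\0' && *b == '\0') return k;
    }
    return -1;
}
```

  Approach A pays for the folding once per name; approach B pays again on
every comparison, and the parser does many comparisons per name.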

> I am assuming that pretty much all HTML related functionality
> is contained within HTMLparser.c, and the core xml functions
> need not change to accommodate this enhancement.

  This enhancement will have to be conditional on a parsing option.
Using the HTML parser to parse XML is basically wrong, and I don't
want that to be the default behaviour of the HTML parser.

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


