Re: [xml] Proposal to move mutation of HTML boolean attribute values out of the parser

On Wed, Mar 03, 2010 at 09:22:52AM -0500, Joshua Marantz wrote:
I previously reported what looked
like a bug where a tag like:

  <option selected>

would be transformed, in the parser, to

  <option selected="selected">

This is consistent with the HTML spec, but that spec also says:

   Authors should be aware that many user agents *only* recognize the
minimized form of boolean attributes and not the full form.

By making this transformation in the parser, it is not possible to use
libxml2 to process HTML without potentially breaking behavior in some
user agents.

Currently this transformation is implemented in HTMLparser.c in the static
function htmlParseAttribute, based on htmlIsBooleanAttr(name). Daniel
Veillard explains that some downstream tools expect this transformation to
be done. However, I would like to propose that this transformation be moved
out of the parser and done in a later phase. Daniel suggested "the SAX2.c
module building the tree". This would be OK for my purposes, as I am using
my own SAX bindings and not relying on the tree-building code.
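
To make the idea of a "later phase" concrete, here is a minimal sketch of
expanding a minimized boolean attribute after parsing rather than inside the
parser. It is not libxml2 code: is_boolean_attr and expand_boolean_attr are
hypothetical names, the attribute list is the HTML 4 set of boolean
attributes, and names are assumed to be lowercased already.

```c
#include <stddef.h>
#include <string.h>

/* HTML 4 boolean attributes (illustrative list, not copied from libxml2). */
static const char *const boolean_attrs[] = {
    "checked", "compact", "declare", "defer", "disabled", "ismap",
    "multiple", "nohref", "noresize", "noshade", "nowrap", "readonly",
    "selected",
};

/* Hypothetical stand-in for htmlIsBooleanAttr(): returns 1 if name is a
 * boolean attribute, 0 otherwise. Assumes name is already lowercase. */
static int is_boolean_attr(const char *name)
{
    size_t i;
    size_t count = sizeof(boolean_attrs) / sizeof(boolean_attrs[0]);

    for (i = 0; i < count; i++) {
        if (strcmp(name, boolean_attrs[i]) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical post-parse step: the parser reports a minimized attribute
 * with value == NULL, and only a consumer that wants the full form calls
 * this. Returns the name as the value for a minimized boolean attribute;
 * otherwise returns value unchanged (possibly NULL). */
static const char *expand_boolean_attr(const char *name, const char *value)
{
    if (value == NULL && is_boolean_attr(name))
        return name;
    return value;
}
```

With this split, a SAX consumer that wants to preserve <option selected>
simply never calls expand_boolean_attr, while a tree builder that wants
selected="selected" can call it on each attribute it receives.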

So I'm proposing this change to see if there are objections.

  Okay, I just did this as I think this makes sense overall. Change is
in git head.


Daniel Veillard      | libxml Gnome XML XSLT toolkit
daniel veillard com  | Rpmfind RPM search engine | virtualization library
