Re: [xml] HTMLparser comment parsing bug and patch



On Wed, Jul 30, 2003 at 11:16:54PM +0100, Nick Kew wrote:

On Wed, 30 Jul 2003, Daniel Veillard wrote:

  Anyway William Brake came up with another patch which seems correct
and he was able to reproduce the problem and fix it.

I don't recollect seeing that.  His first patch doesn't fix it.

  yep, off list, in CVS now ...

  No surprise that no Web browser ever used a real SGML parser and

Hmmm, emacs, qweb, something-on-Mac.  Not AFAIK a long list.

  yeah, unfortunately not really the ones Web authors test against ...

that as a result we ended up with that terrible mess that is "Web HTML".

Dealing with "Web HTML" is precisely where libxml's HTMLparser is useful.

  yes I'm pragmatic in this respect. I'm usually sticking to the spec
when implementing but in that case it would be either too hard or not
very useful.

  I'm really surprised though that the HTML Working Group kept the full
support for minimization in HTML, maybe SGML had no intermediate setting
to limit it just to optional end of tags, I dunno, that's just scary...

Some of us have raised that issue with them, but their only answer begins
with an X.  The supposed reason for the SHORTTAGS is to allow attribute
minimisation (things like <option selected> for <option selected=selected>),
but it's perfectly possible to separate that from the more troublesome
minimisations.  My practical solution is described at
<URL:http://valet.webthing.com/page/parsemode.html>.
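
To illustrate the attribute minimisation case mentioned above, here is a minimal sketch. It assumes Python's standard-library html.parser rather than libxml's HTMLparser, and the names AttributeCollector and parse_attributes are made up for this example. It only shows that a tolerant parser can expand <option selected> to the same result as <option selected=selected>; it does not touch the more troublesome SHORTTAG minimisations.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collect (tag, attributes) pairs, expanding minimised attributes.

    Hypothetical helper for illustration; not part of libxml.
    """

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports a minimised attribute such as `selected`
        # with a value of None; expand it to name=name.
        expanded = {
            name: (name if value is None else value)
            for name, value in attrs
        }
        self.seen.append((tag, expanded))


def parse_attributes(markup):
    """Return the start tags found in markup with expanded attributes."""
    collector = AttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


if __name__ == "__main__":
    # Both spellings should yield the same attribute mapping.
    print(parse_attributes("<option selected>"))
    print(parse_attributes("<option selected=selected>"))
```

Both calls should produce the same tag/attribute pair, which is the sense in which attribute minimisation can be handled separately from the rest.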

  okay, removing attribute minimisation would be too drastic, heh

Daniel

-- 
Daniel Veillard      | Red Hat Network https://rhn.redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


