Re: [xml] Parsing tag-soup HTML

From: Nick Kew <nick webthing com>
To: veillard redhat com
Cc: xml gnome org
Subject: Re: [xml] Parsing tag-soup HTML
Date: Sun, 17 Jun 2007 15:52:28 +0100

On Sun, 17 Jun 2007 10:18:29 -0400
Daniel Veillard <veillard redhat com> wrote:

> > So, what do you think?  Is this something the libxml2 project
> > would like to see, or would you prefer to steer well clear?
>
>   I'm not averse to adding a new HTML parsing option for 'tag soup',
> but you would have to define clearly what the new parsing strategy is
> before I (and others on this list) can say yes or no to that option.
> So what would the 'tag soup' parser do that the current HTML parser
> does not, and vice versa? If you could define this other than by an
> accumulation of specific cases then that's probably viable, but if
> it's just an ever-growing list of individual preferences on a
> case-by-case basis, it doesn't sound okay to say yes to your selection
> rather than some other application's own set.
>   Makes sense?


Thanks for the quick response.

Yes, of course I didn't expect a straight "yes" to such a vague
proposal.  My question concerned whether I should invest the time
and effort to determine the details of how this should look in the
context of HTMLparser.

I'll take your reply as a yes in principle, and dive into the code
to think it through a little more.  If it looks promising, I'll
come back to you with more concrete proposals.

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

