Re: [xml] Cleaning the Web - Implementing HTML 5 parsing in libxml2
- From: Stefan Behnel <stefan_ml behnel de>
- To: Karl Dubost <karl w3 org>
- Cc: xml gnome org, "Michael(tm) Smith" <mike w3 org>, Nick Kew <nick webthing com>
- Subject: Re: [xml] Cleaning the Web - Implementing HTML 5 parsing in libxml2
- Date: Mon, 18 Aug 2008 08:51:48 +0200
Hi,
Karl Dubost wrote:
> Nick Kew weighed in and proposed that we should target [6] libxml,
> which includes an HTML parser and is already supported by the Apache
> server and many other tools.
> [6] http://xmlsoft.org/html/libxml-HTMLparser.html
> From here it would be interesting to implement the HTML 5 parsing
> algorithm in libxml2. It would benefit the community at large.
Have you tried joining forces with the people who started the C implementation
of html5lib? Maybe they have ideas to contribute or (partially) working code
that you can look at. You may even be able to win them over to the
project.
In any case, the working implementations in Python and Java should get you
a lot closer to your goal once you look under the hood.
Stefan