RE: [xml] Apparently incorrect paragraph wrapping in HTML parser
- From: <John Hockaday ga gov au>
- To: <xml gnome org>
- Subject: RE: [xml] Apparently incorrect paragraph wrapping in HTML parser
- Date: Thu, 12 Jan 2006 09:47:58 +1100
Hi All,
I personally believe that the parser's behaviour should be based on the DTD
being used for the HTML.
I use XHTML Strict (-//W3C//DTD XHTML 1.0 Strict//EN) and I would expect any
conversion from XML to XHTML to make an XML document that is valid against
the DTD. If libxml2 does what you want then this will *not* be the case and
hence all my XHTML would be invalid.
In the future people will have problems with their XHTML if they do not
consider using the strict version. The semantic web and machine to machine
communications will need to depend on the documents being as compliant as
possible to the standard.
"-//W3C//DTD XHTML 1.0 Transitional//EN" is supposed to be for "transitional"
use while one is moving from HTML 4.0 to XHTML. I believe that
XHTML 1.0 Strict is the expected standard until it is replaced by the W3C XML
Schema version. For this reason I believe libxml2 should produce
XHTML 1.0 Strict by default. If not, then should libxml2 be creating HTML 1.0?
;--)
If people want to have a less strict XHTML output then there should be an
option of generating XHTML-Transitional or another HTML standard.
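A rough sketch of what such an option could look like, written in Python
rather than against libxml2's C API. The helper name doctype_for and the
flavour keys are hypothetical; only the public and system identifiers are the
ones the W3C publishes for XHTML 1.0.

```python
# Hypothetical helper: pick the DOCTYPE declaration for an output flavour.
# The identifiers are the W3C ones for XHTML 1.0; everything else is assumed.
XHTML_DOCTYPES = {
    "strict": (
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
    "transitional": (
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
    ),
}


def doctype_for(flavour="strict"):
    """Return the DOCTYPE declaration text; Strict is the default."""
    public_id, system_id = XHTML_DOCTYPES[flavour]
    return '<!DOCTYPE html PUBLIC "%s"\n "%s">' % (public_id, system_id)
```

Calling doctype_for() with no argument gives the Strict declaration, and
doctype_for("transitional") is the opt-in for the looser output.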
No matter what the output, it should have a prolog that includes an
XML declaration (if it is XHTML) and a valid DOCTYPE declaration. The rest
of the document should be valid according to the DTD named in the
DOCTYPE. For example:
<?xml version="1.0" encoding="iso-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>Some content</title>
</head>
<body>
<p>Some more content.</p>
</body>
</html>
This will ensure that people can validate the XHTML according to the
intentions of the author. It will also ensure that the output will be
exactly as intended by the standard and not depend on the interpretation of
the browser being used.
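As a small illustration of why the DTD matters here: XHTML 1.0 Strict does
not allow character data directly inside <body>; it has to sit in a
block-level element such as <p>. The sketch below, in Python with only the
standard library, checks well-formedness and reports any text found directly
under <body>. It is not DTD validation (a validating parser, for example
xmllint --valid, is the tool for that), and the function name
stray_body_text is my own.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def stray_body_text(xhtml):
    """Return character data that sits directly inside <body>.

    Raises xml.etree.ElementTree.ParseError if the input is not
    well-formed XML. This covers one Strict rule only, not the whole DTD.
    """
    root = ET.fromstring(xhtml)
    body = root.find(XHTML_NS + "body")
    if body is None:
        return []
    # Text before the first child, plus the text trailing each child element.
    pieces = [body.text] + [child.tail for child in body]
    return [p.strip() for p in pieces if p and p.strip()]
```

An empty list means no bare text was found under <body>; a missing end tag
anywhere in the document shows up as a ParseError instead.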
Thanks.
John
-----Original Message-----
From: xml-bounces gnome org [mailto:xml-bounces gnome org] On Behalf Of iSteve
Sent: Thursday, 12 January 2006 4:59 AM
To: xml gnome org
Subject: Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
Daniel Veillard said (on IRC, though not on this mailing list) that he's
waiting for feedback about how to fix this issue.
I for myself have to say I'd remove it completely; I do not think it is
reasonable to have the document altered by the parser in first place;
I've found some more issues with it in the mailing list, too.
(http://mail.gnome.org/archives/xml/2002-October/msg00047.html)
Since DV didn't ask himself -- it is after all me who wants this bug
fixed -- I have to ask: what do all of you think about it? Any
suggestions how it should be resolved?
-- iSteve
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml