RE: [xml] Apparently incorrect paragraph wrapping in HTML parser

Hi All,

I personally believe that it should be based on the DTD being used for the

I use XHTML Strict (-//W3C//DTD XHTML 1.0 Strict//EN) and I would expect any
conversion from XML to XHTML to make an XML document that is valid against
the DTD.  If libxml2 does what you want then this will *not* be the case and
hence all my XHTML would be invalid.

In the future people will have problems with their XHTML if they do not
consider using the strict version.  The semantic web and machine-to-machine
communications will need to depend on the documents being as compliant as
possible with the standard.

"-//W3C//DTD XHTML 1.0 Transitional//EN" is supposed to be for "transitional"
use while one is going from HTML 4.0 to XHTML.  I believe that
XHTML1.0-Strict is the expected standard until it is replaced by the W3C XML
Schema version.  For this reason I believe libxml2 should automatically
provide XHTML1.0-Strict.  If not then should libxml2 be creating HTML 1.0?

If people want to have a less strict XHTML output then there should be an
option of generating XHTML-Transitional or another HTML standard.

No matter what the output, the XHTML should have a prolog that includes an
XML declaration (if it is XHTML) and a valid DOCTYPE declaration.  The rest
of the XML document should be valid according to the DTD of the stated

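<?xml version="1.0" encoding="iso-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>Some content</title>
</head>
<body>
<p>Some more content.</p>
</body>
</html>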

This will ensure that people can validate the XHTML according to the
intentions of the author.  It will also ensure that the output will be
exactly as intended by the standard and not depend on the interpretation of
the browser being used.
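
As a rough illustration of the prolog requirement described above, here is a
minimal sketch in Python (standard library only).  It is not libxml2 code and
it does not validate against the DTD; it only checks that a document is
well-formed, starts with an XML declaration, carries the XHTML 1.0 Strict
public identifier, and puts its root element in the XHTML namespace.  The
function name check_prolog and the SAMPLE document are made up for this
example.

```python
from xml.dom import minidom

XHTML_NS = "http://www.w3.org/1999/xhtml"
STRICT_PUBLIC_ID = "-//W3C//DTD XHTML 1.0 Strict//EN"

# Hypothetical sample document, encoded the way its XML declaration says.
SAMPLE = (
    '<?xml version="1.0" encoding="iso-8859-1" ?>\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n'
    '    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">\n'
    '<head><title>Some content</title></head>\n'
    '<body><p>Some more content.</p></body>\n'
    '</html>\n'
).encode("iso-8859-1")


def check_prolog(data):
    """Return a list of prolog problems; an empty list means none were found.

    This is a prolog check only, not DTD validation.
    """
    problems = []
    # The XML declaration, if present, must be the very first thing in the file.
    if not data.startswith(b"<?xml"):
        problems.append("missing XML declaration")
    # parseString raises xml.parsers.expat.ExpatError if the input is not well-formed.
    doc = minidom.parseString(data)
    doctype = doc.doctype
    if doctype is None:
        problems.append("missing DOCTYPE declaration")
    elif doctype.publicId != STRICT_PUBLIC_ID:
        problems.append("unexpected public identifier: %r" % doctype.publicId)
    if doc.documentElement.namespaceURI != XHTML_NS:
        problems.append("root element is not in the XHTML namespace")
    return problems


if __name__ == "__main__":
    print(check_prolog(SAMPLE))
```

Checking the content against the DTD itself needs a validating parser; with
libxml2 installed, running xmllint with the --valid option on the file is one
way to do that.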



-----Original Message-----
From: xml-bounces gnome org [mailto:xml-bounces gnome org] On Behalf Of iSteve
Sent: Thursday, 12 January 2006 4:59 AM
To: xml gnome org
Subject: Re: [xml] Apparently incorrect paragraph wrapping in HTML parser

Daniel Veillard said (on IRC, though not on this mailing list) that he's
waiting for feedback about how to fix this issue.

For my part, I'd remove it completely; I do not think it is reasonable to
have the document altered by the parser in the first place. I've found some
more issues with it in the mailing list, too.

Since DV didn't ask himself -- it is after all me who wants this bug 
fixed -- I have to ask: what do all of you think about it? Any 
suggestions how it should be resolved?

  -- iSteve