Re: [xml] Apparently incorrect paragraph wrapping in HTML parser

From: iSteve <isteve deadcd org>
To: xml gnome org
Subject: Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
Date: Mon, 09 Jan 2006 15:41:34 +0100

> I'm not sure text nodes are to be accepted directly as child of a body element


It is valid for HTML 4.01 Transitional, which, let's be fair, is quite
a common (if not the most common) standard used for websites. You are
right, however, that it is not valid in HTML 4.01 Strict: in
Transitional, <body> may contain inline elements (and text) directly,
while in Strict it cannot.

Maybe the paragraph wrapping should be turned on only when the
standard requires it (i.e. Strict), and turned off otherwise (e.g.
Transitional, a user-defined doctype, or no doctype at all)?

>   For div, it seems adding the <p> is superfluous
>
>   http://www.w3.org/TR/REC-html40/struct/global.html#edef-DIV
>
> <!ELEMENT DIV - - (%flow;)* -- generic language/style container -->


Yes, the two appended samples were in fact how I found out about the issue.

>>I do not believe that wrapping the text into paragraph (which, I
>>believe, is performed by htmlCheckParagraph()) is the best way; perhaps
>>setting the tag name to eg. NULL instead, or a zero-size string (as a
>
>   element with no name or element with empty names would break so much
>   code assuming a correct that nothing could justify such a hack, sorry !!!


True, it was just a quick hack idea for resolving the CSS application...


I believe the best solution may really be to turn the paragraph
wrapping off by default, and to turn it on only after checking the
doctype, or perhaps through a function exported in the API... That would
resolve the issue both for all the non-strict websites and for
user-specified HTML for private usage.
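
To make the idea concrete, here is a rough sketch in C of how such a
decision could look. None of these names exist in libxml2: para_wrap_mode
and should_wrap_body_text() are made up for illustration. I am also
assuming that only the HTML 4.01 Strict public identifier should trigger
the wrapping; a real implementation would probably want to recognise more
doctypes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical setting, NOT part of the libxml2 API. */
typedef enum {
    PARA_WRAP_BY_DOCTYPE = 0, /* default: wrap only if the doctype asks for it */
    PARA_WRAP_ALWAYS,         /* caller explicitly turned wrapping on */
    PARA_WRAP_NEVER           /* caller explicitly turned wrapping off */
} para_wrap_mode;

/* Returns 1 if the public identifier is exactly HTML 4.01 Strict.
 * A missing doctype (NULL) and every other doctype return 0. */
int doctype_is_html401_strict(const char *public_id)
{
    if (public_id == NULL)
        return 0;
    return strcmp(public_id, "-//W3C//DTD HTML 4.01//EN") == 0;
}

/* Decides whether text found directly under <body> should be wrapped
 * into <p>. An explicit setting wins; otherwise the doctype decides. */
int should_wrap_body_text(para_wrap_mode mode, const char *public_id)
{
    switch (mode) {
    case PARA_WRAP_ALWAYS:
        return 1;
    case PARA_WRAP_NEVER:
        return 0;
    default:
        return doctype_is_html401_strict(public_id);
    }
}
```

With this, a Transitional document, a user-defined doctype or a document
with no doctype at all would be left alone unless the caller asks for the
wrapping explicitly.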

-- iSteve
PS: Is it specified anywhere in the standard that the blocks should be
wrapped into <p> by the parser? I mean, how did it come up in the first
place? I've tried searching the mailing list, but the only relevant
information I found was that somebody had trouble with it when using
user-defined HTML elements...

Follow-Ups:
- Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
  - From: Liam R E Quin

References:
- [xml] Apparently incorrect paragraph wrapping in HTML parser
  - From: iSteve
- Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
  - From: Daniel Veillard
