Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
- From: iSteve <isteve deadcd org>
- To: xml gnome org
- Subject: Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
- Date: Mon, 09 Jan 2006 15:41:34 +0100
> I'm not sure text nodes are to be accepted directly as child of a
It is valid for HTML 4.01 Transitional -- which, let's be fair, is quite
a common (if not the most common) standard used for websites. You are
however right that it is not valid with HTML 4.01 Strict. In
transitional, <body> may contain inline elements, while in strict it can
Maybe the paragraph wrapping should be turned on only when the
standard requires it (i.e. Strict), and otherwise (e.g. Transitional,
a user-defined doctype, or no doctype at all) turned off?
> For div, it seems adding the <p> is superfluous
> <!ELEMENT DIV - - (%flow;)* -- generic language/style
Yes, the two appended samples were in fact how I found out about the issue.
>>I do not believe that wrapping the text into paragraph (which, I
>>believe, is performed by htmlCheckParagraph()) is the best way;
>>setting the tag name to eg. NULL instead, or a zero-size string (as a
> element with no name or element with empty names would break so much
> code assuming a correct that nothing could justify such a hack, sorry
True, it was just a quick hack idea for resolving the CSS application...
I believe the best solution may really be turning off the paragraph
wrapping by default, and only turning it on by checking the doctype or
perhaps by exporting a function in the API... That would resolve both
the issues with all the non-strict websites, and those with
user-specified HTML for private usage.
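
To make the doctype idea a bit more concrete, here is a rough sketch in
C of the decision I have in mind. The names para_wrap_policy and
para_wrap_policy_for_doctype() are made up for illustration and are not
part of libxml2; the sketch also assumes the parser can hand over the
doctype's public identifier (or NULL when there is no doctype), and it
compares against the HTML 4.01 Strict identifier by exact match only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical policy values; not part of the libxml2 API. */
typedef enum {
    PARA_WRAP_OFF = 0,
    PARA_WRAP_ON  = 1
} para_wrap_policy;

/* Public identifier of the HTML 4.01 Strict DTD. */
const char HTML401_STRICT_ID[] = "-//W3C//DTD HTML 4.01//EN";

/*
 * Decide whether stray text should be wrapped into <p>.
 * public_id is the doctype's public identifier, or NULL when the
 * document has no doctype at all. Wrapping is enabled only for
 * HTML 4.01 Strict; Transitional, user-defined and missing doctypes
 * all leave it off.
 */
para_wrap_policy
para_wrap_policy_for_doctype(const char *public_id)
{
    if (public_id == NULL)
        return PARA_WRAP_OFF;
    if (strcmp(public_id, HTML401_STRICT_ID) == 0)
        return PARA_WRAP_ON;
    return PARA_WRAP_OFF;
}
```

An exported API switch could then simply override whatever this
function returns, which would cover the user-specified HTML case too.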
PS.: Is it specified anywhere in the standard that the blocks should be
wrapped into <p> by the parser? I mean, how was it brought up in the
first place? I've tried searching the mailing list, but the only
relevant information I've found was that somebody had trouble with it
when using user-defined HTML elements...