Re: [xml] Apparently incorrect paragraph wrapping in HTML parser

From: iSteve <isteve deadcd org>
To: xml gnome org
Subject: Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
Date: Thu, 12 Jan 2006 14:35:10 +0100

  Yes, thanks! That sounds like the right approach to me. I would just
merge that with a new htmlParserOption HTML_PARSE_STRICT, which could be
either passed by the user to maintain the current behaviour or activated by
default when the DOCTYPE is read, if it happens to be a Strict HTML one.

Yes, checking the DTD is indeed an option, though I'm not sure how it would handle the case in which I link a DTD myself.

E.g.:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://very.silly/html401-like/but/not/exactly/strict.dtd">
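
To make the ambiguity concrete, here is a minimal sketch in C of one way such a check could be written. It is only an illustration: doctype_is_w3c_strict is a hypothetical helper, not a libxml2 function, and the policy it encodes (trust the public identifier only when the system identifier is absent or points at the W3C copy of the Strict DTD) is an assumption, not something the parser is known to do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of libxml2.
 * Returns true only when the DOCTYPE names a W3C Strict HTML 4 DTD by its
 * public identifier AND the system identifier is either absent or the
 * matching W3C URL. A self-hosted DTD is treated as "not known to be
 * Strict", whatever its public identifier claims. */
static bool doctype_is_w3c_strict(const char *public_id, const char *system_id)
{
    static const struct {
        const char *public_id;
        const char *system_id;
    } strict_dtds[] = {
        { "-//W3C//DTD HTML 4.01//EN", "http://www.w3.org/TR/html4/strict.dtd" },
        { "-//W3C//DTD HTML 4.0//EN",  "http://www.w3.org/TR/REC-html40/strict.dtd" },
    };

    if (public_id == NULL)
        return false;

    for (size_t i = 0; i < sizeof strict_dtds / sizeof strict_dtds[0]; i++) {
        if (strcmp(public_id, strict_dtds[i].public_id) != 0)
            continue;
        /* Public identifier matches; accept a missing system identifier
         * or the canonical W3C one, reject anything else. */
        return system_id == NULL ||
               strcmp(system_id, strict_dtds[i].system_id) == 0;
    }
    return false;
}
```

With this policy, the DOCTYPE above would not switch strict handling on, because its system identifier is not the W3C one. A check that looked at the public identifier alone would switch it on, which is exactly the case I am unsure about.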

Anyway, I do not see any reason why the parser should mess with the document in the first place; it's supposed to parse it, not deliberately alter it according to what it thinks may be the right solution. Could someone please explain to me why it should alter the document?

And please, do not say "to be compliant with standards", because the standards, to the best of my knowledge, do not require the parser to "fix" the document by adding tags when it is not considered correct (though I may be wrong, I doubt standards would require such a thing).


 -- iSteve

PS.: The <p> tag injection is not correct anyway. The "<img>" tag is inline, yet it is not wrapped in <p>. Still want to keep it?


For details, see: http://www.w3.org/TR/REC-html40/sgml/dtd.html#inline
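
For reference, the %inline; entity in the HTML 4 Strict DTD can be written down as a simple lookup. The sketch below is only an illustration in C: is_html4_inline is a hypothetical helper, not a libxml2 function, and the element list is transcribed from the %fontstyle;, %phrase;, %special; and %formctrl; entities of the Strict DTD.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Case-insensitive comparison of two ASCII element names. */
static bool name_equals(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return false;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Hypothetical helper, NOT part of libxml2.
 * Returns true when `name` is one of the elements covered by %inline;
 * in the HTML 4 Strict DTD (character data itself is not an element
 * and is therefore not listed). */
static bool is_html4_inline(const char *name)
{
    static const char *const inline_elements[] = {
        /* %fontstyle; */
        "tt", "i", "b", "big", "small",
        /* %phrase; */
        "em", "strong", "dfn", "code", "samp", "kbd", "var", "cite",
        "abbr", "acronym",
        /* %special; */
        "a", "img", "object", "br", "script", "map", "q", "sub", "sup",
        "span", "bdo",
        /* %formctrl; */
        "input", "select", "textarea", "label", "button",
    };

    if (name == NULL)
        return false;
    for (size_t i = 0; i < sizeof inline_elements / sizeof inline_elements[0]; i++) {
        if (name_equals(name, inline_elements[i]))
            return true;
    }
    return false;
}
```

By this table "img" counts as inline in the same way as "b" or "span", so a rule that wraps stray inline content in <p> would be expected to treat all of them alike.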

Follow-Ups:
- Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
  - From: Bruce Miller

References:
- RE: [xml] Apparently incorrect paragraph wrapping in HTML parser
  - From: John.Hockaday
- Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
  - From: Gary Coady
- Re: [xml] Apparently incorrect paragraph wrapping in HTML parser
  - From: Daniel Veillard
