Re: [xml] XML/HTML Mixed mode parsing

From: GPN <gpn libxml gmail com>
To: veillard redhat com
Cc: xml gnome org
Subject: Re: [xml] XML/HTML Mixed mode parsing
Date: Mon, 26 Sep 2005 20:30:13 +0530

Daniel Veillard wrote:

On Mon, Sep 26, 2005 at 07:42:40PM +0530, GPN wrote:

Hello,
Today's web page designs include both XML and HTML content,
most commonly XML content embedded in HTML pages.

My questions are -
1. In order to handle both HTML and XML, it is better to use
the xml...() APIs rather than the html...() APIs.
Is this correct?

Either it is a complete XML document or it is not. If it is, then use
an XML parser. There is no intermediate or compound level at the spec
level.

2. Handling errors encountered while parsing the content.
I think (or am assuming) that HTML errors are to be ignored,

no.

and hence most browsers do not complain about a page even
if it has errors. (This can be turned on though, but the
page display does not stop if there was an error).



  Right, that's how browsers interpret HTML 4.x, based on SGML, with
a text/html MIME type. If there is an XML MIME type, they must use a
real XML parser and fail on fatal errors.
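
The rule above (lenient parsing for text/html, strict parsing for an XML MIME type) could be sketched roughly as follows. This is only an illustration in Python, not libxml2 code: the function name parser_mode and the exact set of MIME types treated as XML are assumptions made for the example.

```python
# Hypothetical helper: decide which kind of parser a Content-Type calls for.
# The set below is an assumption for illustration, not an exhaustive list.
XML_MIME_TYPES = {"text/xml", "application/xml"}


def parser_mode(content_type: str) -> str:
    """Return "xml" for XML MIME types (strict parsing), else "html"."""
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in XML_MIME_TYPES or mime.endswith("+xml"):
        return "xml"   # a real XML parser, which fails on fatal errors
    return "html"      # a lenient HTML parser


if __name__ == "__main__":
    for header in ("text/html; charset=UTF-8", "application/xhtml+xml"):
        print(header, "->", parser_mode(header))
```

The choice is made once per document from the declared type, which matches the point that there is no mixed level in between.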

I am trying to find out whether there is a viable solution for this.
I need to parse HTML pages which will have XML content.
a) If I use an XML parser, then the parsing process will stop
even if the error is only in the HTML tags.
b) If I use an HTML parser, then the tags/attributes will be converted
to lower case (breaking XML rules).
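
Both symptoms in a) and b) can be reproduced with a small sketch. It uses Python's standard xml.etree.ElementTree and html.parser modules as stand-ins for the libxml2 xml...() and html...() APIs, so it only illustrates the general behaviour; the sample page and the names xml_attempt, html_names and NameRecorder are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Made-up page: an unclosed HTML <br> next to a mixed-case XML element.
PAGE = '<html><body><br><myData itemId="7">x</myData></body></html>'


class NameRecorder(HTMLParser):
    """Record tag and attribute names as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, [name for name, _ in attrs]))


def xml_attempt(text):
    """Case a): return the XML parser's error message, or None if well-formed."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)


def html_names(text):
    """Case b): return (tag, attribute names) pairs seen by the HTML parser."""
    recorder = NameRecorder()
    recorder.feed(text)
    recorder.close()
    return recorder.seen


if __name__ == "__main__":
    print("XML parser:", xml_attempt(PAGE))
    print("HTML parser:", html_names(PAGE))
```

With this input the XML parser gives up because of the unclosed <br>, while the HTML parser accepts the page but reports myData and itemId as mydata and itemid.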

Best Regards,
GPN

