Re: [xml] XML/HTML Mixed mode parsing

On Mon, Sep 26, 2005 at 07:42:40PM +0530, GPN wrote:
Today's web page designs include both XML and HTML content,
most popularly XML content embedded into HTML pages.

My questions are -
1. In order to handle both HTML and XML, it is better to use
the xml...() APIs rather than the html...() APIs.
Is this correct?

  Either it is a complete XML document or it is not. If it is,
then use an XML parser. There is no intermediate or compound
level at the spec level.
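
  A minimal sketch of that either/or test, using Python's standard-library
Expat bindings rather than libxml2. The function name is hypothetical and
the check covers well-formedness only, not validity:

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text: str) -> bool:
    """Return True only if `text` parses as one complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Any well-formedness error (mismatched tags, several root
        # elements, empty input, ...) means "not XML": no middle ground.
        return False
    return True
```

Content that fails this check would go to an HTML parser instead, or back
to its author to be fixed.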

2. Handling errors encountered while parsing the content.
I think (or am assuming) that HTML errors are to be ignored,
and hence most browsers do not complain about a page even
if it has errors. (Error reporting can be turned on, but the
page display does not stop if there was an error.)

  Right, that's how browsers interpret SGML-based HTML 4.x served with
a text/html MIME type. If the content has an XML MIME type they must
use a real XML parser and fail on fatal errors.
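
  The split by MIME type can be sketched as follows. This is an
illustration in standard-library Python, not how any browser or libxml2
implements it; `TagCollector`, `parse_by_mime_type` and the set of XML
MIME types are assumptions made for the example:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Assumed (non-exhaustive) set of MIME types that demand a real XML parser.
XML_MIME_TYPES = {"text/xml", "application/xml", "application/xhtml+xml"}


class TagCollector(HTMLParser):
    """Lenient HTML tokenizer: records tags and never raises on bad nesting."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def parse_by_mime_type(content: str, mime_type: str):
    """Parse strictly for XML MIME types, leniently for text/html."""
    if mime_type in XML_MIME_TYPES:
        # Strict path: a well-formedness error raises ET.ParseError
        # and no tree is returned.
        return ET.fromstring(content)
    if mime_type == "text/html":
        # Lenient path: the unclosed <b> below is simply reported as seen.
        collector = TagCollector()
        collector.feed(content)
        collector.close()
        return collector.events
    raise ValueError(f"unsupported MIME type: {mime_type}")


if __name__ == "__main__":
    broken = "<p><b>text</p>"
    print(parse_by_mime_type(broken, "text/html"))
    try:
        parse_by_mime_type(broken, "application/xhtml+xml")
    except ET.ParseError as err:
        print("fatal XML error:", err)
```

The same broken input yields a list of tag events on the text/html path
and an exception on the XML path, which is the behaviour described above.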

However, in XML, errors are flagged more strictly to the
user, and normally result in stopping the parsing process,
e.g. an opening and ending tag mismatch.
I was wondering if there was any compile-time flag or
modification to turn off this behaviour.

  No, because that doesn't make any sense from a spec point of view,
would render the parser non-compliant, and would allow people to
claim they are using interoperable XML when they are not. The behaviour
of XML parsers on errors is strictly defined, precisely because people
didn't want to duplicate the HTML mess.

  There is an API to not make errors fatal. It should not be used
by default but only for recovery processing. If I get *any* feedback
that this has been abused as default processing I will kill this
option, and I'm dead serious about this.

  If you get broken XML, get it fixed! If you see tools or browsers
which parse broken XML, report it to the authors so they get it fixed;
if they don't, report them to xml-dev.


Daniel Veillard      | Red Hat Desktop team
veillard redhat com  | libxml GNOME XML XSLT toolkit | Rpmfind RPM search engine
