RE: [xml] HTML parsing problem (choking on embedded HTML tags) still exists for me

From: "Cyrill Osterwalder" <cyrill osterwalder seclutions com>
To: "Bruce Miller" <bruce miller nist gov>, <xml gnome org>
Cc: Cyrill Osterwalder <cyrill osterwalder seclutions com>
Subject: RE: [xml] HTML parsing problem (choking on embedded HTML tags) still exists for me
Date: Fri, 27 May 2005 17:26:40 +0200


Hi Bruce

Thanks for your insights. I'm aware that Daniel really understands and
enforces even the nasty details of the specs (as opposed to me). That is why
libxml is actually what the world wants.

> But rather than ending with </script> (or </style>), they end
> at the first </[a-zA-Z]
> See
> http://www.w3.org/TR/html4/appendix/notes.html#notes-specifying-data
> So, according to the spec, your example is illegal; it should contain
> <\/HEAD> and <\/HTML>
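
For illustration, the rule quoted above could be sketched in Python roughly
as follows. This is only a minimal sketch, not libxml's actual code; the
function names cdata_end and escape_etago are hypothetical, and it assumes
the "</" sequences to be escaped sit inside JavaScript string literals,
where "<\/" is read the same as "</".

```python
import re

# HTML 4 (appendix B.3.2): the content of SCRIPT/STYLE ends at the first
# "</" that is immediately followed by a letter, not only at "</script>".
_ETAGO = re.compile(r"</[a-zA-Z]")


def cdata_end(content: str) -> int:
    """Return the index where the element content ends under the HTML 4
    rule, or -1 if no terminating "</" + letter sequence is present."""
    match = _ETAGO.search(content)
    return match.start() if match else -1


def escape_etago(script: str) -> str:
    """Rewrite every "</" + letter as "<\\/" + letter so that it no longer
    terminates the element (hypothetical helper for script authors)."""
    # A function replacement avoids backslash processing in the template.
    return re.sub(r"</(?=[a-zA-Z])", lambda _m: "<\\/", script)


if __name__ == "__main__":
    body = 'document.write("</HEAD>");'
    print(cdata_end(body))                 # position of the embedded "</HEAD"
    print(escape_etago(body))              # same script with "<\/HEAD>"
    print(cdata_end(escape_etago(body)))   # no terminator left
```

With the escaped form, a parser following the spec no longer sees an end of
the script content in the middle of the string literal.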


That is great input indeed! Thanks. 

There is no sense in trying to process illegal HTML.

Cyrill
