[xml] HTML Parser: Choking on quoted HTML tags in javascript?




When the HTML parser is used on documents that contain HTML tags inside quoted
JavaScript strings, it seems to process those tags as markup and to stop
parsing the script at that point. Is that "by design", or is it a bug?

Example:
When the parser processes the following HTML page, it seems to interpret the
quoted "</HEAD>" end tag (at **) as real markup and inserts the
"</script></head><body>" tags that it assumes are missing. The same thing
happens with the subsequent quoted "</HTML>" tag (at ***).

<html>
<head>
<title>TEST LIBXML HTML PARSER</title>
<script LANGUAGE="JavaScript">
function preview(textarea_obj) {
        var txt = get_textarea(textarea_obj);
        var pop_win = window.open("", "win", "width=400,height=250");
        pop_win.document.open("text/html", "replace");
        pop_win.document.write("<HTML>");
        pop_win.document.write("<HEAD>");
        pop_win.document.write("<title>Post Previewer</title>");
        pop_win.document.write("<link rel=stylesheet type=text/css href=default.css>");
**      pop_win.document.write("</HEAD>");
        pop_win.document.write(txt);
***     pop_win.document.write("</HTML>");
        pop_win.focus();
}
</script>
</head>
<body>
...
...



Is anybody here experienced in using the HTML parser on documents similar to
the above example? Maybe I'm just misreading what the parser does.
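
For what it's worth, one workaround I could imagine is to keep the literal
character sequence "</" followed by a tag name out of the script source by
writing the slash as "\/" inside the strings. In a JavaScript string "\/" is
just "/", so the browser still sees the same markup. This is only a sketch:
the helper name previewMarkup is made up for illustration, and it rests on the
untested assumption that the parser reacts only to a literal "</" directly
followed by a tag name.

```javascript
// Hypothetical helper (not part of the original page): builds the preview
// markup without the script source ever containing "<" + "/" + a tag name.
function previewMarkup(txt) {
    return [
        "<HTML>",
        "<HEAD>",
        // "\/" is an ordinary escape in a JavaScript string and yields "/".
        "<title>Post Previewer<\/title>",
        "<link rel=stylesheet type=text/css href=default.css>",
        "<\/HEAD>",
        txt,
        "<\/HTML>"
    ].join("");
}
```

In preview(), the series of pop_win.document.write() calls could then be
replaced by a single pop_win.document.write(previewMarkup(txt)).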

Best regards,

Cyrill


