[xml] strange end-tag position (parsing html)



Hi,
I'm trying to parse bare.txt (attached; it is simply the HTML of cnn.com)
using parse.c (also attached).
The output is in output.txt (attached as well).
In bare.txt there is a <script> block from line 826 to line 886. In
output.txt the <script> tag is on line 759, but the end tag (</script>)
is on line 784. The problem is that this end tag lands in the middle of
the JavaScript code, which breaks the script :(
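In case it helps to double-check the line numbers, here is a minimal sketch in plain C. It is not taken from parse.c, and the names find_tag_lines and starts_with_ci are made up for this illustration. It records the 1-based line of every case-insensitive occurrence of a tag prefix such as "<script" or "</script" in a text buffer. It does plain substring matching and does not parse HTML, so it would also report matches inside comments or string literals.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `tag` occurs at the start of `s`,
 * ignoring ASCII case, and 0 otherwise. */
static int starts_with_ci(const char *s, const char *tag)
{
    while (*tag != '\0') {
        if (*s == '\0' ||
            tolower((unsigned char)*s) != tolower((unsigned char)*tag))
            return 0;
        s++;
        tag++;
    }
    return 1;
}

/* Hypothetical helper: scans `text` and stores the 1-based line number of
 * each occurrence of `tag` in `lines` (at most `max` entries are written).
 * Returns the total number of occurrences found, which may exceed `max`. */
size_t find_tag_lines(const char *text, const char *tag,
                      size_t *lines, size_t max)
{
    size_t line = 1;
    size_t found = 0;
    const char *p;

    assert(text != NULL && tag != NULL && tag[0] != '\0');

    for (p = text; *p != '\0'; p++) {
        if (*p == '\n') {
            line++;            /* the next character starts a new line */
            continue;
        }
        if (starts_with_ci(p, tag)) {
            if (found < max)
                lines[found] = line;
            found++;
        }
    }
    return found;
}
```

Running this over the contents of bare.txt and of output.txt with "</script" as the tag should make it easy to compare where the end tag sits in each file.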
I hope the problem is clear. If not, don't hesitate to ask via the
list or directly (if you prefer).
Thanks for your help

Attachment: bare.txt
Description: Text document

Attachment: parse.c
Description: Text Data

Attachment: output.txt
Description: Text document


