Re: [xml] html parsing incomplete - bug?
- From: "Martin (gzlist)" <gzlist googlemail com>
- To: Stefan Behnel <stefan_ml behnel de>
- Cc: xml gnome org, Lydia Patrovic <lydia patrovic rbcmail ru>
- Subject: Re: [xml] html parsing incomplete - bug?
- Date: Tue, 13 Oct 2009 12:39:06 +0100
On 13/10/2009, Stefan Behnel <stefan_ml behnel de> wrote:
Lydia Patrovic wrote:
Note the "main&20090924_2" attribute value, which can be interpreted as an
unterminated entity.
:) Nice little Freudian copy&paste quoting error. Here's the line from the
real 'HTML' file:
<script type="text/javascript" src="merge.php?f=main&20090924_2"></script>
Note the unescaped '&' character in the URL.
I'd have thought the embedded null at byte 532 would be the cause. Try
bytes.replace("\x00", "") before treating it as a C string. That seems
to get the document parsed pretty much as expected for me.
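
A minimal sketch of that workaround, assuming Python 3 and only the standard library. The sample document and the helper names (strip_nul_bytes, as_c_string) are made up for illustration; ctypes stands in here for any C API that reads a NUL-terminated string, which is roughly what happens when the data is handed to a C parser as a plain char pointer.

```python
import ctypes


def strip_nul_bytes(data: bytes) -> bytes:
    """Return data with every embedded NUL byte removed."""
    return data.replace(b"\x00", b"")


def as_c_string(data: bytes) -> bytes:
    """Return what code reading a NUL-terminated C string would see."""
    # create_string_buffer copies the bytes; .value stops at the first NUL.
    return ctypes.create_string_buffer(data).value


# Hypothetical document with a stray NUL byte in the middle.
raw = b"<html><body><p>before</p>\x00<p>after</p></body></html>"

truncated = as_c_string(raw)                   # cut off at the NUL
cleaned = as_c_string(strip_nul_bytes(raw))    # whole document survives

print(len(raw), len(truncated), len(cleaned))
```

The cleaned bytes can then be passed to whichever parser is in use. Note that on Python 3 both arguments to replace() have to be bytes literals.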
Martin