Re: [xml] HTMLparser: SGML comments
- From: Daniel Veillard <veillard redhat com>
- To: Michael Day <mikeday yeslogic com>
- Cc: xml gnome org
- Subject: Re: [xml] HTMLparser: SGML comments
- Date: Mon, 14 Nov 2005 09:52:06 -0500
On Mon, Nov 14, 2005 at 06:53:44PM +1100, Michael Day wrote:
Hi Daniel,
I don't really know SGML, so such patches are welcome. I just have one
problem with the code: it calls GROW only when the end of the buffer is
detected with a NUL. I would rather have it called more preemptively
in the loop, to avoid a potential weakness in the case of multibyte chars.
I have changed the patch to call GROW in the loop each time before moving
on to the next character. (I don't know whether I should be calling SHRINK
as well, though?)
Probably not needed ...
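
To illustrate the point about calling GROW on every iteration, here is a minimal
sketch in C. It is not libxml2 code: the Input struct, input_init, input_grow and
scan_comment_body are hypothetical stand-ins, and the tiny WINDOW size is an
assumption chosen only so that refills happen often. The idea is simply that the
lookahead is topped up before each character is examined, rather than only once
the end of the buffer has been hit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WINDOW 8     /* deliberately tiny so refills happen often */
#define LOOKAHEAD 3  /* bytes needed to recognise "-->" */

/* Hypothetical input: a small window refilled from a larger source. */
typedef struct {
    const char *src;   /* whole document (stands in for a file or socket) */
    size_t src_len;    /* length of src in bytes */
    size_t src_pos;    /* next byte of src not yet copied into buf */
    char buf[WINDOW];  /* current window */
    size_t len;        /* valid bytes in buf */
    size_t cur;        /* read position inside buf */
} Input;

static void input_init(Input *in, const char *src) {
    memset(in, 0, sizeof *in);
    in->src = src;
    in->src_len = strlen(src);
}

/* Plays the role of GROW: keep at least LOOKAHEAD bytes available
 * whenever the source still has data. */
static void input_grow(Input *in) {
    assert(in->cur <= in->len);
    size_t left = in->len - in->cur;
    if (left >= LOOKAHEAD || in->src_pos >= in->src_len)
        return;
    /* Slide the unread tail to the front, then refill behind it. */
    memmove(in->buf, in->buf + in->cur, left);
    size_t room = WINDOW - left;
    size_t avail = in->src_len - in->src_pos;
    size_t n = room < avail ? room : avail;
    memcpy(in->buf + left, in->src + in->src_pos, n);
    in->src_pos += n;
    in->len = left + n;
    in->cur = 0;
}

/* Input is positioned just after "<!--". Returns the number of bytes
 * in the comment body, or -1 if "-->" never appears. */
static long scan_comment_body(Input *in) {
    long n = 0;
    for (;;) {
        input_grow(in);  /* top up before looking at the next character */
        size_t left = in->len - in->cur;
        if (left == 0)
            return -1;   /* source exhausted without a terminator */
        if (left >= 3 && memcmp(in->buf + in->cur, "-->", 3) == 0) {
            in->cur += 3;
            return n;
        }
        in->cur++;
        n++;
    }
}
```

Because the refill happens before every comparison, a terminator that straddles
two windows is still seen whole, which is the same concern as a multibyte
character split across the end of the buffer.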
Note also that I prefer patches to cut and paste of full routines; a patch
gives me the context of what was changed.
Here is a unified diff, is that the right format?
Yes, except it was truncated (i.e. missing the header defining the file
being patched). However, if I apply it, the regression tests fail
on ./test/HTML/wired.html; it seems they have a comment like
<!--TRADES->
which the HTML parser used to accept without complaining, but your code barfs
on it with:
./test/HTML/wired.html:517: HTML parser error : Comment not terminated
<!--TRADES->
<br>
<font face= "Verdana, Arial, Geneva,
^
And as a result the start elements that follow seem not to be seen as such.
I think your code should be modified to allow any -> to close the comment
or maybe even just '>'. At least this should be looked at, I don't think
I can commit this without further analysis of that case.
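
A minimal sketch of the lenient rule suggested above, in C. This is not the
HTMLparser.c implementation: comment_end and its lenient flag are hypothetical
names used only for illustration. It scans the text that follows "<!--" and
reports where the comment ends, optionally treating "->" as a terminator in
addition to "-->".

```c
#include <assert.h>
#include <stddef.h>

/* s points just past "<!--". Returns the offset just past the comment
 * terminator, or -1 if the comment is never closed. When lenient is
 * non-zero, "->" is accepted as a terminator as well as "-->". */
static long comment_end(const char *s, int lenient) {
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] != '-')
            continue;
        /* s[i + 1] is readable: at worst it is the terminating NUL. */
        if (s[i + 1] == '-' && s[i + 2] == '>')
            return (long)(i + 3);   /* regular "-->" */
        if (lenient && s[i + 1] == '>')
            return (long)(i + 2);   /* tolerated "->" */
    }
    return -1;
}
```

With the wired.html case, comment_end("TRADES->\n<br>", 0) returns -1, which
corresponds to the "Comment not terminated" error, while
comment_end("TRADES->\n<br>", 1) returns 8, so parsing would resume at the
newline before <br>. Accepting a bare '>' would be a further relaxation that
this sketch does not cover.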
Daniel
--
Daniel Veillard | Red Hat http://redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/