[xml] Patch for HTMLparser
- From: James Bursa <bursa users sourceforge net>
- To: xml gnome org
- Subject: [xml] Patch for HTMLparser
- Date: Thu, 20 Nov 2003 21:40:23 +0000
Below is a minor patch for HTMLparser.c:
1. Handle hex character references such as &#X123;, i.e. those written with a capital X.
2. Skip to the end of misplaced <body> start tags. Currently any attributes
of a misplaced <body> are parsed as text and included as a <p> element in
the tree.
James
Index: HTMLparser.c
===================================================================
RCS file: /cvs/gnome/gnome-xml/HTMLparser.c,v
retrieving revision 1.167
diff -d -u -3 -r1.167 HTMLparser.c
--- HTMLparser.c 31 Oct 2003 10:36:02 -0000 1.167
+++ HTMLparser.c 20 Nov 2003 18:11:57 -0000
@@ -2880,7 +2880,7 @@
int val = 0;
if ((CUR == '&') && (NXT(1) == '#') &&
- (NXT(2) == 'x')) {
+ ((NXT(2) == 'x') || NXT(2) == 'X')) {
SKIP(3);
while (CUR != ';') {
if ((CUR >= '0') && (CUR <= '9'))
@@ -3253,6 +3253,8 @@
htmlParseErr(ctxt, XML_HTML_STRUCURE_ERROR,
"htmlParseStartTag: misplaced <body> tag\n",
name, NULL);
+ while ((IS_CHAR_CH(CUR)) && (CUR != '>'))
+ NEXT;
return;
}
}
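
For readers without the libxml2 source at hand, the two changes can be sketched outside the parser. This is a rough illustration only, not the libxml2 code: `parse_hex_char_ref`, `hex_digit_value` and `skip_to_tag_end` are hypothetical names, and they operate on a plain NUL-terminated string rather than on the parser context that the `CUR`, `NXT`, `SKIP` and `NEXT` macros work with. The range check against 0x10FFFF is an added assumption and is not part of the patch.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical sketch of change 1: parse "&#x...;" or "&#X...;" at the
 * start of s. Returns the code point and stores the number of bytes
 * consumed in *consumed, or returns -1 if s does not start with a
 * well-formed hex character reference.
 */
long parse_hex_char_ref(const char *s, size_t *consumed)
{
    const char *p = s;
    long val = 0;

    /* Accept both the lowercase and the capital X after "&#". */
    if (p[0] != '&' || p[1] != '#' || (p[2] != 'x' && p[2] != 'X'))
        return -1;
    p += 3;

    if (*p == ';')              /* "&#x;" has no digits */
        return -1;

    while (*p != ';') {
        int d = hex_digit_value(*p);
        if (d < 0)              /* also catches the terminating NUL */
            return -1;
        val = val * 16 + d;
        if (val > 0x10FFFF)     /* assumption: reject values beyond Unicode */
            return -1;
        p++;
    }
    p++;                        /* step over ';' */

    if (consumed != NULL)
        *consumed = (size_t)(p - s);
    return val;
}

/*
 * Hypothetical sketch of change 2: advance past the attributes of a
 * misplaced start tag. Stops at the closing '>' (without consuming it)
 * or at the end of the input, so the attributes are not treated as text.
 */
const char *skip_to_tag_end(const char *p)
{
    while (*p != '\0' && *p != '>')
        p++;
    return p;
}
```

With these helpers, `parse_hex_char_ref("&#X123;", &n)` and `parse_hex_char_ref("&#x123;", &n)` both yield 0x123, and `skip_to_tag_end` applied to the remainder of a tag such as ` bgcolor="red">text` returns a pointer to the `>`.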