[xml] Possible loss of TEXT node with htmlParseDoc
- From: Frédéric GICQUEL <frederic gicquel easybusiness fr>
- To: xml gnome org
- Subject: [xml] Possible loss of TEXT node with htmlParseDoc
- Date: Thu, 05 Apr 2001 16:46:33 +0200
Hi,
Below is the document dump produced after a call to
htmlParseDoc().
Here is an extract of the source code:
...
htmlDocPtr doc;

doc = htmlParseDoc((xmlChar *)"Hello <a href=\"world\">world</a> !",
                   "HTML");
if (!doc)
    return;
xmlDebugDumpDocument(stderr, doc);
...
HTML DOCUMENT
standalone=true
DTD(HTML), PUBLIC -//W3C//DTD HTML 4.0 Transitional//EN, SYSTEM
http://www.w3.org/TR/REC-html40/loose.dtd
  ELEMENT html
    ELEMENT body
                       <-- where is the TEXT node "Hello" ?
      ELEMENT a
        ATTRIBUTE href
          TEXT
            content=http://world.org
        TEXT
          content=world
      TEXT
        content= !
I see this bug (?) only when the input string contains HTML tags but
starts with plain text. It works if the string begins with an HTML tag:
...
doc = htmlParseDoc((xmlChar *)"<br>Hello <a href=\"world\">world</a> !",
                   "HTML");
...
HTML DOCUMENT
standalone=true
DTD(HTML), PUBLIC -//W3C//DTD HTML 4.0 Transitional//EN, SYSTEM
http://www.w3.org/TR/REC-html40/loose.dtd
  ELEMENT html
    ELEMENT body
      ELEMENT br
      TEXT             <-- Ok
        content=Hello
      ELEMENT a
        ATTRIBUTE href
          TEXT
            content=http://world.org
        TEXT
          content=world
      TEXT
        content= !
I found this problem in libxml 2.2.10, and 2.3.5 does not seem to fix it.
Has anyone else run into this problem?
Thanks in advance for your possible help.
Regards,
--
---------------------------------------
Frédéric Gicquel
rnr easybusiness fr
---------------------------------------