Re: [xml] Recording node info for HTML
- From: Benjamin Mularczyk <benjamin deepcode ai>
- To: xml gnome org
- Subject: Re: [xml] Recording node info for HTML
- Date: Mon, 15 Apr 2019 08:59:06 +0200
Hey,
I'm using libxml2-2.9.8.
When using libxml2 to parse XML, I can use
ctxt->record_info = 1;
xmlInitNodeInfoSeq(&ctxt->node_seq);
xmlParseDocument(ctxt);
to record positions for the parsed nodes.
However, for HTML the following
ctxt->record_info = 1;
xmlInitNodeInfoSeq(&ctxt->node_seq);
htmlParseDocument(ctxt);
leads to a segmentation fault for some (not necessarily well-formed) HTML files. A minimal example is an HTML file containing "<label></label>", which crashes with the following backtrace:
#0 0x0000555555695199 in xmlSAX2EndElement (ctx=0x555555975a20, name=0x55555570141e "body") at external/libxml2/libxml2-2.9.8/SAX2.c:1815
#1 0x000055555561412b in htmlAutoCloseOnEnd (ctxt=0x555555975a20) at external/libxml2/libxml2-2.9.8/HTMLparser.c:1384
#2 0x000055555561cae2 in htmlParseContentInternal (ctxt=0x555555975a20) at external/libxml2/libxml2-2.9.8/HTMLparser.c:4674
#3 0x000055555561d0da in htmlParseDocument (ctxt=0x555555975a20) at external/libxml2/libxml2-2.9.8/HTMLparser.c:4817
#4 0x000055555556f81d in ParseHTML (content="<label></label>\n", nodes=0x7fffffffd7a0, error_message=0x7fffffffd8b0) at parser/xml_parser.cpp:431
#5 0x00005555555711e6 in main (argc=2, argv=0x7fffffffdb08) at parser/xml_parser.cpp:596
Does the HTML parsing API support recording the positions of nodes? If so, what am I doing wrong, or what can be done to prevent the segmentation fault?
Thank you and best regards
Ben