Re: [xml] xmlReadFile Fails Where xmlParseFile Succeeds



On Mon, Jul 27, 2015 at 02:38:23PM -0400, Paul Braman wrote:
The following bit of code fails

xmlInitParser();
xmlDocPtr maindoc = xmlReadFile("maindoc.xml", NULL, 0);
xmlDocPtr subdoc = xmlReadFile("subdoc.xml", NULL, 0);
xmlNodePtr content = xmlDocGetRootElement(subdoc);
xmlUnlinkNode(content);
xmlAddChild(xmlDocGetRootElement(maindoc), content);
xmlFreeDoc(subdoc);
xmlFreeDoc(maindoc);
xmlCleanupParser();


with a crash upon calling xmlFreeDoc(maindoc) (problem freeing memory)
where this code succeeds just fine and dandy

xmlInitParser();
xmlDocPtr maindoc = xmlParseFile("maindoc.xml");
xmlDocPtr subdoc = xmlParseFile("subdoc.xml");
xmlNodePtr content = xmlDocGetRootElement(subdoc);
xmlUnlinkNode(content);
xmlAddChild(xmlDocGetRootElement(maindoc), content);
xmlFreeDoc(subdoc);
xmlFreeDoc(maindoc);
xmlCleanupParser();


I understand I should use xmlReadFile instead of xmlParseFile. However, I
can't figure out what's different between the two that could be causing the
crash in the first block of code.

Note, I've tried with multiple versions, even 2.9.2.

Alternatively, how *should* I be structuring the code to do what I'm doing
here? (I know a call to

xmlAddChild(xmlDocGetRootElement(maindoc),
xmlCopyNode(xmlDocGetRootElement(subdoc), 1));


in place of the get/unlink/add sequence works but I'd like to understand
why the code above fails.)

  The problem is dictionaries. The Read* functions speed up processing
by using a dictionary for all the strings in markup etc. The dictionary is
attached to the document. When you move content, that part of subdoc still
holds pointers into subdoc's dictionary, yet it ends up grafted onto maindoc,
which uses a different dictionary. By the time the tree is freed, the
reference to the original document's dictionary is lost; libxml2 then tries
to free the strings that came from subdoc's dictionary as if they were
ordinary allocations, and that fails.
  There are 2 ways around that:

  Option 1: disable dictionaries when parsing if you do that kind of
     cut-and-paste between documents (the XML_PARSE_NODICT option, see
     parser.h).

  Option 2: arrange for all the files to be parsed with the same
     dictionary, shared by all documents. This is more complex; libxslt
     does that, for example. Unfortunately I don't have a simple document
     explaining it.

Daniel


-- 
Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/

