[xml] Patch suggestion for "fixing" 10 MB limit when using xmlNewTextWriterDoc



Hi,
I am one of those who have been bitten by the 10 MB limit when building an XML document with an xmlTextWriter created by xmlNewTextWriterDoc.

To work around the problem, I have forked libxml2 and added two functions to xmlwriter.c that enable the user to get and set the parser options used by the underlying parser. I did it this way because it seemed to me the least destructive way of passing XML_PARSE_HUGE to the underlying parser context.

The functions I have added are xmlTextWriterSetParserOptions and xmlTextWriterGetParserOptions (the latter to retrieve options set elsewhere, if necessary). The code is available on GitHub at https://github.com/hvatum/libxml2, and I would be very happy to create a pull request if you think this is a viable solution.

I struggled a bit with the double free that occurs when dictNames == 1, as noted in the comment at xmlwriter.c:447. This is currently handled by resetting dictNames after setting the options, but I believe the real problem is that doc->dict and ctxt->dict point to the same structure when dictNames == 1, so when you later free the writer and the doc, the dict is freed twice. See SAX2.c:1029.
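
To illustrate the ownership problem I suspect, here is a toy model in plain C. It does not use libxml2 at all: toy_dict and every toy_* function are made-up names for illustration only. It shows how two owners of one structure can each release it without a double free, provided the second owner registers a reference first. libxml2 has reference counting for real dictionaries (xmlDictReference / xmlDictFree), but whether that is the right fix here is an assumption I have not verified.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a shared dictionary; NOT the libxml2 xmlDict. */
typedef struct {
    int refs;                     /* number of current owners */
} toy_dict;

static int toy_dict_frees = 0;    /* counts real deallocations, for checking */

/* Create a dict owned by exactly one holder. Returns NULL on failure. */
toy_dict *toy_dict_new(void) {
    toy_dict *d = malloc(sizeof *d);
    if (d != NULL)
        d->refs = 1;
    return d;
}

/* A second owner takes its own reference before storing the pointer. */
void toy_dict_ref(toy_dict *d) {
    assert(d != NULL && d->refs > 0);
    d->refs++;
}

/* Each owner releases once; memory is freed only when the last owner
 * lets go. Returns 1 if the dict was actually freed, 0 otherwise. */
int toy_dict_unref(toy_dict *d) {
    assert(d != NULL && d->refs > 0);
    if (--d->refs > 0)
        return 0;
    free(d);
    toy_dict_frees++;
    return 1;
}

/* Simulates a doc and a parser context sharing one dict and then both
 * being torn down. Returns the number of deallocations, or -1 if the
 * allocation failed. */
int toy_share_and_release(void) {
    toy_dict *ctxt_dict;
    toy_dict *doc_dict;

    toy_dict_frees = 0;
    ctxt_dict = toy_dict_new();
    if (ctxt_dict == NULL)
        return -1;

    doc_dict = ctxt_dict;         /* same structure, as with dictNames == 1 */
    toy_dict_ref(doc_dict);       /* the doc registers as a second owner */

    toy_dict_unref(ctxt_dict);    /* tearing down the writer/context side */
    toy_dict_unref(doc_dict);     /* tearing down the doc side */
    return toy_dict_frees;
}
```

Without the toy_dict_ref call, the second release would operate on memory that is already freed, which is the same shape as the double free described above.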



/*
 * Example code for creating an xmlTextWriter capable of producing documents larger than 10 MB.
 *
 * It might be unnecessary to get the options first, as they seem to be 0 by default.
 *
 */

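    xmlDocPtr doc = NULL;
    /* Second argument is the compression flag; 0 means no compression. */
    xmlTextWriterPtr writer = xmlNewTextWriterDoc(&doc, 0);
    if (writer == NULL) {
        puts("Error while creating the XML writer");
    } else {
        /* Both calls below come from the proposed patch, not upstream libxml2. */
        int options = xmlTextWriterGetParserOptions(writer);
        /* Keep any options already set and add XML_PARSE_HUGE to lift the limit. */
        xmlTextWriterSetParserOptions(writer, XML_PARSE_HUGE | options);
    }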



Best regards,
Stian Hvatum

