Re: [xml] Runtime parser limit for maximum size of text nodes



On 21/06/2017 23:41, Daniel Veillard wrote:
> I see only one person asking for this. Like any change to the structure (even adding at the end) this has potential risks. IMHO not worth the risk. If the person has such a specific need he can simply recompile libxml2 with a different value of the constant for that piece of code.

I wouldn't base the decision on the number of complaints. If users knew what XML_PARSE_HUGE really means, there would be many more complaints. The main problem is that enabling XML_PARSE_HUGE makes users vulnerable to the billion laughs attack [1]. But if you want to support text nodes larger than 10 MB, there's no way around enabling it. And no, recompiling libxml2 with a different constant isn't an option for most users or downstream projects.
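To put a rough number on the billion laughs problem: every entity level multiplies the expanded size by its fan-out, so the result grows exponentially with the nesting level. Here is a minimal sketch of that arithmetic. The function name and parameters are made up for illustration, and none of this is libxml2 code.

```c
#include <stdint.h>

/* Hypothetical helper, not part of libxml2: size in bytes of a fully
 * expanded entity when each of `levels` entity definitions references
 * the previous one `fanout` times and the innermost replacement text
 * is `base_len` bytes long. Saturates at UINT64_MAX on overflow. */
uint64_t expanded_size(uint64_t base_len, uint64_t fanout, unsigned levels)
{
    uint64_t size = base_len;

    for (unsigned i = 0; i < levels; i++) {
        /* Check before multiplying so the product cannot wrap around. */
        if (fanout != 0 && size > UINT64_MAX / fanout)
            return UINT64_MAX;
        size *= fanout;
    }
    return size;
}
```

With the values from the classic example (a 3-byte "lol", ten references per level, nine levels), expanded_size(3, 10, 9) comes out at 3,000,000,000 bytes, produced by a document of less than a kilobyte.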

So downstream projects end up enabling the flag blindly: WebKit [2], Blink [3], PHP's SOAP server [4], Inkscape [5], and many others [6]. None of these projects seem to be aware of the consequences.

The other main reason why users enable XML_PARSE_HUGE is the xmlParserMaxDepth limit. This limit can be relaxed if we make xmlParseElement/xmlParseContent non-recursive. IIUC, the push parser already uses a non-recursive approach and the infrastructure to store the list of current nodes on the heap is already there. So this shouldn't be too hard.
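To illustrate what "non-recursive" means here, below is a toy sketch that tracks open elements on a growable heap-allocated stack instead of the C call stack. It is not based on xmlParseElement or xmlParseContent. The grammar (a lowercase letter opens an element, the matching uppercase letter closes it) and the function name are assumptions chosen to keep the example short.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy grammar, not XML: 'a' opens element a, 'A' closes it.
 * Returns the maximum nesting depth, -1 if the input is not well
 * formed, or -2 if memory allocation fails. */
long toy_max_depth(const char *input)
{
    size_t cap = 16, depth = 0, max = 0;
    char *stack = malloc(cap);          /* open elements live on the heap */

    if (stack == NULL)
        return -2;

    for (const char *p = input; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;

        if (islower(c)) {               /* start tag: push */
            if (depth == cap) {
                char *grown = realloc(stack, cap * 2);
                if (grown == NULL) {
                    free(stack);
                    return -2;
                }
                stack = grown;
                cap *= 2;
            }
            stack[depth++] = (char)c;
            if (depth > max)
                max = depth;
        } else if (isupper(c)) {        /* end tag: must match the top, then pop */
            if (depth == 0 || stack[depth - 1] != (char)tolower(c)) {
                free(stack);
                return -1;
            }
            depth--;
        } else {                        /* anything else is invalid in this toy */
            free(stack);
            return -1;
        }
    }

    free(stack);
    return depth == 0 ? (long)max : -1; /* leftover open elements are an error */
}
```

Since the nesting depth only affects the heap allocation, deeply nested input costs memory proportional to the depth but cannot overflow the call stack. That is the property which would allow a depth limit to be relaxed.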

Once these changes are made, we can advise users to disable XML_PARSE_HUGE.

I also don't see the risk in appending items to struct xmlParserCtxt. The struct is supposed to only be allocated within libxml2, so the struct size should never be compiled into client code. The last time the struct was changed was in 2013 for the 2.9.1 release [7]. Did this cause any problems? Why was the change deemed safe back then?
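A small sketch of the layout argument, using two heavily reduced stand-in structs rather than the real xmlParserCtxt. The struct names and the appended field maxTextLength are hypothetical. On common ABIs, appending a member leaves the offsets of all earlier members untouched and only changes the total size.

```c
#include <stddef.h>

/* Hypothetical stand-in for the struct before the change. */
struct ctxt_old {
    int   options;
    void *userData;
    int   depth;
};

/* Same struct with one field appended at the end (hypothetical name). */
struct ctxt_new {
    int           options;
    void         *userData;
    int           depth;
    unsigned long maxTextLength;
};

/* Returns 1 if every pre-existing field sits at the same offset. */
int old_fields_unmoved(void)
{
    return offsetof(struct ctxt_old, options)  == offsetof(struct ctxt_new, options)
        && offsetof(struct ctxt_old, userData) == offsetof(struct ctxt_new, userData)
        && offsetof(struct ctxt_old, depth)    == offsetof(struct ctxt_new, depth);
}

/* Returns 1 if appending the field changed the struct size. */
int size_changed(void)
{
    return sizeof(struct ctxt_new) != sizeof(struct ctxt_old);
}
```

Client code that only receives a pointer from the library and reads the existing fields keeps working. Only code that embeds the struct, copies it by value, or allocates it via sizeof would notice the difference.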

There's of course another solution: simply disable the max text length limit. (Or rather, set it to around 2 GB to avoid integer overflows.) I don't see much value in limiting the size of text nodes. It makes more sense to limit the total size of an XML document, which users can easily do themselves.
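Limiting the total document size on the application side can be as simple as a length check before the buffer is handed to the parser. A minimal sketch, with hypothetical names. The INT_MAX check reflects that memory-based entry points such as xmlReadMemory take the buffer size as an int.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical status codes for the pre-parse check. */
enum doc_size_status {
    DOC_SIZE_OK             =  0,
    DOC_SIZE_OVER_APP_LIMIT = -1,   /* larger than the application allows */
    DOC_SIZE_OVER_INT_MAX   = -2    /* length would not fit into an int */
};

/* Decide whether a document of `doc_len` bytes should be parsed at all.
 * `app_limit` is whatever total size the application is willing to accept. */
enum doc_size_status check_doc_size(size_t doc_len, size_t app_limit)
{
    if (doc_len > (size_t)INT_MAX)
        return DOC_SIZE_OVER_INT_MAX;
    if (doc_len > app_limit)
        return DOC_SIZE_OVER_APP_LIMIT;
    return DOC_SIZE_OK;
}
```

An application would call this with the size of the input it has just read and pass the buffer on to the parser only when the result is DOC_SIZE_OK.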

Nick


[1] https://en.wikipedia.org/wiki/Billion_laughs
[2] http://git.webkit.org/?p=WebKit.git;a=commitdiff;h=c74717d5a176f74b0c77ff8266b272903ad7297d
[3] https://chromium.googlesource.com/chromium/blink/+/a939e6184a192e91b0088052269554c8866dacad
[4] https://github.com/php/php-src/commit/40c60b8212b8ab18fd5bf9a426f99e42d9908f8e
[5] https://gitlab.com/inkscape/inkscape/commit/112f963fb12a941762c828dfd1690a61771516af
[6] https://encrypted.google.com/search?num=100&q="commit"+"xml_parse_huge"
[7] https://git.gnome.org/browse/libxml2/commit/?id=23f05e0

