Re: [PATCH] Speedup parsing of svn status on huge repositories

From: Kai Willadsen <kai willadsen gmail com>
To: Vasily Galkin <galkin-vv yandex ru>
Cc: meld-list <meld-list gnome org>
Subject: Re: [PATCH] Speedup parsing of svn status on huge repositories
Date: Sun, 14 Feb 2016 07:48:57 +1000

On 6 February 2016 at 06:24, Vasily Galkin <galkin-vv yandex ru> wrote:

Thanks for the review!

I'm still interested in getting the patch in, but I'm not quick:
it takes some time (and is fun practice for my Python skills) to get the
code fast AND readable AND more selective about which tags it pulls.

Unfortunately the patch is now twice as long, but otherwise at least one
of the three goals above would not have been reached.

I fixed the issues you raised and also reformatted the code to the 80-col limit.
Filed this as https://bugzilla.gnome.org/show_bug.cgi?id=761608


So I've read over that patch, and now I'm questioning whether the
expat parser is the right way to go here. It's an older API and is
much less widely used. For instance, I can find half a dozen guides on
using ElementTree's API in both streaming and non-streaming modes, and
very little good about how to best use expat.

The existing code is just lazily using the `ElementTree.parse()` API.
I feel like looking into using `ElementTree.iterparse()` instead would
probably give us something much easier to handle than expat will.
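
For illustration, here is a minimal sketch of what a streaming parse with
`iterparse()` might look like. It is not the code from the patch or from
Meld: the `iter_status` helper and the `SAMPLE` data are hypothetical, and
the element names (`entry`, `wc-status`) are assumed from the usual shape of
`svn status --xml` output.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like `svn status --xml` output.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<status>
<target path=".">
<entry path="a.txt"><wc-status item="modified" props="none" revision="5"/></entry>
<entry path="b.txt"><wc-status item="unversioned" props="none"/></entry>
</target>
</status>
"""


def iter_status(source):
    """Yield (path, item) pairs, one per <entry>, while parsing."""
    # "end" events fire once an element and all of its children are parsed.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "entry":
            continue
        wc_status = elem.find("wc-status")
        item = wc_status.get("item") if wc_status is not None else None
        yield elem.get("path"), item
        # Drop the entry's children and attributes so they can be freed;
        # the emptied element itself stays attached to its parent.
        elem.clear()


result = list(iter_status(io.BytesIO(SAMPLE)))
```

The point of the sketch is that each entry is handled and then cleared as
soon as it is complete, so the full tree never has to hold every entry's
details at once.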

From your original email, we got a significant improvement in both
speed and memory usage just from using cElementTree instead of
ElementTree. I feel like we should start with that change, then see
whether iterparse will work for us (which I feel like it should). This
isn't about problems with your patch; I'm just concerned here about
the long-term maintainability of the two APIs.
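
As a rough sketch of that first step, the common import pattern is to prefer
the C implementation and fall back to the pure-Python one. This is an
assumption about how the change could be made, not the actual Meld code; the
tiny XML string is a made-up placeholder.

```python
# Prefer the C-accelerated module where it exists as a separate import.
# On Python 3.3+ plain ElementTree already uses the C accelerator when
# available, and cElementTree was removed in Python 3.9, so the fallback
# branch is the one taken on current interpreters.
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

# The rest of the code uses the same API whichever module was imported.
root = ET.fromstring('<status><target path="."/></status>')
target_path = root.find("target").get("path")
```

Because both modules expose the same API, `iterparse()` is available
through `ET` either way.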

cheers,
Kai

References:
- Re:[PATCH] Speedup parsing of svn status on huge repositories
  - From: Vasily Galkin
