Re: [feature request] tree iteration order option



On 30 November 2014 at 08:22, Karl-Philipp Richter
<richter richtercloud de> wrote:
Hi,
In the meantime (after a long disappointing search for better
interactive merge tools - `meld` really is the best :) ) I found out
further implications of the search order and suggest a review this time
in conjunction with optional lazy tree expansion:

`meld` (3.12.0) wastes huge amounts of memory when used as a directory
merge tool because it reads all files in a directory, even one which one
might just want to copy to the other side completely without inspecting
its contents. Imagine an invocation of `meld a b` where `a` and `b` are
directories with `1` (2 files, 3KB), `2` (5M files, 700 GB) and `3` (3
files, 4KB) in `a` and `1` (5 files, 2KB) and `3` (3 files, 3KB) in `b`.
What happens? `meld` crashes after allocating > 10 GB RAM because it
reads `2` recursively before allowing comparison of the two versions
of `3`.

The recursiveness isn't a problem here, however. Each path pair (i.e.,
a/1/whatever vs b/1/whatever) is compared as we go. My guess is that
Meld is either crashing because the directory as a whole is too big
(because we read the whole directory entry into memory and insert the
resulting entries into the tree store) or because a single file in
that directory is too big (because we read whole files into memory, if
shallow comparison isn't enabled).
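
To illustrate the second possibility, here is a minimal sketch of comparing
two files in fixed-size chunks so that memory use stays bounded regardless of
file size. This is not Meld's actual comparison code; the function name
`files_equal_chunked` and the `CHUNK_SIZE` value are assumptions made up for
the example.

```python
import os

CHUNK_SIZE = 64 * 1024  # hypothetical buffer size, not a Meld setting


def files_equal_chunked(path_a, path_b, chunk_size=CHUNK_SIZE):
    """Return True if both files have identical contents."""
    # Different sizes mean different contents; nothing needs to be read.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files reached end-of-file together
                return True
```

At most two chunks are held in memory at a time, so a single very large file
would no longer dominate the process's memory footprint.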

I think it's a pity that the application (again, it's really
great :) ) doesn't cover this use case; supporting it would be a great
improvement.

I agree that we should handle this better. Lazy tree expansion would
indeed help in this particular case, but I don't know whether it's
worth the effort. A partial fix would also be to use less memory
storing the trees, as far as we can.

GIO has some enumeration helpers for listing directories, which may
help reduce the interim memory. Of course, approximately none of our
folder comparison actually uses GIO at the moment, but porting to that
is definitely something that would be nice to do. There's also the
question of how big the actual GtkTreeStore is; that may well be the
bottleneck.
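
As a rough illustration of what enumeration buys us, the sketch below walks a
directory one entry at a time instead of building the full listing first. It
uses `os.scandir` from the Python standard library as a stand-in for the GIO
enumerator (in GIO the corresponding call would be
`Gio.File.enumerate_children`); the helper name `iter_entries` is
hypothetical and this is not code from Meld.

```python
import os


def iter_entries(path):
    """Yield (name, is_dir) for each entry in path, one at a time."""
    # os.scandir returns an iterator, so entries are produced lazily
    # rather than collected into one large list up front.
    with os.scandir(path) as it:
        for entry in it:
            yield entry.name, entry.is_dir(follow_symlinks=False)
```

A caller can consume entries as they arrive, which keeps the interim memory
small. It does nothing about the size of the tree store itself.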

Allowing comparison of `3` before `2` is read would be great already,

We can't sanely do that. Lazy expansion, as you suggest, is the only
sane solution here, and I'm not sure that it's good default behaviour.

even better would be optional(!) lazy expansion of the tree, e.g. of the
very large `2` causing enormous swapping.

I don't think this is something we can make optional; we can't expect
the user to know ahead of time that they need it. The only way I think
we could add this is if we detected enormous subdirectories and
automatically collapsed them for later lazy expansion.
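
A minimal sketch of such a detection step, assuming a simple entry-count
cut-off, might look like the following. The name `is_enormous` and the
`LARGE_DIR_THRESHOLD` value are invented for illustration, and the check only
counts direct children, so a deep tree of small directories would not be
caught by it.

```python
import os

LARGE_DIR_THRESHOLD = 10000  # hypothetical cut-off, chosen for illustration


def is_enormous(path, threshold=LARGE_DIR_THRESHOLD):
    """Return True if path directly contains more than threshold entries."""
    count = 0
    with os.scandir(path) as it:
        for _ in it:
            count += 1
            if count > threshold:
                return True  # stop early; the exact count is not needed
    return False
```

Because the loop stops as soon as the threshold is exceeded, the check costs
at most `threshold + 1` directory reads, even for a directory with millions
of entries.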

I understood the point that the tree iteration order is difficult to
change.

The problem isn't so much that the iteration order is difficult to
change (though it is). The problem is the extra impact on the rest of
the (already slightly fragile) code and the comparison UI. I
understand the problem you're having, and it would be cool to fix.
However, if people are interested in fixing directory comparison, I'd
suggest starting on more fundamental tasks like GIO + async
conversion, and using GtkTreeModelFilter properly (which I think will
involve a complete tree model rework).

cheers,
Kai

