Re: Caching merged diffs



> In Differ, merged diffs are repeatedly calculated whenever we iterate
> over changes. The merged versions of diffs can theoretically differ,
> as the underlying text is passed in on each call. However, this text
> always ends up being the raw file text, and we always update our diffs
> whenever this text changes. As such, we can cache the merged version
> of our diffs by disallowing the effectively unused texts argument.
>
> I'm attaching two patches, the first of which implements this caching
> of merged diffs, and the second of which provides a short-circuit for
> two-way comparisons. On two-way diffs, these patches provide a very
> slight speed improvement; on three-way diffs, the speed-up is easily
> measurable -- around 5% in a few quick tests based on scrolling
> through a three-way source file comparison.
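
A minimal sketch of the caching described above, not the actual patch. The
class name, method names and the placeholder merge step are all hypothetical;
the only point is that the merged result is computed once and dropped whenever
the diffs are updated.

```python
class CachingDiffer:
    """Hypothetical stand-in for Differ; all names here are illustrative."""

    def __init__(self):
        self._diffs = ([], [])   # raw pairwise diffs, e.g. pane0/pane1 and pane1/pane2
        self._merged = None      # cached merged diffs; None means "stale"
        self.merge_calls = 0     # instrumentation for this example only

    def set_diffs(self, left, right):
        # Called whenever the underlying text changes, so the cache is dropped here.
        self._diffs = (list(left), list(right))
        self._merged = None

    def _merge(self):
        # Placeholder for the real (expensive) merge of the two diff lists.
        self.merge_calls += 1
        return sorted(self._diffs[0] + self._diffs[1])

    def all_changes(self):
        # No texts argument: the merged result depends only on the stored diffs.
        if self._merged is None:
            self._merged = self._merge()
        return iter(self._merged)
```

Iterating over all_changes() repeatedly (as happens while scrolling) then only
pays for the merge once per text change.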

That's a good idea. I've semi-deliberately not changed any of the diffing
code in ages because I've always intended to rewrite it with an asynchronous
interface. I think this is the only way to get acceptable performance on
large files or large change blocks.

With an asynchronous interface, it's then reasonable to put the diff
computation and most of the I/O in a subprocess, which would make the UI
much more responsive.
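
As a rough sketch of that idea (an assumption about how it could look, not
existing code): the diff runs in a child Python process using the standard
difflib module, and the caller gets a handle back immediately. The function
names start_diff and finish_diff are hypothetical.

```python
import json
import subprocess
import sys

# Script run in the child process: reads two lists of lines as JSON on stdin
# and writes the non-equal difflib opcodes as JSON on stdout.
_WORKER = """
import difflib, json, sys
a, b = json.load(sys.stdin)
ops = difflib.SequenceMatcher(None, a, b).get_opcodes()
json.dump([op for op in ops if op[0] != "equal"], sys.stdout)
"""


def start_diff(a, b):
    """Start diffing line lists a and b in a child process; do not wait for it."""
    proc = subprocess.Popen(
        [sys.executable, "-c", _WORKER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # The child reads all of stdin before writing anything, so this cannot
    # deadlock, though for very large inputs the write itself may take a while.
    proc.stdin.write(json.dumps([a, b]))
    proc.stdin.close()
    return proc


def finish_diff(proc):
    """Collect the result; blocks only if the child has not finished yet."""
    out = proc.stdout.read()
    proc.wait()
    return [tuple(op) for op in json.loads(out)]
```

A UI would call start_diff(), keep handling events, and check proc.poll()
from an idle or timeout handler before calling finish_diff().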

Stephen.

