Re: Folder Comparison with Percentage Similarity?



On 29 September 2017 at 01:59, Alan Halls <alanjhalls gmail com> wrote:
> Thanks Jag! I will certainly look into Levenshtein!
>
> I found this tool here (https://www.safe-corp.com/products_codematch.htm)
> but it costs up to $400/MB
> (https://www.safe-corp.com/documents/CodeSuite%20pricing.pdf) and seemed
> like something Meld would be perfect for with minimal effort, and it seemed
> like Meld could attract a whole new group of power users, and maybe even
> some with some funding behind them to improve Meld.
>
> I have a part-time .NET programmer coming by this afternoon whom I may
> have look at extracting those stats, but I'm not sure how realistic that
> is as an afternoon project for someone not familiar with the code base.

I would recommend against this. I realise that it might *look* like
Meld has this sort of stats available or at least close to hand, but
it doesn't. There are all sorts of complications here around things
like early exit logic for files that are found to differ, default
filters, handling whitespace differences... it's just not designed
for this job.

As someone else suggested, I would personally start by stringing
together unix tools to get a rough count of identical files and line
counts, and then, if you want to, use diff/Meld/something and have a
human look at important files that differ to see how significant the
differences are.
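
For what it's worth, the same rough count could also be sketched in
Python instead of a shell pipeline. This is only an illustration under
some assumptions: the function names and the returned keys are made up
for this example, "identical" means byte-for-byte identical, and the
ratio of identical files to files present on both sides is just one
crude way to put a number on similarity. It has nothing to do with
Meld's internals.

```python
import filecmp
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def count_lines(path):
    """Count newline bytes in a file, roughly what `wc -l` reports."""
    total = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            total += chunk.count(b"\n")
    return total


def compare_trees(left, right):
    """Summarise two directory trees (hypothetical helper, not a Meld API)."""
    left_files = relative_files(left)
    right_files = relative_files(right)
    common = sorted(left_files & right_files)

    identical = []
    differing = []
    for rel in common:
        # shallow=False forces a byte-by-byte comparison of the contents.
        same = filecmp.cmp(os.path.join(left, rel),
                           os.path.join(right, rel),
                           shallow=False)
        (identical if same else differing).append(rel)

    return {
        "common": len(common),
        "identical": len(identical),
        "differing": differing,  # candidates for a human to look at
        "only_left": len(left_files - right_files),
        "only_right": len(right_files - left_files),
        "identical_lines": sum(count_lines(os.path.join(left, rel))
                               for rel in identical),
        # Assumed, crude definition of "similarity": share of common files
        # that are byte-identical.
        "identical_share_of_common": (len(identical) / len(common)
                                      if common else 0.0),
    }
```

Calling compare_trees("project_a", "project_b") on two checkouts (the
directory names are placeholders) gives the counts plus a list of
differing files, and those are the ones worth opening in diff or Meld.
Note that this does no whitespace or filter handling at all, so files
that differ only in line endings will show up as differing.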

Also, I should provide the usual disclaimers that Meld isn't designed
for this kind of work, and... well... you can read the liability
disclaimers in the GPL.

cheers,
Kai

