Re: Idea: How about logical lines?



On 11/05/2016 05:42 AM, andré wrote:
Per Dalgas Jakobsen wrote:
Now I'm looking for a way to let Meld ignore formatting of the code,
...
Is it a good idea to let Meld (and other diffs) find changes based on
logical lines as an alternative to physical lines? Like a
switch/preference to use logical lines instead (with options for
line-continuation and line-end).
Have I overlooked some important considerations here?

Interesting idea, but wouldn't Meld have to understand the structure of the language being compared?

It seems to me that it would be difficult to code such comparisons, but I could be mistaken. Breaking out the logical lines would be one step, but for the comparison to be useful, Meld would also have to map each logical line back to its position in the original (physical-line) file.

If one is comparing different changes to the same file, I wouldn't expect that a logical line comparison would produce vastly different results than a physical line comparison.

Well, Meld does not have to understand the code structure as such, only how a logical line is defined (for the language in question). Since Meld can already ignore comments by regex filtering, I believe the same could be done for logical lines.

First, if one is interested in comparing source code functionality, one is almost certainly not interested in changes in whitespace or comments, so comments could be filtered out first to simplify what follows. A regex could define a "line-continuation", e.g. "\\ *$" (backslash at the end of a line, optionally followed by spaces). A logical line ending could be defined as "; *$" (semicolon at the end of a line, optionally followed by spaces).

Then, in the internal search, a line-continuation is treated as plain whitespace, and CR/LF is treated as whitespace unless the line matches the logical line ending.

I think these two regexes will allow most languages to be diffed on logical lines (ignoring code formatting).
1) Languages whose logical lines end at CR/LF, but which can break lines with a continuation character, will use: Line_continuation="\\ *$" and Logical_EOL="$"
2) Languages which have no line-continuation concept (any whitespace can break a line) will use: Line_continuation="" and Logical_EOL="; *$"
3) Languages with both can use both :)
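
To make the idea concrete, here is a rough Python sketch of how physical lines could be joined into logical lines from the two regexes above. This is not Meld code: the function name logical_lines and its parameters are made up for illustration, comments are assumed to have been filtered out already, and all runs of whitespace are collapsed (which would also alter whitespace inside string literals).

```python
import re


def logical_lines(text, line_continuation=r"\\ *$", logical_eol=r"$"):
    """Join physical lines into logical lines (hypothetical helper).

    line_continuation: regex marking a physical line that continues on the
                       next one; an empty string disables it.
    logical_eol:       regex marking a physical line that ends a logical line.
    """
    cont = re.compile(line_continuation) if line_continuation else None
    eol = re.compile(logical_eol)

    result = []
    parts = []
    for physical in text.splitlines():
        if cont is not None:
            match = cont.search(physical)
            if match:
                # The continuation marker counts as plain whitespace:
                # drop it and keep collecting.
                parts.append(physical[:match.start()])
                continue
        parts.append(physical)
        if eol.search(physical):
            # The physical line break ends the logical line here.
            joined = " ".join(" ".join(parts).split())
            if joined:
                result.append(joined)
            parts = []
    # Flush whatever is left if the text ends mid-statement.
    joined = " ".join(" ".join(parts).split())
    if joined:
        result.append(joined)
    return result


# Case 1: continuation character, logical line ends at the line break.
wrapped = "x = foo(a, \\\n        b)\ny = 1\n"
unwrapped = "x = foo(a, b)\ny = 1\n"

# Case 2: no continuation concept, logical line ends at a trailing semicolon.
split_stmt = "int x =\n    1;\nint y = 2;  \n"
semicolon_lines = logical_lines(split_stmt, line_continuation="",
                                logical_eol=r"; *$")
```

With these inputs, logical_lines(wrapped) and logical_lines(unwrapped) give the same list, so a diff over logical lines would see no change between the two formattings.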

I know that having logical lines end with a non-CR/LF character allows for multiple logical lines in one physical line. If Meld's internal diffing is based on physical lines (GUI and all), which I suspect, it won't be able to ignore formatting where multiple logical lines are combined into one physical line or split across several, but I don't think that would be a big issue in most cases.
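
A small Python illustration of that limitation, again only a sketch: join_on_logical_eol is a made-up helper that applies just the Logical_EOL regex, and difflib stands in for whatever diff engine is actually used. Two statements on one physical line stay a single logical line, so they still differ from the same two statements on separate lines.

```python
import difflib
import re

# Logical line ending as proposed above: semicolon at the end of a line.
LOGICAL_EOL = re.compile(r"; *$")


def join_on_logical_eol(text):
    """Join physical lines until one matches LOGICAL_EOL (hypothetical helper)."""
    logical, parts = [], []
    for physical in text.splitlines():
        parts.append(physical.strip())
        if LOGICAL_EOL.search(physical):
            logical.append(" ".join(p for p in parts if p))
            parts = []
    if any(parts):
        # Text ended without a logical line ending; keep the remainder.
        logical.append(" ".join(p for p in parts if p))
    return logical


one_physical = "a = 1; b = 2;\n"    # two statements on one physical line
two_physical = "a = 1;\nb = 2;\n"   # same statements, one per physical line

left = join_on_logical_eol(one_physical)
right = join_on_logical_eol(two_physical)

# The first semicolon is not at the end of a physical line, so it does not
# split the line; the two sides therefore still show up as a change.
diff = list(difflib.unified_diff(left, right, lineterm=""))
```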

~Per


