PrestoSoft Blog :: Home

Sunday, November 27, 2022

ExamDiff Pro 14.0: Matching File Structures for Comparison

ExamDiff Pro 14.0 will introduce a new advanced comparison option: the ability to match file structures (functions, classes, etc.) for comparison. This is a fairly advanced feature that can produce clearer comparison results for programmers.

Like the scope bar introduced in ExamDiff Pro 13.0, structure matching is powered by the open-source Tree-sitter library, and will be available for all major programming language doctypes defined in ExamDiff Pro, including C++, C#, Java, JavaScript, and more.

Let's illustrate this feature by showing a couple of examples of situations in which it can be useful.

To start with, here's ExamDiff Pro comparing the before-and-after of a simple refactor of a JavaScript function:

This diff looks ugly and doesn't do a good job of capturing what happened: some functionality from the generateIDForObject() function got factored out into a separate function. The problem is that the closing brace of the generateIDForObject() function in the first file is getting matched to the wrong closing brace in the second file. Because there are not many completely identical lines between the two files, ExamDiff Pro's comparison algorithm doesn't have a good way to decide which closing brace in the second file to match to the one in the first file, and it ends up making a match that results in a confusing diff.
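To make the closing-brace problem concrete, here is a small Python sketch. The file contents are hypothetical (the real ones are not reproduced here), and Python's difflib stands in for a generic line matcher; it is not ExamDiff Pro's comparison algorithm. The point is only that a matcher working on identical lines has no idea which function a brace closes.

```python
from difflib import SequenceMatcher

# Hypothetical "before" file: one function that does everything.
left = [
    "function generateIDForObject(obj) {",
    "    const parts = [obj.type, obj.name];",
    '    return parts.join("-");',
    "}",
]

# Hypothetical "after" file: the body was factored out into buildID().
right = [
    "function generateIDForObject(obj) {",
    "    return buildID(obj);",
    "}",                                       # index 2: closes generateIDForObject
    "",
    "function buildID(obj) {",
    "    const parts = [obj.type, obj.name];",
    '    return parts.join("-");',
    "}",                                       # index 7: closes buildID
]

def matched_pairs(a, b):
    """Return (left_index, right_index) pairs for lines the matcher treats as equal."""
    pairs = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            pairs.append((block.a + offset, block.b + offset))
    return pairs

pairs = dict(matched_pairs(left, right))
# Which line in the second file did the left file's closing brace land on?
print("left line 3 is matched to right line", pairs[3])
```

In this sketch the left file's closing brace travels along with the moved body and ends up paired with the brace that closes buildID(), not the one that closes generateIDForObject().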

Ideally, we would like to match the closing braces of the generateIDForObject() function in each file together. We can do this ourselves by using the manual synchronization feature to link these lines. Now the diff looks much clearer:

But drawing links between lines ourselves kind of defeats the purpose of using a file comparison tool. Is there any way that ExamDiff Pro could automatically determine that these lines naturally should be linked together, based on the fact that they correspond to the boundaries of the same function?

This is where the Match file structures for comparison feature comes in. We can open up Options | Text Comparison | Advanced and check the Match file structures for comparison box:

Then re-compare these files, and voila! ExamDiff Pro is able to give us exactly what we want, automatically:

So how does this work? Basically, ExamDiff Pro does the same file structure parsing that it does for generating the scope bar, and uses its knowledge of which functions, classes, and other structures in one file correspond to which in the other to draw "invisible" links between the corresponding starting and ending lines of each structure, in a similar way to how fuzzy line matching works. It's as though we went through the files and drew links at the start and end of each matching function, except ExamDiff Pro was nice enough to do all that work for us instead.
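As a rough mental model of those "invisible" links, here is a minimal Python sketch. It assumes a parser has already produced a list of structures with a name and start/end line numbers for each file; the Structure type, the structure_links() helper, and the line numbers are all made up for illustration and say nothing about how ExamDiff Pro stores or pairs structures internally.

```python
from typing import NamedTuple

class Structure(NamedTuple):
    """A parsed structure (hypothetical shape): its name and 0-based line span."""
    name: str
    start: int  # line holding the start of the definition
    end: int    # line holding the closing brace

def structure_links(left, right):
    """Pair structures that share a name and link their start and end lines.

    Returns a list of (left_line, right_line) links.
    """
    right_by_name = {s.name: s for s in right}
    links = []
    for s in left:
        other = right_by_name.get(s.name)
        if other is None:
            continue  # structure was added or removed; nothing to link
        links.append((s.start, other.start))
        links.append((s.end, other.end))
    return links

# Illustrative spans for a refactor that splits one function into two.
left_structures = [Structure("generateIDForObject", 0, 3)]
right_structures = [
    Structure("generateIDForObject", 0, 2),
    Structure("buildID", 4, 7),
]

links = structure_links(left_structures, right_structures)
print(links)
```

Each link then behaves like a manual synchronization point: the start of generateIDForObject() on the left is tied to its start on the right, and likewise for the closing braces, while buildID() has no counterpart and gets no links.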

In fact, we can even visualize these "invisible" links, using the new "Show fuzzy/structure links" option under Options | Display:

Turning this option on, we can now see structure links (as well as fuzzy-matching links) between lines in the splitter, indicated by a lighter color than manual synchronization links:

Let's try a slightly more complicated example, this time showing a refactor involving overloaded C++ functions. Here's what it looks like without the Match file structures for comparison option:

This diff looks a little off and doesn't really indicate what happened in the refactor: one overload of GetNavigatableDiffCount() got deleted and one overload got modified. We would expect to see the GetNavigatableDiffCount(BOOL bUnresolvedConflictsOnly) method definition in the second file be matched with the GetNavigatableDiffCount(BOOL bUnresolvedConflictsOnly, int nPass) definition in the first file, but instead it's being matched with the GetNavigatableDiffCount() definition line. Without structural matching, the best ExamDiff Pro can do in terms of heuristic matching is to fuzzy-match individual lines. This often gives the expected result, but here it doesn't, because the GetNavigatableDiffCount(BOOL bUnresolvedConflictsOnly, int nPass) method definition is broken into multiple lines in the first file but is a single line in the second file, so fuzzy matching can't tell that these definitions "belong together".
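The effect of that line break can be reproduced in a few lines of Python. This is only a sketch under assumptions: the exact line breaks in the first file are guessed, return types are left out, and difflib's similarity ratio stands in for whatever fuzzy-matching measure ExamDiff Pro really uses.

```python
from difflib import SequenceMatcher

# Definitions in the first file. The multi-line layout is hypothetical.
first_file = {
    "GetNavigatableDiffCount()": ["GetNavigatableDiffCount()"],
    "GetNavigatableDiffCount(BOOL, int)": [
        "GetNavigatableDiffCount(",
        "    BOOL bUnresolvedConflictsOnly,",
        "    int nPass)",
    ],
}

# The modified overload in the second file, written on a single line.
second_file_definition = ["GetNavigatableDiffCount(BOOL bUnresolvedConflictsOnly)"]

def similarity(a, b):
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()

def first_line(lines):
    """What a line-by-line matcher sees: only the definition's first line."""
    return lines[0].strip()

def whole_definition(lines):
    """The definition as one unit, with runs of whitespace collapsed."""
    return " ".join(" ".join(lines).split())

def closest(target, candidates, view):
    """Name of the candidate most similar to the target under the given view."""
    return max(
        candidates,
        key=lambda name: similarity(view(candidates[name]), view(target)),
    )

by_line = closest(second_file_definition, first_file, first_line)
by_definition = closest(second_file_definition, first_file, whole_definition)
print("line by line:     ", by_line)
print("whole definition: ", by_definition)
```

Looking at first lines only, the zero-argument overload comes out slightly ahead, because the multi-line definition's first line stops at the opening parenthesis. Once each definition is treated as a single unit, the two-argument overload is clearly the closer match.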

Fortunately for us, matching file structures comes to the rescue again! If we enable Match file structures for comparison, ExamDiff Pro correctly deduces which function definitions are the closer match and links those functions' start and end lines accordingly:
