[Gnu-arch-users] "semantic" diff/merge tools


From: Thomas Lord
Subject: [Gnu-arch-users] "semantic" diff/merge tools
Date: Tue, 11 Oct 2005 09:03:52 -0700

Ludovic writes:

> Tools such as OpenOffice.org Writer include versioning tools
> specifically tailored for their document format[0].  This allows users
> to view ``semantic diffs'', i.e. changes to the document itself, not 
> to its underlying representation.

> Similarly, Lisp-like languages could greatly benefit from a tailored
> diff format, an ``sexp-diff''.  According to Google, it seems that
> MzScheme already has an implementation of this.

(BTW, thanks for the links you posted earlier, Ludovic.)

I think you have to separate apples from oranges there.

I'm not familiar with the MzScheme features but, in general,
sexp-diff tools are not in any serious sense "semantic diffs".

Sexps denote abstract trees -- which is a semantics of a sort
but a very weak one.   From the diff perspective, a bytestream
denotes a list of lines, each a list of characters -- a semantics
of comparable complexity.

I would expect sexp-diff to be a generic tree differencer.  Similarly
for sexp merge tools.   Such a differencer has no non-coincidental
relationship to application semantics.   For example, if the sexp
is Scheme source, merging won't reliably generate syntactically valid
code from syntactically valid inputs;  if the sexp is a saved 
word-processing document, its merged form may wind up being nonsense
from the perspective of the word processor.
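To make the Scheme case concrete, here is a minimal sketch of a generic
three-way tree merge, with sexps modelled as nested Python lists.  The
merge3 function and its "keep appended children from both sides" rule are
assumptions made up for this illustration, not the behaviour of any
particular sexp-merge tool.  Both sides add an else-branch to a
one-armed `if', and the merge keeps both:

```python
class MergeConflict(Exception):
    """Raised when both sides changed the same node in incompatible ways."""


def merge3(base, ours, theirs):
    """Naive three-way merge of nested lists (sexps as Python lists).

    Hypothetical illustration only: it knows nothing about Scheme.
    """
    if ours == theirs:
        return ours
    if ours == base:          # only "theirs" changed this node
        return theirs
    if theirs == base:        # only "ours" changed this node
        return ours
    all_lists = all(isinstance(t, list) for t in (base, ours, theirs))
    if not all_lists or len(ours) < len(base) or len(theirs) < len(base):
        raise MergeConflict((base, ours, theirs))
    n = len(base)
    # Merge the children the base already had, position by position.
    merged = [merge3(base[i], ours[i], theirs[i]) for i in range(n)]
    # Children appended by either side are all kept, ours first.
    return merged + ours[n:] + theirs[n:]


base = ["if", "test", "a"]           # (if test a)
ours = ["if", "test", "a", "b"]      # (if test a b)
theirs = ["if", "test", "a", "c"]    # (if test a c)

merged = merge3(base, ours, theirs)
print(merged)   # ['if', 'test', 'a', 'b', 'c'], i.e. (if test a b c)
```

All three inputs are valid Scheme, and the result is a perfectly good
tree, but `if' takes at most a test, a consequent and an alternate, so
the merged form is not valid Scheme.  A stricter generic merge could
report a conflict here instead, but it still would not know why.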

In contrast, a semantically oriented application-specific diff/merge
tool for a word processor document isn't performing generic operations
on trees.   It would know, for example, not to try to recursively
compare the contents of a tree node representing a text paragraph 
to a tree node representing, say, a diagram.   It's the difficulty
and expense of writing such rules, multiplied by the number of 
present and future applications for which they are desirable, that
I object to and that led me to propose a family of data file formats
that can make good use of generic diff/merge tools.
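As a rough sketch of what one such rule looks like, assume a document
tree whose nodes carry a "kind" tag.  The Node class, the kind names and
diff_nodes below are all hypothetical, invented for this example rather
than taken from any real word processor:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # e.g. "paragraph", "diagram" (made-up names)
    text: str = ""
    children: list = field(default_factory=list)


def diff_nodes(old, new, path=()):
    """Return a list of (path, action) pairs describing how old became new."""
    # The application-specific rule: nodes of different kinds are never
    # compared recursively; the whole subtree counts as replaced.
    if old.kind != new.kind:
        return [(path, "replace")]
    changes = []
    if old.text != new.text:
        changes.append((path, "edit"))
    shared = min(len(old.children), len(new.children))
    for i in range(shared):
        changes.extend(diff_nodes(old.children[i], new.children[i], path + (i,)))
    for i in range(shared, len(old.children)):
        changes.append((path + (i,), "delete"))
    for i in range(shared, len(new.children)):
        changes.append((path + (i,), "insert"))
    return changes


old_doc = Node("doc", children=[Node("paragraph", text="hello")])
new_doc = Node("doc", children=[
    Node("diagram", children=[Node("shape", text="box")]),
])

print(diff_nodes(old_doc, new_doc))   # [((0,), 'replace')]
```

A generic differencer would descend into the paragraph and the diagram
and report edits and insertions between unrelated nodes.  The kind check
is a single rule; a real application would need many more of them, which
is the expense in question.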

So in summary: sexp diff/merge tools are closer to diff, the only
difference being that the former operate on trees while the latter
operates on arrays of lines (a subset of trees).   An
application-specific merge tool might also happen to be using trees as the
data model it operates on, but is adding non-generic smarts to the
algorithm.

A "best of both worlds" approach is to use data file formats that
can make good use of generic merge tools, but then permit (rather
than "require") applications to provide non-generic alternatives.

For example, if the data files in question are source code, the
semantically richer tools might diff/merge not per-file but 
per-function, better handling the case where functions move between
files but keep their same name.   If nobody ever got around to 
writing that specialized tool, the generic per-file merging tools
would work fine -- if somebody did, then programmers' lives might
be even better than "fine".
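A minimal sketch of the per-function idea, assuming the source tree has
already been split into functions and that function names are unique
across files.  The data layout ({file: {function name: body}}) and the
names index_functions and function_diff are hypothetical, chosen only
for this example:

```python
def index_functions(tree):
    """Flatten {file: {name: body}} into {name: (file, body)}.

    Assumes function names are unique across the whole tree.
    """
    index = {}
    for path, functions in tree.items():
        for name, body in functions.items():
            index[name] = (path, body)
    return index


def function_diff(old_tree, new_tree):
    """Report per-function changes, ignoring which file a function lives in."""
    old, new = index_functions(old_tree), index_functions(new_tree)
    report = {}
    for name in sorted(old.keys() | new.keys()):
        if name not in new:
            report[name] = "removed"
        elif name not in old:
            report[name] = "added"
        else:
            (old_path, old_body), (new_path, new_body) = old[name], new[name]
            moved = old_path != new_path
            changed = old_body != new_body
            if moved and changed:
                report[name] = "moved and changed"
            elif moved:
                report[name] = "moved"
            elif changed:
                report[name] = "changed"
            # Unchanged functions are left out of the report.
    return report


old_tree = {"a.py": {"f": "return 1", "g": "return 2"}}
new_tree = {"a.py": {"f": "return 1"}, "b.py": {"g": "return 2"}}

print(function_diff(old_tree, new_tree))   # {'g': 'moved'}
```

A per-file diff of the same change would show g deleted from one file
and an unrelated addition in another; keyed by name, it is reported as a
single move.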

-t





