Re: diff -b does not actually work with newlines

From: Denver Gingerich
Subject: Re: diff -b does not actually work with newlines
Date: Sun, 15 Jul 2007 19:45:49 -0400

Reference 1 (original message):
On 7/13/07, David Kastrup <address@hidden> wrote:

diff the following files using
diff -b a b

file a:
This is a test file
and we are testing

file b:
This is
a test file and we are testing

You get:

diff -b a b
< This is a test file
< and we are testing
> This is
> a test file and we are testing

This is quite a big nuisance since "soft newlines" are rather common
in a lot of source code languages.  Being able to keep hard newlines
(double empty lines) for comparison might be a nice option for things
like TeX input, but that is not crucial for now.
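
To make the behaviour above concrete, here is a minimal Python sketch (not diff's actual implementation) that approximates what -b does to each line and contrasts it with a comparison of the whole file as a word sequence.  The function names are hypothetical and the whitespace normalization is an assumption based on the documented -b behaviour: trailing whitespace is ignored and any other run of whitespace counts as a single space.

```python
import re

def normalize_line(line: str) -> str:
    """Approximate diff -b on one line: drop trailing whitespace and
    collapse every other run of whitespace to a single space."""
    return re.sub(r"\s+", " ", line.rstrip())

def lines_equal_b(a: str, b: str) -> bool:
    """Line-by-line comparison after per-line normalization.
    Newlines still delimit the units being compared."""
    return ([normalize_line(l) for l in a.splitlines()]
            == [normalize_line(l) for l in b.splitlines()])

def words_equal(a: str, b: str) -> bool:
    """Word-by-word comparison: newlines are treated like any other
    whitespace, so re-wrapping a paragraph changes nothing."""
    return a.split() == b.split()

FILE_A = "This is a test file\nand we are testing\n"
FILE_B = "This is\na test file and we are testing\n"

print(lines_equal_b(FILE_A, FILE_B))  # False: the line boundaries moved
print(words_equal(FILE_A, FILE_B))    # True: same words in the same order
```

Because -b only normalizes whitespace within a line, the moved line break still makes every line differ, which is why the two files above are reported as changed.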

Reference 2 (most recent message):
On 7/15/07, David Kastrup <address@hidden> wrote:
"Denver Gingerich" <address@hidden> writes:

> On 7/13/07, David Kastrup <address@hidden> wrote:
>> Paul Eggert <address@hidden> writes:
>> > diff -b is working as specified.  It sounds like you want wdiff; you
>> > might try using that instead.
>> wdiff does not return suitable output or take suitable options.
> Can you provide some reasons why the wdiff output is not suitable
> for your purposes?

Because it is completely idiosyncratic and can't be fed into patch or
any other utility interpreting diffs?

Because it takes different options from diff and thus could not be
plugged into a version control system even if the above was not the

You're right.  wdiff needs an overhaul, which is the plan for the
(hopefully) near future.  wdiff will be getting a unified context mode
similar to "diff -u" as well as an output format that allows output to
be used for patching purposes along with some sort of "wpatch"
utility.  The details for these haven't been worked out yet so I'd be
happy for any suggestions.

Eventually I will be proposing patches to diff that integrate the
functionality of wdiff into diff as an extra set of command-line
parameters.  This is a more long-term goal.

> If the files are equal, disregarding whitespace, I'm not sure why
> you would even want to use the output.

Sigh.  If one compares two revisions of a file that has been through a
whitespace-wrapping editor, the files are _not_ equal disregarding

Perhaps I'm not understanding you correctly.  If a file has been
through a whitespace-wrapping editing tool and has had no changes
made to it aside from the whitespace changes made by that tool,
then the resulting file will be the same as the unmodified file as
long as we disregard changes in whitespace.  Did you mean something

The whole point is to find out what _actual_ change
occurred _outside_ of the whitespace changes, so that one can take that
diff and feed it into patch.  That way, one gets a version with all
the non-whitespace changes, without getting hundreds of whitespace

Agreed.  That would be ideal.
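
As a rough illustration of that goal, the sketch below uses Python's difflib to compare two revisions word by word, so that re-wrapped lines produce no output and only real word changes are reported.  It is an assumption-laden sketch rather than how wdiff works: word_changes is a hypothetical helper, and its output is a plain list of tuples, not something patch could consume.

```python
import difflib

def word_changes(old: str, new: str):
    """Return the non-whitespace changes between two texts as a list of
    (tag, old_words, new_words) tuples, where tag is 'replace',
    'delete' or 'insert' as reported by difflib."""
    old_words = old.split()   # splitting on any whitespace discards
    new_words = new.split()   # both spacing and line-break differences
    matcher = difflib.SequenceMatcher(None, old_words, new_words,
                                      autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes

old = "This is a test file\nand we are testing\n"
reflowed = "This is\na test file and we are testing\n"
edited = "This is\na test file and we are checking\n"

print(word_changes(old, reflowed))  # [] (only the wrapping changed)
print(word_changes(old, edited))    # [('replace', ['testing'], ['checking'])]
```

Turning such word-level changes back into something a patch tool can apply is the open design question, since the positions refer to words rather than lines.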

> Instead I think the return code would be more useful to you.

You really think you know better than myself what I need?

I suspect I didn't understand your situation correctly.  Apologies.

> wdiff correctly returns 0 with your test case, indicating the files
> are the same, excluding whitespace.
> Which options is wdiff lacking that would make it easier to use for
> your purposes?  We can always add them if they seem reasonable.

It just has a completely different option set from diff, and it has
completely different output.  It's just completely useless for
plugging into version control systems (systems like Subversion allow
one to specify an external diff program to use.  Plugging in wdiff
immediately breaks because of unknown options).

Yes, wdiff in its current state is completely useless for version
control systems.  However, after a unified context mode has been added
to wdiff and it has been integrated into diff, the wdiff functionality
should be useful for version control systems.  Users can choose to
pass the --word-wise parameter (or whatever parameter we choose),
which will give them appropriate output that can be used with an
appropriately-modified patch tool.

Of course, further discussion will be needed to determine whether we
should break the line by line characteristic of diff with the
--word-wise parameter once we get to that point.


Note: the wdiff-bugs list has been added to the recipients list.
