bug-diffutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[bug-diffutils] Re: Bug#577832: diffutils: [REGRESSION] newline is added


From: Jim Meyering
Subject: [bug-diffutils] Re: Bug#577832: diffutils: [REGRESSION] newline is added to a line that has no newline if the line is in context (fwd)
Date: Fri, 16 Apr 2010 11:31:20 +0200

Jim Meyering wrote:
> Santiago Vila wrote:
>> Received this from the Debian bug system today.
>> This bug is not present in diffutils 2.8.1.
>>
>> ---------- Forwarded message ----------
>> From: Timo Juhani Lindfors <address@hidden>
>> To: Debian Bug Tracking System <address@hidden>
>> Date: Thu, 15 Apr 2010 02:37:35 +0300
>> Subject: Bug#577832: diffutils: [REGRESSION] newline is added to a line that 
>> has no newline if the line is in context
>>
>> Package: diffutils
>> Version: 1:2.9-2
>> Severity: important
>>
>> Steps to reproduce:
>> 1) echo -en '\nline1\nline2\nline3' > a
>> 2) echo -en '\nnew line\n\nline1\nline2\nline3' > b
>> 3) diff -u a b > c
>
> Thanks a lot for the report and forward.
> Here's a patch that fixes it, but I don't claim that
> it is optimal -- I don't know diffutils well enough for that.
> I hope Paul Eggert (Cc'd) can comment.
>
> I will write at least two tests for this and update NEWS before
> pushing anything tomorrow.

I've done that, and updated init.sh and gnulib to latest.
Here's what I've just pushed:

>From 648802169a29ea096d6e9813b043b91af9342f8d Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Fri, 16 Apr 2010 08:47:28 +0200
Subject: [PATCH 1/4] tests: update init.sh from gnulib

* tests/init.sh: Update from gnulib.
---
 tests/init.sh |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/init.sh b/tests/init.sh
index 3e3ea44..ee9c542 100644
--- a/tests/init.sh
+++ b/tests/init.sh
@@ -259,7 +259,7 @@ rand_bytes_()
   if test -r "$dev_rand_"; then
     # Note: 256-length($chars_) == 194; 3 copies of $chars_ is 186 + 8 = 194.
     dd ibs=$n_ count=1 if=$dev_rand_ 2>/dev/null \
-      | tr -c $chars_ 01234567$chars_$chars_$chars_
+      | LC_ALL=C tr -c $chars_ 01234567$chars_$chars_$chars_
     return
   fi

@@ -276,7 +276,7 @@ rand_bytes_()

   echo "$data_" \
     | dd bs=1 skip=50 count=$n_ 2>/dev/null \
-    | tr -c $chars_ 01234567$chars_$chars_$chars_
+    | LC_ALL=C tr -c $chars_ 01234567$chars_$chars_$chars_
 }

 mktempd_()
@@ -306,7 +306,7 @@ mktempd_()
   fail=0

   # First, try to use mktemp.
-  d=`env -u TMPDIR mktemp -d -t -p "$destdir_" "$template_" 2>/dev/null` \
+  d=`unset TMPDIR; mktemp -d -t -p "$destdir_" "$template_" 2>/dev/null` \
     || fail=1

   # The resulting name must be in the specified directory.
--
1.7.1.rc1.269.ga27c7


>From f444711a2639b4974ff3e720c455a1f96f1109e8 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Thu, 15 Apr 2010 23:26:22 +0200
Subject: [PATCH 2/4] diff: fix a regression when at least one input lacks a 
newline-at-EOF,

and the final hunk plus context-length aligns exactly with the end
of the newline-lacking file.  Diff would fail to output the required
"\ No newline at end of file" line, thus rendering the output invalid.
This bug appears to have been introduced by 2006-05-07
commit 58d0483b, "(find_identical_ends): Fix huge performance bug...",
at least to the extent that reverting that change fixes the bug.
Considering the stated effect of that change and lack of metrics,
reverting it is not an option, so here we take a more direct approach.

Given these inputs,

    printf '\n1'>a; printf '\n0\n\n1'>b

and running diff like this:

    ./diff -U1 a b

for input file "b", the pointer, files[1].linbuf[4][-1], to
the last byte on the final line was mistakenly pointing at the
sentinel newline at EOF, rather than at the preceding byte.

  (gdb) p files[1].linbuf[4][-1]
  $3 = 10 '\n'

Thus, this test in the final print_1_line call:

  if ((!line_flag || line_flag[0]) && limit[-1] != '\n')
    fprintf (out, "\n\\ %s\n", _("No newline at end of file"));

would fail, because limit[-1] (which is files[1].linbuf[4][-1])
was mistakenly '\n', rather than the desired '1'.

My first thought was simply to adjust the final linbuf[line] setting,
at the end of io.c's find_and_hash_each_line function function:

       if (p == bufend)
-       break;
+       {
+         if (current->missing_newline)
+           --linbuf[line];
+         break;
+       }

But that would make diff misbehave with this input
(same as above, but with a newline appended to "a"),

    printf '\n1\n'>a; printf '\n0\n\n1'>b
    ./diff -U1 a b

due to the block (100 lines above) that is triggered in that case
(but not in the both-files-missing-newline case):

      if (p == bufend
          && current->missing_newline
          && ROBUST_OUTPUT_STYLE (output_style))
        {
          /* This line is incomplete.  If this is significant,
             put the line into buckets[-1].  */
          if (ignore_white_space < IGNORE_SPACE_CHANGE)
            bucket = &buckets[-1];

          /* Omit the inserted newline when computing linbuf later.  */
          p--;
          bufend = suffix_begin = p;
        }

Note how "p" is decremented and "bufend" adjusted.
When that happens, we certainly don't want to decrement
"bufend" yet again.

Since there is no other way to determine at the end whether "bufend"
was already decremented, add a new variable to serve as witness.

* NEWS (Bug fixes): Mention it.
Reported by Timo Juhani Lindfors in http://bugs.debian.org/577832.
Forwarded by Santiago Vila.
---
 NEWS     |    7 +++++++
 src/io.c |   11 ++++++++++-
 2 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/NEWS b/NEWS
index 2d20b40..88af7ef 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,13 @@ GNU diffutils NEWS                                    -*- 
outline -*-

 * Noteworthy changes in release ?.? (????-??-??) [?]

+** Bug fixes
+
+  diff once again prints the required "\ No newline at end of file" line
+  when at least one input lacks a newline-at-EOF and the final hunk plus
+  context-length aligns exactly with the end of the newline-lacking file.
+  [bug introduced between 2.8.7 and 2.9]
+
 ** Changes in behavior

   In context-style diffs, diff prints a portion of a preceding "function"
diff --git a/src/io.c b/src/io.c
index aeeb51d..11f9ee3 100644
--- a/src/io.c
+++ b/src/io.c
@@ -220,6 +220,7 @@ find_and_hash_each_line (struct file_data *current)
   bool same_length_diff_contents_compare_anyway =
     diff_length_compare_anyway | ignore_case;

+  bool missing_newline_fixup = false;
   while (p < suffix_begin)
     {
       char const *ip = p;
@@ -376,6 +377,7 @@ find_and_hash_each_line (struct file_data *current)
          && current->missing_newline
          && ROBUST_OUTPUT_STYLE (output_style))
        {
+         missing_newline_fixup = true;
          /* This line is incomplete.  If this is significant,
             put the line into buckets[-1].  */
          if (ignore_white_space < IGNORE_SPACE_CHANGE)
@@ -471,7 +473,14 @@ find_and_hash_each_line (struct file_data *current)
       linbuf[line] = p;

       if (p == bufend)
-       break;
+       {
+         /* If we've added a newline sentinel and did not adjust "bufend"
+            above, then linbuf[line] is now pointing at the sentinel, yet
+            should instead be pointing to the preceding byte.  */
+         if (!missing_newline_fixup && current->missing_newline)
+           --linbuf[line];
+         break;
+       }

       if (context <= i && no_diff_means_no_output)
        break;
--
1.7.1.rc1.269.ga27c7


>From 6bb3d2900866c23da01a409aa3447ecf68627fc1 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Fri, 16 Apr 2010 09:14:04 +0200
Subject: [PATCH 3/4] tests: test for the no-newline-at-EOF bug

* tests/no-newline-at-eof: New file.
* tests/Makefile.am (TESTS): Add it.
---
 tests/Makefile.am       |    1 +
 tests/no-newline-at-eof |   54 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+), 0 deletions(-)
 create mode 100644 tests/no-newline-at-eof

diff --git a/tests/Makefile.am b/tests/Makefile.am
index a511b0b..6a4858c 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -4,6 +4,7 @@ TESTS = \
   help-version \
   function-line-vs-leading-space \
   label-vs-func        \
+  no-newline-at-eof \
   stdin

 EXTRA_DIST = \
diff --git a/tests/no-newline-at-eof b/tests/no-newline-at-eof
new file mode 100644
index 0000000..c3694a1
--- /dev/null
+++ b/tests/no-newline-at-eof
@@ -0,0 +1,54 @@
+#!/bin/sh
+# exercise the no-newline-at-EOF bug
+# Before the April 2010 fix, the "\ No newline at end of file"
+# line would not be printed.
+
+: ${srcdir=.}
+. "$srcdir/init.sh"; path_prepend_ ../src
+
+printf '\n1'      > a || framework_failure_
+printf '\n0\n\n1' > b || framework_failure_
+cat <<EOF > exp || framework_failure_
+@@ -1,2 +1,4 @@
+
++0
++
+ 1
+\ No newline at end of file
+EOF
+
+cat <<EOF > exp2 || framework_failure_
+@@ -1,2 +1,4 @@
+
+-1
++0
++
++1
+\ No newline at end of file
+EOF
+
+fail=0
+
+# So we don't have to record trailing blanks in expected output above.
+opt=--suppress-blank-empty
+
+diff $opt -U2 a b > out 2> err
+test $? = 1 || fail=1
+
+sed -n '/^@@/,$p' out > k && mv k out || fail=1
+compare out exp || fail=1
+# expect empty stderr
+compare err /dev/null || fail=1
+
+# Repeat, but with a newline at the end of "a".
+echo >> a
+
+diff $opt -U2 a b > out 2> err
+test $? = 1 || fail=1
+
+sed -n '/^@@/,$p' out > k && mv k out || fail=1
+compare out exp2 || fail=1
+# expect empty stderr
+compare err /dev/null || fail=1
+
+Exit $fail
--
1.7.1.rc1.269.ga27c7


>From 16e65488ddd26fe0ad3f8d8ebd30709c9291e6dd Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Fri, 16 Apr 2010 11:18:10 +0200
Subject: [PATCH 4/4] build: update gnulib submodule to latest

---
 gnulib |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gnulib b/gnulib
index 177f525..e764ea2 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit 177f5256e8632718eb5d2bb6aecb3f9eec614baf
+Subproject commit e764ea2fcb9a6247ae4dc0d4c0566b43d950a694
--
1.7.1.rc1.269.ga27c7




reply via email to

[Prev in Thread] Current Thread [Next in Thread]