bug-gnu-utils

[PATCH] cmp to accept "cmp DIR1/FILE DIR2/" etc


From: Chris Chittleborough
Subject: [PATCH] cmp to accept "cmp DIR1/FILE DIR2/" etc
Date: Tue, 17 Aug 2004 23:14:35 +0930
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113

For several years, the paragraph in diff.texi which explains that
   Unlike @command{diff}, @command{cmp} cannot compare directories;
   it can only compare two files.
has been accompanied by the comment "@c Fix this."  But this is easier
to say than to do.  We would have to define a new output format for cmp
to use when doing multiple comparisons.  This format would have to
include the names of each pair of files in which differences were found.
Presumably, it would also include messages like "Only in /tmp: hello.c".
It would be highly desirable to have --report-identical-files and
--recursive options.  For consistency, cmp should have the same file
selection options as diff: --exclude=PAT, --exclude-from=FILE and
--starting-file=FILE.  Perhaps it should also have --from and --to
options (identical in effect to diff's --from-file and --to-file but not
as misleading in name).  All in all, it would be much easier (and more
user-friendly?) to add --byte-by-byte and --character-by-character
options to diff than to extend cmp this way.

However, it is possible to extend cmp to handle the cases
   cmp dir1/file dir2/
and
   cmp dir1/ dir2/file
by treating them as exactly equivalent to
   cmp dir1/file dir2/file
and the following patch does so.  This is a rather marginal feature, but
it seems to me that its small benefit (I find it does save me some
typing) outweighs its cost (a more complex interface).
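
The rule can be sketched outside of cmp roughly as follows.  This is only
an illustration of the behaviour described above, not the patch itself:
the helper names last_component, ends_with_slash and expand_operands are
made up for the example, it uses plain malloc where cmp would use xmalloc,
and it leaves out the check for "-" (standard input).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of cmp: return the part of PATH after
   its last slash (all of PATH if it contains no slash).  */
static const char *
last_component (const char *path)
{
  const char *slash = strrchr (path, '/');
  return slash ? slash + 1 : path;
}

/* Hypothetical helper: true if PATH is nonempty and ends with a slash.
   The length check keeps an empty operand from indexing before the
   start of the string.  */
static bool
ends_with_slash (const char *path)
{
  size_t len = strlen (path);
  return len > 0 && path[len - 1] == '/';
}

/* If exactly one of NAME[0] and NAME[1] ends with a slash, replace it
   with a freshly malloced "DIR/BASENAME" string built from the other
   operand.  Otherwise leave both operands alone.  */
static void
expand_operands (const char *name[2])
{
  assert (name[0] != NULL && name[1] != NULL);

  bool slash0 = ends_with_slash (name[0]);
  bool slash1 = ends_with_slash (name[1]);
  if (slash0 == slash1)
    return;

  int dir = slash0 ? 0 : 1;
  const char *base = last_component (name[1 - dir]);
  size_t dir_len = strlen (name[dir]);
  char *joined = malloc (dir_len + strlen (base) + 1);
  if (joined == NULL)
    abort ();
  memcpy (joined, name[dir], dir_len);
  strcpy (joined + dir_len, base);
  name[dir] = joined;
}
```

Given the operands { "dir1/file", "dir2/" }, expand_operands leaves the
first alone and replaces the second with "dir2/file"; if both operands or
neither operand ends with a slash, nothing changes.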

I have not changed cmp's help text to mention this feature, because it
is too marginal to be worth the number of words it takes me to explain.
(Anyone who can come up with a good, terse explanation is welcome to
patch the help text accordingly.)  Therefore, the auto-generated man
page will not mention this feature.  All it gets is one paragraph in the
texinfo file, which is probably all it deserves.

This patch also changes the name of an important global variable: the
"file" array is now named "file_name".  Most of the patch hunks relate
to this change rather than the new behaviour.
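
One smaller change in the patch is that an operand spelled "-" is replaced
by the address of a single static array (STDIN_FILENAME), so that later
tests for standard input become pointer comparisons instead of strcmp
calls.  A minimal sketch of that idea, assuming nothing beyond the
standard library; STDIN_NAME, normalize_operand and is_stdin are
hypothetical names used only for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* One array whose address doubles as the "this operand is stdin" marker,
   modelled on STDIN_FILENAME in the patch.  */
static const char STDIN_NAME[] = "-";

/* Hypothetical helper: map any operand spelled "-" to STDIN_NAME itself;
   return every other operand unchanged.  */
static const char *
normalize_operand (const char *arg)
{
  assert (arg != NULL);
  return strcmp (arg, STDIN_NAME) == 0 ? STDIN_NAME : arg;
}

/* Once operands have been normalized, comparing addresses is enough.  */
static bool
is_stdin (const char *name)
{
  return name == STDIN_NAME;
}
```

The pointer test is only meaningful for operands that have been through
normalize_operand first; a "-" string stored anywhere else has a different
address and is not recognised.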

=========================

diff -ru -X /home/chris/.std-diff-exclusions 2.8.7/ChangeLog 2.8.7-CWC/ChangeLog
--- 2.8.7/ChangeLog    2004-04-14 08:10:59.000000000 +0930
+++ 2.8.7-CWC/ChangeLog    2004-08-17 10:41:12.000000000 +0930
@@ -1,3 +1,8 @@
+2004-08-17  Chris Chittleborough  <address@hidden>
+
+    * cmp.c (main): accept "cmp dir1/file dir2/" and "cmp dir1/ dir2/file"
+    as short for "cmp dir1/file dir2/file".
+
2004-04-13  Paul Eggert  <address@hidden>

    * NEWS, configure.ac (AC_INIT): Version 2.8.7.
diff -ru -X /home/chris/.std-diff-exclusions 2.8.7/doc/diff.texi 2.8.7-CWC/doc/diff.texi
--- 2.8.7/doc/diff.texi    2004-04-12 17:14:35.000000000 +0930
+++ 2.8.7-CWC/doc/diff.texi    2004-08-17 17:08:45.776879032 +0930
@@ -409,9 +409,14 @@
produces no output and reports whether the files differ using only its
exit status (@pxref{Invoking cmp}).

-@c Fix this.
Unlike @command{diff}, @command{cmp} cannot compare directories; it can only
compare two files.
+@c Fix this?  Not easy.  Would need a new output format which includes the
+@c file names and "Only in /tmp: hello.c" messages.  Would want --recursive
+@c and --report-identical-files options.  And what about --exclude=PAT,
+@c etc, etc?
+@c Would be easier, and probably more user-friendly, to add --byte-by-byte
+@c and --character-by-character to diff.

@node Binary
@section Binary Files and Forcing Text Comparisons
@@ -3457,6 +3462,13 @@
ignore at the start of each file; they are equivalent to the
@option{--ignore-initial=@var{from-skip}:@var{to-skip}} option.

+When comparing files with the same name in different directories, you
+need only specify the full path to one file as long as you put a slash
+after the name of the other directory. For example,
+@w{@samp{cmp d1/f d2/}} and
+@w{@samp{cmp d1/ d2/f}} are both equivalent to
+@w{@samp{cmp d1/f d2/f}}.
+
By default, @command{cmp} outputs nothing if the two files have the
same contents.  If one file is a prefix of the other, @command{cmp}
prints to standard error a message of the following form:
diff -ru -X /home/chris/.std-diff-exclusions 2.8.7/src/cmp.c 2.8.7-CWC/src/cmp.c
--- 2.8.7/src/cmp.c    2004-04-12 17:14:35.000000000 +0930
+++ 2.8.7-CWC/src/cmp.c    2004-08-17 18:10:04.168678480 +0930
@@ -49,12 +49,16 @@
static size_t block_compare (word const *, word const *);
static size_t block_compare_and_count (word const *, word const *, off_t *);
static void sprintc (char *, unsigned char);
+static char const *append_basename (char const *, size_t, char const *, size_t);

/* Name under which this program was invoked.  */
char *program_name;

/* Filenames of the compared files.  */
-static char const *file[2];
+static char const *file_name[2];
+
+/* String and pointer value denoting standard input.  */
+static const char STDIN_FILENAME[] = "-";

/* File descriptors of the files.  */
static int file_desc[2];
@@ -260,8 +264,16 @@
  if (optind == argc)
    try_help ("missing operand after `%s'", argv[argc - 1]);

-  file[0] = argv[optind++];
-  file[1] = optind < argc ? argv[optind++] : "-";
+  file_name[0] = argv[optind++];
+  if (strcmp (file_name[0], STDIN_FILENAME) == 0)
+    file_name[0] = STDIN_FILENAME;
+  file_name[1] = STDIN_FILENAME ;
+  if (optind < argc)
+    {
+      if (strcmp (argv[optind], STDIN_FILENAME) != 0)
+    file_name[1] = argv[optind] ;
+      optind++;
+    }

  for (f = 0; f < 2 && optind < argc; f++)
    {
@@ -272,27 +284,47 @@
  if (optind < argc)
    try_help ("extra operand `%s'", argv[optind]);

+  /* If one file argument ends with a slash and the other contains a basename,
+     append that basename to the directory name. */
+  if (file_name[0] != STDIN_FILENAME && file_name[1] != STDIN_FILENAME)
+    {
+      size_t file_name_len[2];
+      bool ends_with_slash[2];
+      for (f = 0; f < 2; f++)
+    {
+      file_name_len[f] = strlen (file_name[f]);
+      ends_with_slash[f] = file_name[f][file_name_len[f] - 1] == '/';
+    }
+      if (ends_with_slash[0] && ! ends_with_slash[1])
+    file_name[0] = append_basename (file_name[1], file_name_len[1],
+                    file_name[0], file_name_len[0]);
+      else if (! ends_with_slash[0] && ends_with_slash[1])
+    file_name[1] = append_basename (file_name[0], file_name_len[0],
+                    file_name[1], file_name_len[1]);
+
+    }
+
  for (f = 0; f < 2; f++)
    {
-      /* If file[1] is "-", treat it first; this avoids a misdiagnostic if
-     stdin is closed and opening file[0] yields file descriptor 0.  */
-      int f1 = f ^ (strcmp (file[1], "-") == 0);
+      /* If second file is stdin, treat it first; this avoids a misdiagnostic if
+     stdin is closed and opening file_name[0] yields file descriptor 0.  */
+      int f1 = f ^ (file_name[1] == STDIN_FILENAME);

      /* Two files with the same name and offset are identical.
     But wait until we open the file once, for proper diagnostics.  */
      if (f && ignore_initial[0] == ignore_initial[1]
-      && file_name_cmp (file[0], file[1]) == 0)
+      && file_name_cmp (file_name[0], file_name[1]) == 0)
    return EXIT_SUCCESS;

-      file_desc[f1] = (strcmp (file[f1], "-") == 0
+      file_desc[f1] = (file_name[f1] == STDIN_FILENAME
               ? STDIN_FILENO
-               : open (file[f1], O_RDONLY, 0));
+               : open (file_name[f1], O_RDONLY, 0));
      if (file_desc[f1] < 0 || fstat (file_desc[f1], stat_buf + f1) != 0)
    {
      if (file_desc[f1] < 0 && comparison_type == type_status)
        exit (EXIT_TROUBLE);
      else
-        error (EXIT_TROUBLE, errno, "%s", file[f1]);
+        error (EXIT_TROUBLE, errno, "%s", file_name[f1]);
    }

      set_binary_mode (file_desc[f1], true);
@@ -353,7 +385,7 @@

  for (f = 0; f < 2; f++)
    if (close (file_desc[f]) != 0)
-      error (EXIT_TROUBLE, errno, "%s", file[f]);
+      error (EXIT_TROUBLE, errno, "%s", file_name[f]);
  if (exit_status != 0  &&  comparison_type != type_status)
    check_stdout ();
  exit (exit_status);
@@ -411,7 +443,7 @@
          if (r != bytes_to_read)
        {
          if (r == SIZE_MAX)
-            error (EXIT_TROUBLE, errno, "%s", file[f]);
+            error (EXIT_TROUBLE, errno, "%s", file_name[f]);
          break;
        }
          ig -= r;
@@ -433,10 +465,10 @@

      read0 = block_read (file_desc[0], buf0, bytes_to_read);
      if (read0 == SIZE_MAX)
-    error (EXIT_TROUBLE, errno, "%s", file[0]);
+    error (EXIT_TROUBLE, errno, "%s", file_name[0]);
      read1 = block_read (file_desc[1], buf1, bytes_to_read);
      if (read1 == SIZE_MAX)
-    error (EXIT_TROUBLE, errno, "%s", file[1]);
+    error (EXIT_TROUBLE, errno, "%s", file_name[1]);

      /* Insert sentinels for the block compare.  */

@@ -484,7 +516,7 @@
                         || hard_locale_LC_MESSAGES);

            printf (use_byte_message ? byte_message : char_message,
-                file[0], file[1], byte_num, line_num);
+                file_name[0], file_name[1], byte_num, line_num);
          }
        else
          {
@@ -495,7 +527,7 @@
            sprintc (s0, c0);
            sprintc (s1, c1);
            printf (_("%s %s differ: byte %s, line %s is %3o %s %3o %s\n"),
-                file[0], file[1], byte_num, line_num,
+                file_name[0], file_name[1], byte_num, line_num,
                c0, s0, c1, s1);
        }
          }
@@ -542,7 +574,7 @@
      if (comparison_type != type_status)
        {
          /* See POSIX 1003.1-2001 for this format.  */
-          fprintf (stderr, _("cmp: EOF on %s\n"), file[read1 < read0]);
+          fprintf (stderr, _("cmp: EOF on %s\n"), file_name[read1 < read0]);
        }

      return EXIT_FAILURE;
@@ -675,3 +707,27 @@
    }
  return position[f];
}
+
+/* Given a filename FULL_PATH of length FULL_LEN which contains a file name
+   (ie., does not end with a slash), and another filename DIR_PATH of length
+   DIR_LEN which ends with a slash (and presumably is the name of a directory),
+   return a freshly-malloced string consisting of DIR_PATH followed by the
+   basename part of FULL_PATH. */
+static char const *
+append_basename (char const * full_path, size_t full_len,
+         char const * dir_path,  size_t dir_len)
+{
+  char const * basename;
+  char const * cp;
+  char * result;
+
+  basename = full_path;
+  for (basename = cp = full_path; *cp != '\0'; cp ++)
+    if (*cp == '/')
+      basename = cp + 1;
+
+  result = xmalloc (full_len + dir_len + 1);
+  strcpy (result, dir_path);
+  strcpy (result + dir_len, basename);
+  return result;
+}




