[patch] Adding Numerical Suffixes to Split

From: Capt Jesse Kornblum USAF
Subject: [patch] Adding Numerical Suffixes to Split
Date: Thu, 07 Aug 2003 16:13:39 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624

Sorry it's been so long since I've written; the Air Force moved me in the past month.

I'm still interested in adding support for numerical suffixes to split.

Several people have commented that the change from alphabetic suffixes to numerical ones can be accomplished via a shell script. That's true, but my question to you is: why? Why should the user have to remember a long line of arcane shell commands instead of adding a single command-line flag that already exists in a sister program (csplit)?

The numerical suffix is *the* standard for forensic images (think dd files, not graphics). The convention was originally developed for a forensic imaging program called Safeback in the late 1980s, and every forensic examination program now uses numerical suffixes to identify the parts of a single image. Such programs include EnCase (http://encase.com/), iLook (http://www.ilook-forensics.org/), and Autopsy (http://www.sleuthkit.org/).

Your concern for preserving the split standard is admirable, but I don't think this patch will break that standard. That is, files created with numerical suffixes can be read in the same manner as files created with alphabetic suffixes. Here's an example with a small text file called 'bar'. I split it using regular split and then using my modified version. Both sets of pieces can then be 'cat'ed back together to produce the same file.

$ ls -l bar
-rw-r--r--    1 jessek   users        5120 Aug  7 15:34 bar
$ split -b 500 bar normal
$ /home/jessek/coreutils-5.0/src/split -n -b 500 bar digits
$ ls normal* digits*
digits01  digits04  digits07  digits10  normalab  normalae  normalah  normalak
digits02  digits05  digits08  digits11  normalac  normalaf  normalai
digits03  digits06  digits09  normalaa  normalad  normalag  normalaj
$ cat normal* > all-normal
$ cat digits* > all-digits
$ diff all-normal all-digits

Thus, any set of files created with numerical suffixes is still compatible with any set of files created with alphabetic suffixes. I'm open to argument on this point, of course. :)
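The reassembly above works because fixed-width digit suffixes compare in the same order in which they are generated, so the shell expands `digits*` in sequence (assuming a locale where the digits collate in the usual order). The following is only a minimal standalone sketch of that idea, loosely mirroring the increment loop in the patch; the function names `first_numeric_suffix` and `next_numeric_suffix` are hypothetical and do not appear in split.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: write the first numeric suffix into suffix,
   e.g. "01" for len == 2.  As in the patch, counting starts at 1.
   The buffer must have room for len + 1 bytes. */
void first_numeric_suffix(char *suffix, size_t len)
{
    assert(len > 0);
    memset(suffix, '0', len);
    suffix[len - 1] = '1';
    suffix[len] = '\0';
}

/* Hypothetical helper: advance the suffix in place, carrying from the
   rightmost digit like an odometer.  Returns 1 on success, or 0 when
   every digit was already '9' (suffixes exhausted). */
int next_numeric_suffix(char *suffix, size_t len)
{
    char *p;

    /* Walk right to left; a digit that was '9' is reset to '0' by the
       loop's third expression and the carry moves one place left. */
    for (p = suffix + len; suffix < p; *--p = '0')
        if (p[-1]++ != '9')
            return 1;
    return 0;
}
```

Because every suffix has the same width and is zero-padded, each new suffix compares greater than the previous one under strcmp, which is the property that lets a plain `cat digits*` put the pieces back in order.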

Somebody asked if csplit would work for our purposes. Unfortunately we need a program that splits files based on their size. csplit looks only at a file's content.

Below, as requested, is the formatted patch for split to allow numerical suffixes. (I had some CVS issues, but think I got it right.)

2003-08-08  Jesse Kornblum  <address@hidden>

       Add support for numerical suffixes in split

       * src/split.c - Add support for -n

Index: patch.c
RCS file: /cvsroot/coreutils/coreutils/patch,v
retrieving revision x.x
diff -p -u -rx.x patch.c
--- split-orig.c        Thu Aug  7 15:00:25 2003
+++ split.c     Thu Aug  7 15:11:11 2003
@@ -63,6 +63,9 @@ static size_t suffix_length = DEFAULT_SU
/* Name of input file.  May be "-".  */
static char *infile;

+/* If non-zero, use numeric suffixes instead of characters */
+static int suffix_type;
/* Descriptor on which input file is open.  */
static int input_desc;

@@ -78,6 +81,7 @@ static struct option const longopts[] =
  {"bytes", required_argument, NULL, 'b'},
  {"lines", required_argument, NULL, 'l'},
  {"line-bytes", required_argument, NULL, 'C'},
+  {"numbers", no_argument, &suffix_type, 'n'},
  {"suffix-length", required_argument, NULL, 'a'},
  {"verbose", no_argument, &verbose, 0},
@@ -109,6 +113,7 @@ Mandatory arguments to long options are
  -a, --suffix-length=N   use suffixes of length N (default %d)\n\
  -b, --bytes=SIZE        put SIZE bytes per output file\n\
  -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file\n\
+  -n, --numbers           use digits for suffixes instead of letters\n\
  -l, --lines=NUMBER      put NUMBER lines per output file\n\
      fputs (_("\
@@ -143,7 +148,12 @@ next_file_name (void)
      outfile = xmalloc (outfile_length + 1);
      outfile_mid = outfile + outbase_length;
      memcpy (outfile, outbase, outbase_length);
-      memset (outfile_mid, 'a', suffix_length);
+      if (!suffix_type)
+       memset (outfile_mid, 'a', suffix_length);
+      else {
+       memset (outfile_mid, '0', suffix_length);
+       outfile_mid[suffix_length - 1] = '1';
+      }
      outfile[outfile_length] = 0;

@@ -165,10 +175,17 @@ next_file_name (void)
      /* Increment the suffix in place, if possible.  */
      char *p;
-      for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = 'a')
-       if (p[-1]++ != 'z')
-         return;
-      error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+      if (!suffix_type) {
+       for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = 'a')
+         if (p[-1]++ != 'z')
+           return;
+       error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+      } else {
+       for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = '0')
+         if (p[-1]++ != '9')
+           return;
+       error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+      }

@@ -376,7 +393,7 @@ main (int argc, char **argv)
      int this_optind = optind ? optind : 1;
      long int tmp_long;

-      c = getopt_long (argc, argv, "0123456789C:a:b:l:", longopts, NULL);
+      c = getopt_long (argc, argv, "0123456789nC:a:b:l:", longopts, NULL);
      if (c == -1)

@@ -385,6 +402,10 @@ main (int argc, char **argv)
       case 0:

+       case 'n':
+         suffix_type = 1;
+         break;
       case 'a':
           unsigned long tmp;
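
A note on the "numbers" entry in longopts: when the flag pointer is non-NULL, getopt_long stores the val field ('n') in the pointed-to variable and returns 0, so the long option never reaches `case 'n'`. That still selects digits, since any non-zero suffix_type does. The following is only a sketch of an alternative that routes both spellings through the same case; `parse_suffix_type` is a hypothetical wrapper written for illustration and is not part of split.c.

```c
#include <getopt.h>
#include <stddef.h>

/* Stand-in for the patch's flag: non-zero means numeric suffixes. */
int suffix_type;

/* Hypothetical wrapper: parse argv and return the resulting
   suffix_type (0 = letters, 1 = digits). */
int parse_suffix_type(int argc, char **argv)
{
    static struct option const longopts[] = {
        /* flag == NULL: getopt_long returns 'n' for --numbers,
           exactly as it does for the short option -n. */
        {"numbers", no_argument, NULL, 'n'},
        {NULL, 0, NULL, 0}
    };
    int c;

    suffix_type = 0;
    optind = 1;  /* restart scanning so the function can be called again */
    while ((c = getopt_long(argc, argv, "n", longopts, NULL)) != -1) {
        if (c == 'n')
            suffix_type = 1;
    }
    return suffix_type;
}
```

With this form suffix_type is always exactly 0 or 1, whichever spelling of the option the user chooses.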

Jesse Kornblum, Capt, USAF
United States Naval Academy
Chauvenet Room 329
572 Holloway Rd. Stop 9F
Annapolis, MD 21402-5002
Comm 410-293-6821  DSN 281-6821
Fax 410-293-2686  Fax DSN 281-2686
e-mail: address@hidden
