[patch] Adding Numerical Suffixes to Split
From: Capt Jesse Kornblum USAF
Subject: [patch] Adding Numerical Suffixes to Split
Date: Thu, 07 Aug 2003 16:13:39 -0400
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624
Sorry it's been so long since I've written; the Air Force moved me in
the past month.
I'm still interested in adding support for numerical suffixes to split.
Several people have commented that the change from alphabetic suffixes
to numerical ones can be accomplished via a shell script. That's true,
but my question is: why should the user have to remember a long line of
arcane shell commands when a single command-line flag, one that already
exists in a sister program (csplit), would do the job?
The numerical suffix is *the* standard for forensic images (think dd
files, not graphics). Originally developed for a forensic imaging
program called Safeback in the late 1980s, numerical suffixes are now
used by every forensic examination program to identify the parts of a
single image. Such
programs include EnCase (http://encase.com/), iLook
(http://www.ilook-forensics.org/), and Autopsy (http://www.sleuthkit.org/).
Your concern for preserving the split standard is admirable, but I don't
think that this patch will break that standard. That is, files created
with numerical suffixes can be read in the same manner as files created
with alphabetic suffixes. Here's an example with a small text file
called 'bar'. I split it once with the regular split and once with my
modified version. Either set of pieces can then be 'cat'ed back together
to produce the same file.
$ ls -l bar
-rw-r--r-- 1 jessek users 5120 Aug 7 15:34 bar
$ split -b 500 bar normal
$ /home/jessek/coreutils-5.0/src/split -n -b 500 bar digits
$ ls normal* digits*
digits01 digits04 digits07 digits10 normalab normalae normalah normalak
digits02 digits05 digits08 digits11 normalac normalaf normalai
digits03 digits06 digits09 normalaa normalad normalag normalaj
$ cat normal* > all-normal
$ cat digits* > all-digits
$ diff all-normal all-digits
$
Thus, a set of files created with numerical suffixes can be reassembled
in exactly the same way as a set created with alphabetic suffixes. I'm
open to argument on this point, of course. :)
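To make the ordering argument concrete, here is a minimal standalone sketch of the odometer-style increment that the patch below uses. The names next_suffix and print_names are hypothetical helpers for illustration and are not in split.c; the sketch assumes a two-character suffix that starts at 01, as in the example above. Because the suffix has a fixed width and is zero-padded, each new name compares greater than the previous one byte for byte, which is what lets 'cat digits*' pick the pieces up in order (assuming a locale in which the shell sorts digits in the usual way).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: advance a fixed-width
   suffix in place, odometer style.  FIRST and LAST are the lowest and
   highest characters allowed in each position ('0'/'9' or 'a'/'z').
   Returns 0 on success, -1 when every position has wrapped.  */
int
next_suffix (char *suffix, size_t len, char first, char last)
{
  char *p;
  for (p = suffix + len; suffix < p; *--p = first)
    if (p[-1]++ != last)
      return 0;
  return -1;   /* suffixes exhausted */
}

/* Hypothetical demo: print up to COUNT names made of BASE plus a
   two-digit suffix starting at 01.  Returns how many were printed.  */
int
print_names (FILE *out, const char *base, int count)
{
  char suffix[] = "01";
  char prev[sizeof suffix];
  int i;

  for (i = 0; i < count; i++)
    {
      fprintf (out, "%s%s\n", base, suffix);
      strcpy (prev, suffix);
      if (next_suffix (suffix, strlen (suffix), '0', '9') != 0)
        return i + 1;
      /* Fixed width keeps byte order equal to creation order.  */
      assert (strcmp (prev, suffix) < 0);
    }
  return count;
}
```

Calling print_names (stdout, "digits", 11) prints digits01 through digits11, matching the listing above. Passing 'a' and 'z' instead of '0' and '9' to next_suffix gives the existing alphabetic behaviour with the same loop.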
Somebody asked if csplit would work for our purposes. Unfortunately we
need a program that splits files based on their size. csplit looks only
at a file's content.
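As a rough sketch of the size arithmetic involved (the function names here are hypothetical and do not come from split.c): the number of pieces is the file size divided by the piece size, rounded up, and that count has to fit in what the suffix can express. With this patch the numbering starts at 0...01, so a suffix of length N allows 10^N - 1 names, against 26^N for letters.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: how many output files a size-based split
   produces for FILE_SIZE bytes cut into CHUNK-byte pieces.  */
unsigned long
pieces_needed (unsigned long file_size, unsigned long chunk)
{
  assert (chunk > 0);
  return file_size / chunk + (file_size % chunk != 0);
}

/* Hypothetical helper: how many distinct names a suffix of LEN
   characters allows.  Numeric suffixes start at 0...01 in the patch,
   so the all-zero name is never used.  */
unsigned long
suffix_capacity (unsigned int len, int numeric)
{
  unsigned long base = numeric ? 10 : 26;
  unsigned long total = 1;
  unsigned int i;

  for (i = 0; i < len; i++)
    total *= base;   /* no overflow check: fine for small LEN */
  return numeric ? total - 1 : total;
}
```

For the 5120-byte 'bar' split at 500 bytes this gives 11 pieces. Two digits allow 99 names and two letters allow 676, so large images split into small pieces will want a longer suffix via -a.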
Below, as requested, is the formatted patch for split to allow numerical
suffixes. (I had some CVS issues, but think I got it right.)
2003-08-08 Jesse Kornblum <address@hidden>
Add support for numerical suffixes in split
* src/split.c - Add support for -n
Index: patch.c
===================================================================
RCS file: /cvsroot/coreutils/coreutils/patch,v
retrieving revision x.x
diff -p -u -rx.x patch.c
--- split-orig.c Thu Aug 7 15:00:25 2003
+++ split.c Thu Aug 7 15:11:11 2003
@@ -63,6 +63,9 @@ static size_t suffix_length = DEFAULT_SU
/* Name of input file. May be "-". */
static char *infile;
+/* If non-zero, use numeric suffixes instead of characters */
+static int suffix_type;
+
/* Descriptor on which input file is open. */
static int input_desc;
@@ -78,6 +81,7 @@ static struct option const longopts[] =
{"bytes", required_argument, NULL, 'b'},
{"lines", required_argument, NULL, 'l'},
{"line-bytes", required_argument, NULL, 'C'},
+ {"numbers", no_argument, &suffix_type, 'n'},
{"suffix-length", required_argument, NULL, 'a'},
{"verbose", no_argument, &verbose, 0},
{GETOPT_HELP_OPTION_DECL},
@@ -109,6 +113,7 @@ Mandatory arguments to long options are
-a, --suffix-length=N use suffixes of length N (default %d)\n\
-b, --bytes=SIZE put SIZE bytes per output file\n\
-C, --line-bytes=SIZE put at most SIZE bytes of lines per output file\n\
+ -n, --numbers use digits for suffixes instead of letters\n\
-l, --lines=NUMBER put NUMBER lines per output file\n\
"), DEFAULT_SUFFIX_LENGTH);
fputs (_("\
@@ -143,7 +148,12 @@ next_file_name (void)
outfile = xmalloc (outfile_length + 1);
outfile_mid = outfile + outbase_length;
memcpy (outfile, outbase, outbase_length);
- memset (outfile_mid, 'a', suffix_length);
+ if (!suffix_type)
+ memset (outfile_mid, 'a', suffix_length);
+ else {
+ memset (outfile_mid, '0', suffix_length);
+ outfile_mid[suffix_length - 1] = '1';
+ }
outfile[outfile_length] = 0;
#if ! _POSIX_NO_TRUNC && HAVE_PATHCONF && defined _PC_NAME_MAX
@@ -165,10 +175,17 @@ next_file_name (void)
/* Increment the suffix in place, if possible. */
char *p;
- for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = 'a')
- if (p[-1]++ != 'z')
- return;
- error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+ if (!suffix_type) {
+ for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = 'a')
+ if (p[-1]++ != 'z')
+ return;
+ error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+ } else {
+ for (p = outfile_mid + suffix_length; outfile_mid < p; *--p = '0')
+ if (p[-1]++ != '9')
+ return;
+ error (EXIT_FAILURE, 0, _("Output file suffixes exhausted"));
+ }
}
}
@@ -376,7 +393,7 @@ main (int argc, char **argv)
int this_optind = optind ? optind : 1;
long int tmp_long;
- c = getopt_long (argc, argv, "0123456789C:a:b:l:", longopts, NULL);
+ c = getopt_long (argc, argv, "0123456789nC:a:b:l:", longopts, NULL);
if (c == -1)
break;
@@ -385,6 +402,10 @@ main (int argc, char **argv)
case 0:
break;
+ case 'n':
+ suffix_type = 1;
+ break;
+
case 'a':
{
unsigned long tmp;
--
Jesse Kornblum, Capt, USAF
United States Naval Academy
Chauvenet Room 329
572 Holloway Rd. Stop 9F
Annapolis, MD 21402-5002
Comm 410-293-6821 DSN 281-6821
Fax 410-293-2686 Fax DSN 281-2686
e-mail: address@hidden
http://www.cs.usna.edu/~kornblum/