bug#11761: Slight bug in split :-)
From: Pádraig Brady
Subject: bug#11761: Slight bug in split :-)
Date: Fri, 22 Jun 2012 00:43:57 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0
On 06/21/2012 11:12 PM, Jim Meyering wrote:
> François Pinard wrote:
>> Hi, Jim.
>>
>> I was looking for a problematic spot from a big file, and to isolate it,
>> used "split" repeatedly as a way to zoom into the proper place. Just to
>> try, I used "split -C 100000 xad" at one place (after saving "xad"
>> first, of course). "split" interrupted itself, producing less output
>> than input.
>>
>> My suggestion would be that split moans in some way before it destroys
>> its own input. :-)
>>
>> François
>
> Hi François!
> Thank you for reporting that.
> That's definitely a bug.
>
> For the record, here's a quick reproducer:
>
> $ seq 10 > xaa
> $ split -C 6 xaa
> $ wc -c x??
> 6 xaa
> 1 xab
> 7 total
> $ head x??
> ==> xaa <==
> 1
> 2
> 3
>
> ==> xab <==
> 3$
>
> I've Cc'd the bug list, in case someone would like to write
> the patch (fix, NEWS and test) before I get to it.
> I may not have time tomorrow.
Nice catch :)
I'll fix it up with something like the following.
cheers,
Pádraig.
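
For readers who want to try the idea outside of split.c, here is a minimal standalone sketch of the check the patch below relies on: compare the device and inode numbers of the already-open input with those of the output name before truncating it. The helper names same_inode and would_overwrite_input are hypothetical and do not appear in coreutils; same_inode only mirrors the device/inode comparison that the SAME_INODE macro performs. Treat it as an illustration under those assumptions, not as the final fix.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not from split.c: return 1 if the two stat
   results describe the same file, i.e. same device and same inode.  */
static int
same_inode (const struct stat *a, const struct stat *b)
{
  return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: return 1 if opening NAME for writing would
   clobber the file already open on IN_FD, 0 if it would not, and -1
   on a stat failure other than NAME not existing (errno stays set).  */
static int
would_overwrite_input (int in_fd, const char *name)
{
  struct stat in_st, out_st;

  if (fstat (in_fd, &in_st) != 0)
    return -1;

  if (stat (name, &out_st) != 0)
    /* A missing output file is the normal case: nothing to clobber.  */
    return errno == ENOENT ? 0 : -1;

  return same_inode (&in_st, &out_st);
}
```

A caller would run would_overwrite_input (STDIN_FILENO, name) before opening name with O_TRUNC and bail out on a nonzero result. Because stat follows symlinks and the comparison is by inode, this also catches an output name that is a symlink or hard link to the input.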
diff --git a/src/split.c b/src/split.c
index 53ee271..3e3313a 100644
--- a/src/split.c
+++ b/src/split.c
@@ -92,6 +92,9 @@ static char const *additional_suffix;
/* Name of input file. May be "-". */
static char *infile;
+/* stat buf for input file. */
+static struct stat in_stat_buf;
+
/* Descriptor on which output file is open. */
static int output_desc = -1;
@@ -362,6 +365,17 @@ create (const char *name)
{
if (verbose)
fprintf (stdout, _("creating file %s\n"), quote (name));
+
+ struct stat out_stat_buf;
+ if (stat (name, &out_stat_buf) == 0)
+ {
+ if (SAME_INODE (in_stat_buf, out_stat_buf))
+ error (EXIT_FAILURE, 0, _("%s would overwrite input. Aborting."),
+ quote (name));
+ }
+ else if (errno != ENOENT)
+ error (EXIT_FAILURE, errno, _("cannot stat %s"), quote (name));
+
return open (name, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
(S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH));
}
@@ -1058,7 +1072,6 @@ parse_chunk (uintmax_t *k_units, uintmax_t *n_units, char
int
main (int argc, char **argv)
{
- struct stat stat_buf;
enum Split_type split_type = type_undef;
size_t in_blk_size = 0; /* optimal block size of input file device */
char *buf; /* file i/o buffer */
@@ -1335,16 +1348,16 @@ main (int argc, char **argv)
/* Get the optimal block size of input device and make a buffer. */
- if (fstat (STDIN_FILENO, &stat_buf) != 0)
+ if (fstat (STDIN_FILENO, &in_stat_buf) != 0)
error (EXIT_FAILURE, errno, "%s", infile);
if (in_blk_size == 0)
- in_blk_size = io_blksize (stat_buf);
+ in_blk_size = io_blksize (in_stat_buf);
if (split_type == type_chunk_bytes || split_type == type_chunk_lines)
{
off_t input_offset = lseek (STDIN_FILENO, 0, SEEK_CUR);
- if (usable_st_size (&stat_buf))
- file_size = stat_buf.st_size;
+ if (usable_st_size (&in_stat_buf))
+ file_size = in_stat_buf.st_size;
else if (0 <= input_offset)
{
file_size = lseek (STDIN_FILENO, 0, SEEK_END);