split: support unlimited number of split files
From: Jérémy Compostella
Subject: split: support unlimited number of split files
Date: Fri, 24 Feb 2012 23:08:18 +0100
All,
I'm interested in implementing this feature. In fact, I have already made
a quick implementation to play with.
I refer to the original thread, "split behavior":
http://lists.gnu.org/archive/html/bug-coreutils/2009-09/msg00217.html
To summarise briefly: in the past, the split command supported an
unlimited number of split files as its default behavior. That did not
conform to POSIX, so it was removed (see
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=65cbf7d1).
This old behavior was:
$ cat /var/log/messages | split -2 - /tmp/x.
x.aa
x.ab
...
x.yz
x.zaaa
x.zaab
...
x.zyzz
x.zzaaaa
x.zzaaab
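
To make the naming rule in the listing above concrete, here is a minimal
sketch in Python. It is reconstructed only from the listing, not taken from
the coreutils source, and the function name old_style_suffixes and its
start_length parameter are my own invention for illustration. The assumed
rule: the first letter of each width runs from 'a' to 'y', and a leading
'z' is reserved to mark the jump to a longer suffix.

```python
from itertools import count, islice, product
from string import ascii_lowercase


def old_style_suffixes(start_length=2):
    """Yield suffixes in the order suggested by the listing (assumed rule).

    Level k yields: k leading 'z' characters, one letter in 'a'..'y',
    then (start_length - 1 + k) letters in 'a'..'z'.
    """
    head = ascii_lowercase[:-1]  # 'a'..'y'; 'z' is reserved as the marker
    for level in count():
        prefix = "z" * level
        tail_length = start_length - 1 + level
        for first in head:
            for tail in product(ascii_lowercase, repeat=tail_length):
                yield prefix + first + "".join(tail)


# The width changes after the first 25 * 26 = 650 names.
names = list(islice(old_style_suffixes(), 652))
print(names[0], names[649], names[650], names[651])
```

Under this rule no name is a prefix of a later one, so a plain
character-by-character comparison never has to look past the suffix itself.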
However, others in the "split behavior" thread proposed something like:
x.aa
...
x.zz
x.zzaa
...
x.zzzz
x.zzzzaa
Both possibilities serve the same goal: the split files, once sorted
alphabetically, are in the correct order.
However, the second possibility does not satisfy me, because combining it
with the --additional-suffix option breaks that ordering:
$ cat /var/log/messages | split --additional-suffix=.txt -2 - /tmp/x. && ls /tmp/x.* | sort
x.aa.txt
...
x.zy.txt
x.zzaa.txt
...
x.zztw.txt
x.zz.txt <---- :(
x.zztx.txt
...
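
The listing above can be reproduced in a few lines of Python. This is only
a sketch under an assumption: I do not know which locale produced the
original output, so the ignore_punctuation key below is a rough stand-in
for a collation that skips punctuation, not the real sort(1) algorithm.
The file names are the ones from the listing.

```python
# File names from the listing, in the order split would create them.
names = ["x.zy.txt", "x.zz.txt", "x.zzaa.txt", "x.zztw.txt", "x.zztx.txt"]


def ignore_punctuation(name):
    # Assumption: approximate a punctuation-insensitive collation by
    # dropping the dots before comparing.
    return name.replace(".", "")


# Plain code-point comparison: '.' sorts before letters, order is kept.
print(sorted(names))

# Punctuation-insensitive comparison: x.zz.txt lands between
# x.zztw.txt and x.zztx.txt, as in the listing.
print(sorted(names, key=ignore_punctuation))
```

The problem comes from "zz" being a prefix of "zzaa": the comparison then
runs into the additional suffix, and the result depends on how the
extension's characters collate.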
Therefore, in my opinion, the old behavior is better suited to the
current set of split options.
In the "split behavior" thread it was proposed to check the
POSIXLY_CORRECT environment variable to decide whether to enable the
unlimited split files behavior. But I think that is dangerous: it changes
the usual file list, x.aa ... x.zz ... vs. x.aa ... x.yz x.zaaa ... (the
x.zz file no longer exists). Users may be surprised and older scripts
may fail.
Maybe adding a new option or a new argument would be fine. I was
thinking of the following:
* --unlimited-suffixes
* --suffix-length=unlimited or --suffix-length=auto
With this new option (or argument), the user would keep the ability to
select the starting suffix length. For example:
$ cat /var/log/messages | split --suffix-length=auto --suffix-length 3 -2 - /tmp/x.
x.aaa <--- start with suffix length = 3
x.aab
...
x.yzz
x.zaaaa
x.zaaab
...
x.zyzzz
x.zzaaaaa
x.zzaaaab
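
As a cross-check of the listing above, here is a small Python sketch that
maps a file index directly to its suffix for a given starting length. It
is inferred from the listings in this mail, not from any existing split
code; suffix_for_index and start_length are hypothetical names. The
assumption is that each time the names run out, the suffix gains one
leading 'z' and one extra letter.

```python
from string import ascii_lowercase


def suffix_for_index(index, start_length=2):
    """Return the suffix of the index-th file (0-based), assumed rule."""
    level = 0
    while True:
        digits = start_length + level        # letters after the 'z' run
        capacity = 25 * 26 ** (digits - 1)   # first letter is 'a'..'y'
        if index < capacity:
            break
        index -= capacity
        level += 1
    letters = []
    for _ in range(digits):
        index, remainder = divmod(index, 26)
        letters.append(ascii_lowercase[remainder])
    return "z" * level + "".join(reversed(letters))


# With a starting length of 3, the first level holds 25 * 26 * 26 names.
for i in (0, 1, 16899, 16900, 16901):
    print(i, suffix_for_index(i, start_length=3))
```

With start_length=3 the last name of the first level is "yzz" and the next
one is "zaaaa", which matches the transition shown in the listing.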
Cheers,
Jérémy