Re: split: support unlimited number of split files

coreutils

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: split: support unlimited number of split files

From:	Pádraig Brady
Subject:	Re: split: support unlimited number of split files
Date:	Sat, 10 Mar 2012 04:43:30 +0000
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0

On 03/01/2012 07:44 PM, Jérémy Compostella wrote:
> Hi Pádraig,
> 
> Finally, I found time to work again on this patch. I provided it as
> attachment.
> 
> I'm not completely satisfied with the documentation part. I've tried to
> be more specific but it becomes quickly complicated. So I get back the
> original explanation which does not completely satisfied me since it
> does not explain the following point:
> - When -a is not specified:
>   - output file names are considered exhausted when the first suffix
>     character will become 'z'.
>   -'z[a-z]' suffixes are never used
>   - the fact that suffix length is increased each time the suffix are
>   "exhausted".
> But maybe it's not relevant to be so specific.
> 
> What is your opinion about this point ?

Well documentation can easily include too much detail.
But one should try to clarify any ambiguities
a user might be considering.

I adjusted that documentation as follows:

@@ -2990,11 +2990,15 @@ The output files' names consist of @var{prefix} (@samp{>
 followed by a group of characters (@samp{aa}, @samp{ab}, @dots{} by
 default), such that concatenating the output files in traditional
 sorted order by file name produces the original input file (except
-@option{-nr/@var{n}}). If more than 676 output files are required and
-@samp{-a} is not specified, @command{split} uses @samp{zaaa},
-@samp{zaab}, etc.  If the @option{-a} is specified and the output file
-names are exhausted, @command{split} reports an error without deleting
-the output files that it did create.
+@option{-nr/@var{n}}).  By default split will initially create files
+with two generated suffix characters, and will increase this width by two
+when the next most significant position reaches the last character.
+(@samp{yz}, @samp{zaaa}, @dots{}).  In this way an arbitrary number
+of output files are supported which sort as described above,
+even in the presence of an @option{--additional-suffix} option.
+If the @option{-a} option is specified and the output file names are exhausted,
+@command{split} reports an error without deleting the output files
+that it did create.

I made a few adjustments to the code also.

Disable this feature if --numeric=FROM is specified,
because these suffixes are not all consecutive
and hence we can't support arbitrary FROM values.

Disable this feature if a specific number of
files is specified with the -n option.
We already auto select the width appropriately.

I could find no logic flaws in the implementation in new_file_name().
Well there was one nit with:

  sufindex = xrealloc (sufindex, suffix_length * sizeof *sufindex);

That should be as follows to avoid overflow issues:

  sufindex = xnrealloc (sufindex, suffix_length, sizeof *sufindex);

While looking at that I noticed a fair bit of overlap
from the existing fixed width suffix initialization,
so I merged the two with a goto.  That brought the
new implementation down from 15 to 8 significant lines,
and those 8 are now mostly simple integer width adjustments.
Also by doing this we get to take advantage of the _PC_NAME_MAX
check in this section.

I also added a test.

Full patch is attached.
I hope to apply this sometime tomorrow.

cheers,
Pádraig.

split-suffix-auto-len.diff
Description: Text document

[Prev in Thread]

Current Thread

[Next in Thread]

Re: split: support unlimited number of split files, Jérémy Compostella, 2012/03/01
- Re: split: support unlimited number of split files, Pádraig Brady, 2012/03/01
  - Re: split: support unlimited number of split files, Jérémy Compostella, 2012/03/02
    - Re: split: support unlimited number of split files, Pádraig Brady, 2012/03/02
    - Re: split: support unlimited number of split files, Jérémy Compostella, 2012/03/02
    - Re: split: support unlimited number of split files, Pádraig Brady, 2012/03/02
    - Re: split: support unlimited number of split files, Jérémy Compostella, 2012/03/02
    - Re: split: support unlimited number of split files, Pádraig Brady, 2012/03/02
    - Re: split: support unlimited number of split files, Jérémy Compostella, 2012/03/02
    - Re: split: support unlimited number of split files, Jérémy Compostella, 2012/03/09
- Re: split: support unlimited number of split files, Pádraig Brady <=
  - Re: split: support unlimited number of split files, Jérémy Compostella, 2012/03/10
    - Re: split: support unlimited number of split files, Pádraig Brady, 2012/03/10
    - Re: split: support unlimited number of split files, Jérémy Compostella, 2012/03/10
    - Re: split: support unlimited number of split files, Pádraig Brady, 2012/03/10

Prev by Date: Re: split: support unlimited number of split files
Next by Date: Re: bug#10983: ready for a release?
Previous by thread: Re: split: support unlimited number of split files
Next by thread: Re: split: support unlimited number of split files
Index(es):
- Date
- Thread