coreutils

Re: tac feature suggestion


From: braultbaron
Subject: Re: tac feature suggestion
Date: Tue, 03 Jun 2014 22:53:43 +0200
User-agent: Webmail


Hello again,


First of all, thanks for your fast and in-depth answer.

Some portions of the discussion have been removed.

On 2014-06-03 20:27, Pádraig Brady wrote:
> On 06/03/2014 06:13 PM, braultbaron wrote:
>> I have a feature suggestion for tac.
>>   tac --bytes=<N> <file>
>> would produce the same output as:
>>   head --bytes=-<N> <file> | tac
>> The fundamental difference is time and space efficiency.
>
> Note tac first persists non seekable input to a temp file,
> and so will have bounded memory usage, but yes it
> will have initial overhead in the file copy.
>
> Now the tac usage seems a little unusual.
> Since we're dealing in bytes and tac doesn't transform
> the size of the data, perhaps we could use dd to skip
> the required data in the file before presenting to tac?
> So if you had a 3GB file and you wanted to process the last 1GB:
>
>   (dd bs=1 count=0 skip=2GB && tac) < file | head --bytes=1000

We may not know the size that will actually be needed. For example,
if we want to know the position of the last occurrence of <expr>
that is before the skipped part (the first line uses the proposed
option, the second is the equivalent that works today):
tac --bytes=<N> <file> | grep -b <expr> | head -n 1
head --bytes=-<N> <file> | tac | grep -b <expr> | head -n 1

This may require reading the whole file, or only a very small
part of it; we cannot know in advance.
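
To make the intended output concrete, here is a small sketch using only
the pipeline that works today. The file contents and the variable names
are made up for the illustration, and tac --bytes itself is only the
option I am proposing, so it is not used here. I left out grep -b to
keep the output easy to read.

```shell
# Hypothetical 19-byte sample: "one\n" (4) + "two\n" (4) + "three\n" (6) + "four\n" (5).
sample=$(mktemp)
printf 'one\ntwo\nthree\nfour\n' > "$sample"

# Drop the last 5 bytes ("four\n"), then reverse the remaining lines.
# This is what the proposed "tac --bytes=5" would be expected to print.
out=$(head --bytes=-5 "$sample" | tac)

# Last line containing "o" that lies before the skipped part.
# head -n 1 lets the pipeline stop as soon as the first match is seen.
last=$(head --bytes=-5 "$sample" | tac | grep o | head -n 1)

printf '%s\n' "$out"
printf 'last match: %s\n' "$last"
rm -f "$sample"
```

Here head must still read the whole file from the start before tac
sees anything, which is exactly the overhead I would like to avoid.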
> Though I see that tac doesn't seem to support that currently
> as it seeks the whole way from end of file back to 0.
> We could store the initial offset and seek only back to there
> without any interface changes.

I am really sorry, but I can't understand your last sentence.
Are you talking about a command line, or about the program itself?

I would like to emphasize that tac, among the coreutils programs,
has two very particular properties at the same time:
- When it is given a seekable input, it needs neither memory,
   nor an auxiliary file, nor initial overhead.
   *With seekable input, tac is "optimal"*
- Its combination with most commands -- but not head -- can be rewritten
   in a satisfying way, as follows:
   > grep|tac
   becomes easily:
   > tac|grep
   Or:
   > tail|tac
   becomes:
   > tac|head
   *Most often, we can manage to have a seekable input*
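
The two rewrites above can be checked on a small file. This is only a
sketch: the sample file and the variable names are invented for the
example, and it assumes GNU seq, tac, head, tail and grep.

```shell
# Hypothetical sample file with the lines 1..10.
sample=$(mktemp)
seq 1 10 > "$sample"

# tail|tac versus tac|head: the last three lines, in reverse order.
a=$(tail -n 3 "$sample" | tac)
b=$(tac "$sample" | head -n 3)

# grep|tac versus tac|grep: filtering lines commutes with reversing them.
c=$(grep 1 "$sample" | tac)
d=$(tac "$sample" | grep 1)

[ "$a" = "$b" ] && echo "tail|tac and tac|head agree"
[ "$c" = "$d" ] && echo "grep|tac and tac|grep agree"
rm -f "$sample"
```

In the rewritten forms tac reads the regular file directly, so it can
seek, whereas in the original forms it reads from a pipe.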

I admit that my proposal is a sort of ad-hoc optimisation of the "head|tac" command.

Now, I don't know whether this is only my personal need, or if
this improvement would benefit the community.

I guess that, in either case, I will adapt the existing source
to fit my needs. If this change is welcomed, I would be pleased
to send my source code. If someone finds some clever way to
achieve the same result without modifying the existing program,
I will be happy to use the trick instead of the modified program.
The simpler, the better.

About licensing stuff: I don't want my name anywhere,
and all the code I produce must stay free software.


Regards,
Johann BB

P.S. I replied to the whole list. Is that correct, or should I
have replied only to you?



