Re: RFC: dd oflag=trunc to support in place filtering of files
From: Bernhard Voelker
Subject: Re: RFC: dd oflag=trunc to support in place filtering of files
Date: Fri, 06 Jun 2014 08:34:45 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

On 06/05/2014 03:27 PM, Pádraig Brady wrote:
> The thought just occurred to me that this could be useful
> to filter large files in place? For example:
>
> grep whatever file.big | dd bs=1M conv=notrunc oflag=trunc

I guess you meant this, with the output file given?

  grep whatever file.big | dd bs=1M conv=notrunc oflag=trunc \
    of=file.big

> That would assume that grep never outputs more than it reads,
> and would issue a final truncate along the lines of:
>
> ftruncate(STDOUT_FILENO, lseek(STDOUT_FILENO, 0, SEEK_CUR));
>
> Useful enough to add?

While it sounds very useful, it also looks like a powerful
way to shoot oneself in the foot, e.g. when the producer
command aborts:

  grep --unknown PAT file | dd ...
  grep: unrecognized option '--unknown'

dd probably wouldn't be able to detect the failure: it would
just see the end of its input and truncate the file, so the
original data would be lost.
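A minimal sketch of that failure mode, assuming a POSIX shell plus grep and dd, and run against a scratch file (the file name and its contents are made up for illustration). It only shows that dd exits successfully after the producer has failed; the proposed post-copy truncation itself is not part of any existing dd and is not exercised here.

```shell
# Work on a scratch file so nothing real is at risk.
f=$(mktemp)
printf 'precious data\n' > "$f"

# The producer dies on a bad option before writing any output.
# dd just sees end of input on its stdin and exits successfully.
grep --unknown PAT "$f" 2>/dev/null | dd of="$f" conv=notrunc 2>/dev/null
dd_status=$?    # exit status of dd, the last command in the pipeline

echo "dd exit status: $dd_status"

# Without a post-copy truncate the data survives. With one, dd would
# cut the file at its output offset, which is 0 in this case.
cat "$f"
```

Since dd wrote nothing, its output offset is still 0, so a truncate at that offset would empty the file even though dd itself reported success.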

Second, there is the already mentioned restriction that the
producer must not output more data than the original size of
the input file, e.g.

  cat -n file | dd conv=notrunc of=file ...

Is this really an issue? It (surprisingly!) already seems to
work, even with "obs=1". And if it is an issue, how could we
detect it?
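One after-the-fact way to notice a growing producer is to compare the file size before and after the copy. The sketch below is only an illustration under stated assumptions: it uses a scratch file, and it swaps "cat -n" for an awk stand-in that buffers all lines and prints at end of input. That swap is deliberate, since a streaming producer that reads the file while dd is writing it may or may not see data that dd has already written.

```shell
f=$(mktemp)
printf 'a\nb\n' > "$f"
size_before=$(wc -c < "$f")

# Stand-in for "cat -n" that buffers every line and prints only at
# end of input, so it never reads data that dd has already written.
awk '{ line[NR] = $0 }
     END { for (i = 1; i <= NR; i++) printf "%6d\t%s\n", i, line[i] }' "$f" |
  dd of="$f" conv=notrunc 2>/dev/null

size_after=$(wc -c < "$f")

# With conv=notrunc the file never shrinks, so a larger size means the
# producer emitted more bytes than the file originally held.
if [ "$size_after" -gt "$size_before" ]; then
  grew=yes
else
  grew=no
fi
echo "before=$size_before after=$size_after grew=$grew"
```

This detects the condition only once the data has been written, so it is a diagnostic rather than a safeguard.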

As a side note, "oflag=trunc" may not be descriptive enough
for what it does: it truncates the output file *after*
copying the data. So what about something like "oflag=truncpost"?

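The "truncate after copying" semantics can be emulated today with existing tools. This is a rough sketch, assuming GNU coreutils (dd and truncate) and a scratch file. Taking the byte count from dd's statistics line is an assumption about its output format, not a stable interface, and the proposed flag itself does not exist.

```shell
f=$(mktemp)
printf 'keep 1\ndrop 2\nkeep 3\n' > "$f"

# Overwrite the file in place without truncating, and capture dd's
# statistics from stderr to learn how many bytes were written.
# Assumption: the summary line starts with the byte count, as in
# "14 bytes copied, ..." from GNU dd.
written=$(grep keep "$f" | LC_ALL=C dd of="$f" conv=notrunc 2>&1 |
          awk '/bytes/ { print $1 }')

# Emulate the proposed post-copy truncation: cut the file at the
# offset where dd stopped writing.
truncate -s "$written" "$f"
cat "$f"
```

Here grep emits 14 of the original 21 bytes, so the file ends up holding only the two "keep" lines. This works only because the filter never outputs more than it has read.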
Have a nice day,
Berny