coreutils

Re: [PATCH] dd: add punchhole feature


From: Maxime de Roucy
Subject: Re: [PATCH] dd: add punchhole feature
Date: Mon, 13 Feb 2017 22:32:38 +0100

On Monday, 06 February 2017 at 20:19 -0800, Pádraig Brady wrote:
> On 03/02/17 04:58, address@hidden wrote:
> > I sometimes face machines with a big log file that takes 90% of the
> > partition space.
> > If those logs are important I can't just remove them to free space,
> > and have to archive them (gzip usually).
> > But the log file plus its archive doesn't fit in the partition, so I
> > can't just `gzip my.log`.
> > In situations like these I usually do:
> > 
> >     $ gzip -c my.log | dd of=my.log conv=notrunc
> >     …
> >     X bytes (…) copied, …
> >     $ truncate -s X my.log
> > 
> > But when my.log is open in another process this isn't recommended,
> > as I would end up with my.log containing a gzip stream plus new
> > (uncompressed) log lines appended at the end.
> > 
> > I ended up developing https://github.com/tchernomax/dump-deallocate
> > a small utility that outputs and deallocates (fallocate punch-hole) a
> > file at the same time.
> > 
> > I think it would be interesting to include this feature in dd so it
> > would be possible to do:
> > 
> >     $ dd if=my.log conv=punchhole | gzip > my.log.gzip
> 
> That's not a robust operation: if gzip fails for any reason,
> like a full disk etc., some data will be lost.

Indeed. I didn't think it was a problem, since dd is a tool to be used
with care.
I will add a warning in the man page.
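The failure mode Pádraig describes can be simulated with today's fallocate(1): once a range is punched, reads return NUL bytes, so anything the consumer failed to persist before the punch is unrecoverable. A minimal sketch, assuming util-linux fallocate and a filesystem with hole-punch support (ext4, xfs, tmpfs); the file name and chunk size are illustrative:

```shell
#!/bin/sh
set -e
f=$(mktemp)
printf 'important line %s\n' $(seq 1 1000) > "$f"
# Pretend a punch-as-you-read pipeline consumed the first 4096 bytes...
dd if="$f" bs=4096 count=1 status=none > /dev/null
# ...and punched them before the consumer (e.g. gzip) durably stored them:
fallocate --punch-hole --offset 0 --length 4096 "$f"
# If the consumer then fails, those bytes are gone: they read back as NULs.
nonzero=$(head -c 4096 "$f" | tr -d '\0' | wc -c)
echo "non-NUL bytes in punched range: $nonzero"
```

The apparent file size is unchanged (punch-hole keeps the size), but the punched data cannot be read back.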

> So while punchhole functionality might be useful,
> I'm not so sure about coupling it just with read()?
> BTW there is already a punch_hole() function in copy.c
> that should be reused if we were to add this.

I will use this function.

> The reason we haven't added just punchhole functionality to dd,
> is because it's already available from fallocate(1).

But fallocate can't output the data it erases.
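What is possible today is a non-atomic two-step version of the proposed pipeline: emit and compress the data first, then let fallocate(1) deallocate the blocks afterwards. A sketch under the same assumptions (util-linux fallocate, hole-punch-capable filesystem; names and sizes are illustrative):

```shell
#!/bin/sh
set -e
log=$(mktemp)
seq 1 100000 > "$log"
size=$(stat -c %s "$log")
# Step 1: output the data (the part fallocate alone cannot do)
gzip -c "$log" > "$log.gz"
# Step 2: deallocate the compressed-away bytes; the apparent size is kept
fallocate --punch-hole --offset 0 --length "$size" "$log"
echo "recovered lines: $(zcat "$log.gz" | wc -l)"
```

Unlike the proposed `conv=punchhole`, nothing is freed until step 2 runs, and any bytes a writer appends between the two steps within the punched range would be lost as well.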

> It seems like a specialized tool to couple the following ops would be
> required:
> 
>   while (read(chunk))
>     compress
>     write
>     if (sync())
>       collapse_range(chunk)
> 
> Note I used collapse_range rather than punch_hole there
> as that would probably simplify restarts for partial completions,
> as only the unprocessed data would be left in the file.

That would be the safest approach, but it means compressing the file
inside dd, which is not what this tool is for (AFAIK).

Also, I don't think using collapse_range is a good idea: it becomes
difficult to handle when the input file is open for writing by another
process.
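For illustration, the offset-shifting behaviour objected to above can be observed with fallocate(1)'s existing --collapse-range. A sketch assuming a filesystem that supports collapse (e.g. ext4 or xfs; offsets and lengths must be block-aligned, and tmpfs does not support it, hence the fallback branch):

```shell
#!/bin/sh
f=./collapse-demo.$$
# Two distinguishable 4096-byte blocks (assuming a 4096-byte fs block size)
head -c 4096 /dev/zero | tr '\0' 'A' >  "$f"
head -c 4096 /dev/zero | tr '\0' 'B' >> "$f"
if fallocate --collapse-range --offset 0 --length 4096 "$f" 2>/dev/null; then
    collapsed=yes
    # The B block has shifted down to offset 0 and the file shrank:
    # exactly the data movement a concurrent writer would not expect.
    size_after=$(stat -c %s "$f")
    first=$(head -c 1 "$f")
    echo "size: $size_after, first byte: $first"
else
    collapsed=no
    echo "collapse_range not supported on this filesystem"
fi
rm -f "$f"
```

With punch_hole, by contrast, every byte keeps its offset, so a writer appending at its own position is unaffected.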
-- 
Regards
Maxime de Roucy


