
Re: [PATCH] Add wipename option to shred


From: Joseph D. Wagner
Subject: Re: [PATCH] Add wipename option to shred
Date: Tue, 02 Jul 2013 00:16:28 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130625 Thunderbird/17.0.7


On 06/27/2013 10:06 AM, Pádraig Brady wrote:
On 06/13/2013 05:13 PM, Joseph D. Wagner wrote:
On 06/13/2013 8:35 am, Pádraig Brady wrote:

On 06/13/2013 12:51 AM, Joseph D. Wagner wrote:

## perchar ##
real    678m33.468s
user    0m9.450s
sys    3m20.001s

## once ##
real    151m54.655s
user    0m3.336s
sys    0m32.357s

## none ##
real    107m34.307s
user    0m2.637s
sys    0m21.825s

perchar: 11 hours 18 minutes 33.468 seconds
once: 2 hours 31 minutes 54.655 seconds
  * a 346% improvement over perchar
none: 1 hour 47 minutes 34.307 seconds
  * a 530% improvement over perchar
  * a 41% improvement over once
Whoa, so this creates 23s CPU work
but waits for 1 hour 47 mins on the sync!
What file system and backing device are you using here
as a matter of interest?
ext4 data=ordered (default) + 7200 SATA

Just to be clear, the times also include shredding the data part of the files.

For my test I used 16 character file names and 100,000 files each 4k in size,
which comes to:
perchar: (1 data fsync + 16 name fsync) * 100,000 files = 1,700,000 fsync
once: (1 data fsync + 1 name fsync) * 100,000 files = 200,000 fsync
none: (1 data fsync + 0 name fsync) * 100,000 files = 100,000 fsync

I included the exact script I used to generate those statistics in a previous
email.  Feel free to replicate my experiment on your own equipment, using my
patched version of shred of course.
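
Roughly, the workload looks like this (an illustrative sketch only, not the
actual script, and with made-up names; the file count, name length and sizes
are the ones described above):

    # create 100,000 files with 16-character names and 4 KiB of data each
    mkdir testdir
    for i in $(seq 100000); do
        head -c 4096 /dev/urandom > "testdir/$(printf '%016d' "$i")"
    done
    # then time the patched shred over testdir once for each --wipename mode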

Alternatively, if you still have reservations about adopting my patch,
would you be more open to a --no-wipename option?  This would be the
equivalent of my proposed --wipename=none.  It would not imply any
additional security; to the contrary, it implies less security.  Yet, it
would still give me the optional performance boost I am trying to
achieve.
Yes these sync latencies really add up.

I timed this simple test script on ext4 on an SSD
and traditional disk in my laptop:

   import os
   # open the directory itself so it can be passed to fdatasync()
   d = os.open(".", os.O_DIRECTORY | os.O_RDONLY)
   # 1000 directory syncs; per-sync cost = total wall time / 1000
   for i in range(1000):
     os.fdatasync(d)

That gave 2ms and 12ms per sync operation respectively.
This seems to be independent of dir size and whether any
changes were made to the dir, which is a bit surprising.
It seems like a sync-only-on-change optimization should be possible there.
Anyway...
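
For reference, the per-sync figure is just the wall-clock time of that loop
divided by the iteration count; e.g., with the snippet saved as
dir_sync_test.py (a name used here only for illustration):

    time python dir_sync_test.py    # per-sync latency ~= elapsed "real" time / 1000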

So with the extra 1.6M syncs above on spinning rust
(1,600,000 × 12 ms ≈ 19,200 s), that would add an extra 5.3 hours by my calc.
Your latencies seemed to be nearly double that,
but fair enough, same ball park.

Now we could handle this outside of shred, if we only
wanted to choose between wiping names and simple delete.
Given the above latencies, the overhead of a millisecond or two
to start a process per file is insignificant:

find /files -type f -exec sh -c 'shred "$1" && rm "$1"' sh {} \;

But yes this is a bit awkward.
Also, if you did want to select wipe but avoid the explicit syncs,
because you knew your file system had synchronous metadata updates,
then we couldn't support that operation with this scheme.
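
For reference, synchronous directory updates can already be requested on
Linux outside of shred, e.g.:

    # synchronous directory updates for a whole mount (if /files is a mount point)
    mount -o remount,dirsync /files
    # or for a single directory
    chattr +D /files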

So I'm leaning a bit towards adding control through shred options.
How about this interface:

-u, --remove[=HOW]
     truncate and remove file after overwriting.
     HOW indicates how to remove the directory entry:
     unlink => just call unlink, wipe => also first obfuscate the name,
     wipesync => also sync each obfuscated character to disk (the default)
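
To make that concrete, usage under the proposed interface would presumably
look like this (illustrative file name, based only on the description above):

    shred -u secret.img                 # default HOW=wipesync: wipe the name, syncing each step
    shred --remove=wipe secret.img      # wipe (obfuscate) the name, but skip the explicit syncs
    shred --remove=unlink secret.img    # overwrite the data, then just unlink the entry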

thanks,
Pádraig.


Sorry for not getting back to you sooner. Stuff came up, and I lost a lot of free time.

I'll try to integrate your suggestions into my patch this weekend. Thanks for your patience and willingness to hear my case.

Joseph D. Wagner


