duplicity-talk

Re: [Duplicity-talk] Verification of filelists, multiprocessing


From: Cláudio Gil
Subject: Re: [Duplicity-talk] Verification of filelists, multiprocessing
Date: Thu, 28 Jan 2016 00:17:10 +0000


On 27/01/2016 21:40, "Nate Eldredge" <address@hidden> wrote:
>
> On Wed, 27 Jan 2016, Cláudio Gil wrote:
>
>> Hi,
>>
>> About parallelism, from what I could see, duplicity processes your files as
>> a stream, a long sequence of files. Encryption is also sequential by design
>> (that's why GnuPG, which duplicity uses, does not use threads).
>>
>> Maybe parallelization could be introduced at a few select points in
>> duplicity. But the part that is mostly CPU-bound is the encryption, and
>> encrypting a file will use 1 core.
>>
>> Since the entire volume is encrypted and not the files in it, you could
>> test the difference between backing up 250MB (10 volumes, by
>> default) with encryption and backing up the same files without encryption
>> and then encrypting all the volumes with multiple GnuPG processes (using
>> "parallel" or something).
>>
>> It's an interesting comparison. It's not obvious whether it would be faster
>> or how changing the volume size affects the result.
>
>
> I guess one could write a wrapper around gpg that slurps up the input data and exits, keeping gpg itself running on that data in the background, with some sort of simple IPC to coordinate a maximum number of gpg processes. This could be a useful thing to have in general.  And then one wouldn't need to modify duplicity at all.
>
> Maybe a fun project in someone's "copious free time"...

I don't think that would work. Duplicity does not work in sequential, independent steps. All the steps to create a volume are chained together (like a pipe). This means the encryption starts with the first byte to be backed up and ends when the volume is complete. If you wrote such a wrapper, I believe it would either fail with an error or it would run, but duplicity would still not call a second GnuPG anyway.

To run multiple GnuPG processes at the same time, given that volumes must be generated in sequence, you need to either 1) skip encryption until you have N volumes and then start multiple GnuPG processes, or 2) buffer what you want to send to GnuPG so that you can advance to the next volume. Both options assume GnuPG is the bottleneck; option 1 requires more local disk space, and option 2 requires more memory.
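
To make option 1 more concrete, here is a rough Python sketch of the "encrypt N finished volumes at once" step. It is not something duplicity does today. It assumes the unencrypted volumes are already on local disk, and the function names, the ".gpg" output suffix, and the recipient address are all made up for illustration. The threads only wait on the child processes; the actual parallelism comes from running several gpg processes.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def gpg_command(src, dst, recipient="backup@example.org"):
    """Build a gpg invocation for one volume.

    The recipient is a placeholder; use whatever key you back up with.
    """
    return ["gpg", "--batch", "--yes", "--recipient", recipient,
            "--output", str(dst), "--encrypt", str(src)]


def encrypt_volumes(volumes, max_workers=4, build_command=gpg_command):
    """Encrypt each volume in its own process, at most max_workers at a time.

    Returns the output paths in the same order as the input volumes.
    """
    def run(src):
        dst = Path(str(src) + ".gpg")  # hypothetical naming scheme
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(build_command(src, dst), check=True)
        return dst

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so volume order is kept.
        return list(pool.map(run, volumes))
```

Calling encrypt_volumes(sorted(Path("volumes").glob("*.difftar")), max_workers=4) on a directory of unencrypted volumes would be one way to time this against a normal encrypted backup (the directory name and file pattern are only examples).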

Cheers,
Cláudio

