From: Cláudio Gil
Subject: Re: [Duplicity-talk] Verification of filelists, multiprocessing
Date: Thu, 28 Jan 2016 00:17:10 +0000
On 27/01/2016 21:40, "Nate Eldredge" <address@hidden> wrote:
>
> On Wed, 27 Jan 2016, Cláudio Gil wrote:
>
>> Hi,
>>
>> About parallelism, from what I could see, duplicity processes your files as
>> a stream, a long sequence of files. Encryption is also sequential by design
>> (that's why GnuPG, which duplicity uses, does not use threads).
>>
>> Maybe parallelization could be introduced at a few select points in
>> duplicity. But the part that is mostly CPU-bound is the encryption, and
>> encrypting a file will use 1 core.
>>
>> Since the entire volume is encrypted and not the files in it, you could
>> test the difference between backing up 250MB (10 volumes, by
>> default) with encryption and backing up the same files without encryption
>> and then encrypting all the volumes with multiple GnuPG processes (using
>> "parallel" or something).
>>
>> It's an interesting comparison. It's not obvious whether it would be faster
>> or how changing the volume size affects the result.
>
>
> I guess one could write a wrapper around gpg that slurps up the input data and exits, keeping gpg itself running on that data in the background, with some sort of simple IPC to coordinate a maximum number of gpg processes. This could be a useful thing to have in general. And then one wouldn't need to modify duplicity at all.
>
> Maybe a fun project in someone's "copious free time"...
I don't think that would work. Duplicity does not do things in sequential, independent steps. All the steps that create a volume are chained together (like a pipe). This means the encryption starts with the first byte to be backed up and ends when the volume is complete. If you wrote such a wrapper, I believe either there would be an error or it would work but duplicity would still not start a second GnuPG before the first one finished.
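To illustrate what "chained like a pipe" means here, below is a minimal sketch, not duplicity's actual code. It uses zlib's streaming compressor purely as a stand-in for the GnuPG stage (an assumption, chosen only because it is in the standard library), and the names read_chunks, transform_stream and build_volume are hypothetical. The point is that one stateful stage sees the stream from the first chunk and can only be finalized when the volume ends.

```python
import zlib


def read_chunks(blobs, chunk_size=4):
    """Yield the data to back up as one continuous stream of small chunks."""
    for blob in blobs:
        for start in range(0, len(blob), chunk_size):
            yield blob[start:start + chunk_size]


def transform_stream(chunks):
    """Stand-in for the GnuPG stage (zlib here, NOT real encryption).

    A single stateful object handles the whole volume: it is created
    before the first chunk arrives and finalized after the last one.
    """
    stage = zlib.compressobj()
    for chunk in chunks:
        out = stage.compress(chunk)
        if out:
            yield out
    # Only now, at the end of the volume, can the stage be closed.
    yield stage.flush()


def build_volume(blobs):
    """Chain the stages together and collect the finished volume."""
    return b"".join(transform_stream(read_chunks(blobs)))
```

Because the stage is fed while the volume is still being produced, there is no point at which a complete plaintext volume exists and could be handed to a separate background process.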
To run multiple GnuPG processes at the same time, given that volumes must be generated in sequence, you need to either 1) skip encryption until you have N volumes and then start multiple GnuPG processes, or 2) buffer the data you want to send to GnuPG so that duplicity can move on to the next volume. Both options assume GnuPG is the bottleneck. Option 1 requires more local disk space and option 2 requires more memory.
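A rough sketch of option 1, assuming the volumes have already been written unencrypted to local disk and that gpg is on the PATH. The helper names (gpg_command, run_all, encrypt_volumes) are hypothetical, as is the idea of calling them from outside duplicity; only the gpg options shown are standard GnuPG ones. The worker pool caps how many gpg processes run at once.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def gpg_command(src, recipient):
    """Build a gpg command line that encrypts src to src + '.gpg'."""
    return ["gpg", "--batch", "--yes", "--recipient", recipient,
            "--output", src + ".gpg", "--encrypt", src]


def run_all(commands, max_procs=4):
    """Run each command as its own process, at most max_procs at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run(cmd):
        return subprocess.run(cmd, check=False).returncode

    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(run, commands))


def encrypt_volumes(volumes, recipient, max_procs=4):
    """Encrypt already-written, unencrypted volumes in parallel."""
    return run_all([gpg_command(v, recipient) for v in volumes], max_procs)
```

For example, encrypt_volumes(["vol1.difftar", "vol2.difftar"], "KEYID") would start up to four gpg processes (the file names and key ID here are placeholders). Timing this against a normal encrypted backup of the same files is the comparison suggested earlier in the thread.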
Cheers,
Cláudio