Re: [Qemu-devel] [PATCH v5 00/18] crypto: add afalg-backend support
From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [PATCH v5 00/18] crypto: add afalg-backend support
Date: Fri, 14 Jul 2017 14:04:28 +0100
User-agent: Mutt/1.8.3 (2017-05-23)
On Fri, Jul 14, 2017 at 07:38:22AM -0400, address@hidden wrote:
> From: "Longpeng(Mike)" <address@hidden>
>
> The AF_ALG socket family is the userspace interface to the Linux
> crypto API; users can use it to access hardware accelerators.
>
> This patchset adds an afalg-backend to the QEMU crypto subsystem. Currently,
> when performing encryption/decryption, we try the afalg-backend first and
> fall back to the library-backend if it fails.
>
> As a next step, it would support a command parameter to specify
> which backend to prefer, plus some other improvements.
>
> I measured the performance of the afalg-backend implementation by testing
> how much data could be encrypted in 5 seconds.
>
> NOTE: If we use specific hardware crypto cards, I think the afalg-backend
> would be even faster.
>
> test-environment: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
>
> *sha256*
> chunk_size(bytes) MB/sec(afalg:sha256-ssse3) MB/sec(nettle)
> 512 93.03 185.87
> 1024 146.32 201.78
> 2048 213.32 210.93
> 4096 275.48 215.26
> 8192 321.77 217.49
> 16384 349.60 219.26
> 32768 363.59 219.73
> 65536 375.79 219.99
>
> *hmac(sha256)*
> chunk_size(bytes) MB/sec(afalg:sha256-ssse3) MB/sec(nettle)
> 512 71.26 165.55
> 1024 117.43 189.15
> 2048 180.96 203.24
> 4096 247.60 211.38
> 8192 301.99 215.65
> 16384 340.79 218.22
> 32768 365.51 219.49
> 65536 377.92 220.24
>
> *cbc(aes128)*
> chunk_size(bytes) MB/sec(afalg:cbc-aes-aesni) MB/sec(nettle)
> 512 371.76 188.41
> 1024 559.86 189.64
> 2048 768.66 192.11
> 4096 939.15 192.40
> 8192 1029.48 192.49
> 16384 1072.79 190.52
> 32768 1109.38 190.41
> 65536 1102.38 190.40
So I've attempted to replicate these results, and I see a totally
different outcome. NB, I hacked your code so that setting
QEMU_DISABLE_AF_ALG=1 would skip the af-alg impl. The results
I get are:
$ tests/benchmark-crypto-hash --quiet
sha256: Testing chunk_size 512 bytes done: 197.31 MB in 5.00 secs: 39.46 MB/sec
sha256: Testing chunk_size 1024 bytes done: 337.03 MB in 5.00 secs: 67.41 MB/sec
sha256: Testing chunk_size 2048 bytes done: 516.27 MB in 5.00 secs: 103.25 MB/sec
sha256: Testing chunk_size 4096 bytes done: 675.18 MB in 5.00 secs: 135.04 MB/sec
sha256: Testing chunk_size 8192 bytes done: 837.73 MB in 5.00 secs: 167.55 MB/sec
sha256: Testing chunk_size 16384 bytes done: 946.78 MB in 5.00 secs: 189.35 MB/sec
sha256: Testing chunk_size 32768 bytes done: 1008.56 MB in 5.00 secs: 201.71 MB/sec
sha256: Testing chunk_size 65536 bytes done: 1037.19 MB in 5.00 secs: 207.43 MB/sec
$ QEMU_DISABLE_AF_ALG=1 tests/benchmark-crypto-hash --quiet
sha256: Testing chunk_size 512 bytes done: 1099.92 MB in 5.00 secs: 219.98 MB/sec
sha256: Testing chunk_size 1024 bytes done: 1223.40 MB in 5.00 secs: 244.68 MB/sec
sha256: Testing chunk_size 2048 bytes done: 1304.04 MB in 5.00 secs: 260.81 MB/sec
sha256: Testing chunk_size 4096 bytes done: 1339.29 MB in 5.00 secs: 267.86 MB/sec
sha256: Testing chunk_size 8192 bytes done: 1359.68 MB in 5.00 secs: 271.94 MB/sec
sha256: Testing chunk_size 16384 bytes done: 1363.58 MB in 5.00 secs: 272.71 MB/sec
sha256: Testing chunk_size 32768 bytes done: 1364.66 MB in 5.00 secs: 272.93 MB/sec
sha256: Testing chunk_size 65536 bytes done: 1326.56 MB in 5.00 secs: 265.30 MB/sec
==> AF_ALG is slower in every case, by as much as x5.6 (at chunk_size 512)
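The QEMU_DISABLE_AF_ALG hack itself is not shown in this mail. As a rough sketch only, an environment-variable check of that kind could look like the following; the helper name env_flag_set() and the rule that only the exact value "1" counts are assumptions for illustration, not the actual change:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch series): report whether the
 * named environment variable is set to exactly "1".
 */
static bool env_flag_set(const char *name)
{
    const char *val;

    assert(name != NULL);
    val = getenv(name);
    return val != NULL && strcmp(val, "1") == 0;
}
```

A backend constructor could then call env_flag_set("QEMU_DISABLE_AF_ALG") before attempting AF_ALG and go straight to the library implementation when it returns true.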
$ tests/benchmark-crypto-hmac --quiet
hmac(sha256): Testing chunk_size 512 bytes done: 173.83 MB in 5.00 secs: 34.77 MB/sec
hmac(sha256): Testing chunk_size 1024 bytes done: 302.32 MB in 5.00 secs: 60.46 MB/sec
hmac(sha256): Testing chunk_size 2048 bytes done: 469.93 MB in 5.00 secs: 93.99 MB/sec
hmac(sha256): Testing chunk_size 4096 bytes done: 648.27 MB in 5.00 secs: 129.65 MB/sec
hmac(sha256): Testing chunk_size 8192 bytes done: 800.80 MB in 5.00 secs: 160.16 MB/sec
hmac(sha256): Testing chunk_size 16384 bytes done: 887.09 MB in 5.00 secs: 177.42 MB/sec
hmac(sha256): Testing chunk_size 32768 bytes done: 932.09 MB in 5.00 secs: 186.41 MB/sec
hmac(sha256): Testing chunk_size 65536 bytes done: 1013.25 MB in 5.00 secs: 202.64 MB/sec
$ QEMU_DISABLE_AF_ALG=1 tests/benchmark-crypto-hmac --quiet
hmac(sha256): Testing chunk_size 512 bytes done: 751.36 MB in 5.00 secs: 150.27 MB/sec
hmac(sha256): Testing chunk_size 1024 bytes done: 961.43 MB in 5.00 secs: 192.29 MB/sec
hmac(sha256): Testing chunk_size 2048 bytes done: 1110.92 MB in 5.00 secs: 222.18 MB/sec
hmac(sha256): Testing chunk_size 4096 bytes done: 1225.78 MB in 5.00 secs: 245.16 MB/sec
hmac(sha256): Testing chunk_size 8192 bytes done: 1300.52 MB in 5.00 secs: 260.10 MB/sec
hmac(sha256): Testing chunk_size 16384 bytes done: 1327.00 MB in 5.00 secs: 265.40 MB/sec
hmac(sha256): Testing chunk_size 32768 bytes done: 1345.72 MB in 5.00 secs: 269.14 MB/sec
hmac(sha256): Testing chunk_size 65536 bytes done: 1348.50 MB in 5.00 secs: 269.69 MB/sec
==> AF_ALG is slower in every case, by as much as x4
$ tests/benchmark-crypto-cipher --quiet
cbc(aes128): Testing chunk_size 512 bytes done: 1571.74 MB in 5.00 secs: 314.35 MB/sec
cbc(aes128): Testing chunk_size 1024 bytes done: 2436.54 MB in 5.00 secs: 487.31 MB/sec
cbc(aes128): Testing chunk_size 2048 bytes done: 3412.53 MB in 5.00 secs: 682.50 MB/sec
cbc(aes128): Testing chunk_size 4096 bytes done: 4307.00 MB in 5.00 secs: 861.40 MB/sec
cbc(aes128): Testing chunk_size 8192 bytes done: 4854.20 MB in 5.00 secs: 970.84 MB/sec
cbc(aes128): Testing chunk_size 16384 bytes done: 5180.72 MB in 5.00 secs: 1036.14 MB/sec
cbc(aes128): Testing chunk_size 32768 bytes done: 5390.25 MB in 5.00 secs: 1078.05 MB/sec
cbc(aes128): Testing chunk_size 65536 bytes done: 5427.94 MB in 5.00 secs: 1085.59 MB/sec
$ QEMU_DISABLE_AF_ALG=1 tests/benchmark-crypto-cipher --quiet
cbc(aes128): Testing chunk_size 512 bytes done: 4204.65 MB in 5.00 secs: 840.93 MB/sec
cbc(aes128): Testing chunk_size 1024 bytes done: 4362.01 MB in 5.00 secs: 872.40 MB/sec
cbc(aes128): Testing chunk_size 2048 bytes done: 4347.91 MB in 5.00 secs: 869.58 MB/sec
cbc(aes128): Testing chunk_size 4096 bytes done: 4432.54 MB in 5.00 secs: 886.51 MB/sec
cbc(aes128): Testing chunk_size 8192 bytes done: 4416.47 MB in 5.00 secs: 883.29 MB/sec
cbc(aes128): Testing chunk_size 16384 bytes done: 4469.45 MB in 5.00 secs: 893.89 MB/sec
cbc(aes128): Testing chunk_size 32768 bytes done: 4454.56 MB in 5.00 secs: 890.91 MB/sec
cbc(aes128): Testing chunk_size 65536 bytes done: 4518.50 MB in 5.00 secs: 903.70 MB/sec
==> AF_ALG is slower until chunk_size is 8192 or larger.
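These comparisons can be checked mechanically from the MB/sec figures reported above. The sketch below copies the sha256 and cbc(aes128) numbers from this mail into arrays; the function and array names are made up for illustration. It computes the largest slowdown factor and the first chunk size at which AF_ALG overtakes the library backend:

```c
#include <assert.h>
#include <stddef.h>

#define N_CHUNKS 8

/* Chunk sizes in bytes and throughput in MB/sec, copied from the runs above. */
static const int chunk_sizes[N_CHUNKS] = {
    512, 1024, 2048, 4096, 8192, 16384, 32768, 65536
};
static const double sha256_afalg[N_CHUNKS] = {
    39.46, 67.41, 103.25, 135.04, 167.55, 189.35, 201.71, 207.43
};
static const double sha256_lib[N_CHUNKS] = {
    219.98, 244.68, 260.81, 267.86, 271.94, 272.71, 272.93, 265.30
};
static const double cipher_afalg[N_CHUNKS] = {
    314.35, 487.31, 682.50, 861.40, 970.84, 1036.14, 1078.05, 1085.59
};
static const double cipher_lib[N_CHUNKS] = {
    840.93, 872.40, 869.58, 886.51, 883.29, 893.89, 890.91, 903.70
};

/* Largest library/AF_ALG throughput ratio across all chunk sizes. */
static double worst_slowdown(const double *afalg, const double *lib, size_t n)
{
    double worst = 0.0;

    for (size_t i = 0; i < n; i++) {
        assert(afalg[i] > 0.0);
        double ratio = lib[i] / afalg[i];
        if (ratio > worst) {
            worst = ratio;
        }
    }
    return worst;
}

/* Smallest chunk size at which AF_ALG is faster, or 0 if it never is. */
static int crossover_chunk(const double *afalg, const double *lib,
                           const int *sizes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (afalg[i] > lib[i]) {
            return sizes[i];
        }
    }
    return 0;
}
```

With these inputs, worst_slowdown(sha256_afalg, sha256_lib, N_CHUNKS) is about 5.57 (the 512-byte case), and crossover_chunk(cipher_afalg, cipher_lib, chunk_sizes, N_CHUNKS) returns 8192; for sha256 there is no crossover, so it returns 0.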
I of course don't have the same CPU as you, but it is a representative
current model: Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz.
I can, however, imagine that there are scenarios where this is faster,
particularly if using this in an embedded scenario with a relatively
low perf main CPU, but a hardware accelerator available.
Based on this though, I'm very reluctant to enable AF_ALG by default
when building QEMU, because I think it'll likely cause a major perf
regression for the common case of people with fast CPUs and no
hardware accelerator.
I think in the immediate term we should add a switch to configure,
--enable-crypto-afalg, that must be opt-in when building QEMU,
so those people who know they have a good hardware accelerator
present can use it, but in the general case we avoid it.
For the general case, I think we need to figure out how to make
direct use of CPU instructions for crypto, eg Intel aesni. This
might be possible by using GNUTLS for ciphers (though it lacks
coverage for all the combinations we want).
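As a small aside on detecting such instructions: with GCC or Clang on x86, the compiler builtin __builtin_cpu_supports() can report whether the running CPU has AES-NI. The wrapper name cpu_has_aesni() below is hypothetical, and this is only a sketch of runtime detection, not of how QEMU or GNUTLS actually dispatch:

```c
#include <stdbool.h>

/*
 * Hypothetical wrapper: true if the running CPU advertises AES-NI.
 * Relies on GCC/Clang x86 builtins; on other targets or compilers it
 * conservatively reports false.
 */
static bool cpu_has_aesni(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("aes") != 0;
#else
    return false;
#endif
}
```

A caller could use the result to choose between an accelerated code path and a portable one at startup.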
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|