[Qemu-devel] [PULL 57/58] cutils: Add SSE4 version
From: Paolo Bonzini
Subject: [Qemu-devel] [PULL 57/58] cutils: Add SSE4 version
Date: Tue, 13 Sep 2016 19:16:28 +0200
Signed-off-by: Paolo Bonzini <address@hidden>
---
util/bufferiszero.c | 10 ++++++++++
1 file changed, 10 insertions(+)
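Note for readers unfamiliar with the intrinsic: SSE4_NONZERO(X) relies on
_mm_testz_si128(X, X), which returns 1 exactly when X & X (that is, X itself)
is all zero, so its negation is true as soon as any bit is set. The sketch
below illustrates that idea in isolation. It is not the expansion of
ACCEL_BUFFER_ZERO, which this patch does not show; the names
sketch_zero_scalar, sketch_zero_sse4 and sketch_buffer_is_zero are
hypothetical. It assumes GCC or Clang, uses a per-function target attribute
rather than the #pragma used in the patch, and falls back to a byte loop on
non-x86 hosts or when the length/alignment preconditions do not hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

/* Portable fallback: scan one byte at a time. */
static bool sketch_zero_scalar(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    for (size_t i = 0; i < len; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

#if SKETCH_X86
/* Hypothetical helper: requires len % 64 == 0 and a 16-byte aligned buf. */
__attribute__((target("sse4.1")))
static bool sketch_zero_sse4(const void *buf, size_t len)
{
    assert(len % 64 == 0 && (uintptr_t)buf % 16 == 0);
    const __m128i *p = buf;
    for (size_t i = 0; i < len / 16; i += 4) {
        /* OR four 16-byte vectors together, then test the result once. */
        __m128i t = _mm_or_si128(
            _mm_or_si128(_mm_load_si128(p + i), _mm_load_si128(p + i + 1)),
            _mm_or_si128(_mm_load_si128(p + i + 2), _mm_load_si128(p + i + 3)));
        /* Same shape as SSE4_NONZERO: testz(t, t) is 1 only if t == 0. */
        if (!_mm_testz_si128(t, t)) {
            return false;
        }
    }
    return true;
}
#endif

/* Use the SSE4.1 path only when the CPU and the buffer both allow it. */
static bool sketch_buffer_is_zero(const void *buf, size_t len)
{
#if SKETCH_X86
    if (len % 64 == 0 && (uintptr_t)buf % 16 == 0 &&
        __builtin_cpu_supports("sse4.1")) {
        return sketch_zero_sse4(buf, len);
    }
#endif
    return sketch_zero_scalar(buf, len);
}
```

The OR-then-test grouping is only one way to fill a 64-byte stride; the
actual unrolling is whatever ACCEL_BUFFER_ZERO does.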
diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index 4af3caa..bafd3d1 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -114,6 +114,13 @@ ACCEL_BUFFER_ZERO(buffer_zero_sse2, 64, __m128i, SSE2_NONZERO)
 
 #ifdef CONFIG_AVX2_OPT
 #pragma GCC push_options
+#pragma GCC target("sse4")
+#include <smmintrin.h>
+#define SSE4_NONZERO(X) !_mm_testz_si128((X), (X))
+ACCEL_BUFFER_ZERO(buffer_zero_sse4, 64, __m128i, SSE4_NONZERO)
+#pragma GCC pop_options
+
+#pragma GCC push_options
 #pragma GCC target("avx2")
 #include <immintrin.h>
 #define AVX2_NONZERO(X) !_mm256_testz_si256((X), (X))
@@ -182,6 +189,9 @@ static bool select_accel_fn(const void *buf, size_t len)
     if (len % 128 == 0 && ibuf % 32 == 0 && (cpuid_cache & CACHE_AVX2)) {
         return buffer_zero_avx2(buf, len);
     }
+    if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE4)) {
+        return buffer_zero_sse4(buf, len);
+    }
 #endif
     if (len % 64 == 0 && ibuf % 16 == 0 && (cpuid_cache & CACHE_SSE2)) {
         return buffer_zero_sse2(buf, len);
--
1.8.3.1
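
The second hunk slots the SSE4 check between AVX2 and SSE2, so the widest
usable routine wins: AVX2 needs a length that is a multiple of 128 and a
32-byte aligned pointer, while SSE4 and SSE2 both need a multiple of 64 and
16-byte alignment. The ordering can be sketched as a pure function, shown
below. The SKETCH_* names and bit values are hypothetical (the real CACHE_*
values are not visible in this patch), the "generic" result stands in for
whatever follows the SSE2 branch in select_accel_fn, and the
CONFIG_AVX2_OPT guard is omitted for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder feature bits; the real CACHE_* values are not shown here. */
enum {
    SKETCH_CACHE_SSE2 = 1 << 0,
    SKETCH_CACHE_SSE4 = 1 << 1,
    SKETCH_CACHE_AVX2 = 1 << 2,
};

typedef enum {
    SKETCH_PICK_GENERIC,  /* assumed fallback, not part of the diff */
    SKETCH_PICK_SSE2,
    SKETCH_PICK_SSE4,
    SKETCH_PICK_AVX2,
} sketch_pick;

/* Mirror the order of the checks in the hunk: AVX2, then SSE4, then SSE2. */
static sketch_pick sketch_select(uintptr_t ibuf, size_t len,
                                 unsigned cpuid_cache)
{
    if (len % 128 == 0 && ibuf % 32 == 0 &&
        (cpuid_cache & SKETCH_CACHE_AVX2)) {
        return SKETCH_PICK_AVX2;
    }
    if (len % 64 == 0 && ibuf % 16 == 0 &&
        (cpuid_cache & SKETCH_CACHE_SSE4)) {
        return SKETCH_PICK_SSE4;
    }
    if (len % 64 == 0 && ibuf % 16 == 0 &&
        (cpuid_cache & SKETCH_CACHE_SSE2)) {
        return SKETCH_PICK_SSE2;
    }
    return SKETCH_PICK_GENERIC;
}
```

For example, on a host with all three bits set, a 256-byte buffer at an
address that is 16-byte but not 32-byte aligned would take the SSE4 path
under this ordering, where it previously fell through to SSE2.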