[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 0/7] Improve buffer_is_zero
From: |
Vijay Kilari |
Subject: |
Re: [Qemu-devel] [PATCH 0/7] Improve buffer_is_zero |
Date: |
Thu, 25 Aug 2016 13:34:36 +0530 |
On Thu, Aug 25, 2016 at 12:07 PM, Vijay Kilari <address@hidden> wrote:
> Hi Richard,
>
> Migration fails on arm64 with these patches.
> On the destination VM, follow errors are appearing.
>
> qemu-system-aarch64: VQ 0 size 0x400 Guest index 0x0 inconsistent with
> Host index 0x1937: delta 0xe6c9
> qemu-system-aarch64: error while loading state for instance 0x0 of
> device 'address@hidden/virtio-net'
> qemu-system-aarch64: load of migration failed: Operation not permitted
> qemu-system-aarch64: network script /etc/qemu-ifdown failed with status 256
With below changes, migration is working fine on arm64.
diff --git a/util/cutils.c b/util/cutils.c
index 30fac02..9bbf31f 100644
--- a/util/cutils.c
+++ b/util/cutils.c
@@ -170,6 +170,7 @@ static bool __attribute__((noinline))
\
NAME(const void *buf, size_t len) \
{ \
const void *end = buf + len; \
+ const VECTYPE zero = (VECTYPE){0}; \
do { \
const VECTYPE *p = buf; \
VECTYPE t; \
@@ -185,7 +186,7 @@ NAME(const void *buf, size_t len)
\
} else { \
link_error(); \
} \
- if (unlikely(!ZERO(t))) { \
+ if (unlikely(!ZERO(t, zero))) { \
return false; \
} \
buf += SIZE; \
@@ -227,7 +228,7 @@ buffer_zero_base(const void *buf, size_t len)
return true;
}
-#define IDENT_ZERO(X) (X)
+#define IDENT_ZERO(X1, X2) (X1 == X2)
ACCEL_BUFFER_ZERO(buffer_zero_int, 4*sizeof(long), long, IDENT_ZERO)
static bool select_accel_int(const void *buf, size_t len)
@@ -511,7 +512,9 @@ static bool select_accel_fn(const void *buf, size_t len)
#elif defined(__aarch64__)
#include "arm_neon.h"
-#define DO_ZERO(X) (vgetq_lane_u64((X), 0) | vgetq_lane_u64((X), 1))
+#define DO_ZERO(X1, X2) \
+ ((vgetq_lane_u64(X1, 0) == vgetq_lane_u64(X2, 0)) && \
+ (vgetq_lane_u64(X1, 1) == vgetq_lane_u64(X2, 1)))
ACCEL_BUFFER_ZERO(buffer_zero_neon_64, 64, uint64x2_t, DO_ZERO)
ACCEL_BUFFER_ZERO(buffer_zero_neon_128, 128, uint64x2_t, DO_ZERO)
@@ -526,7 +529,7 @@ static void __attribute__((constructor))
init_buffer_zero_accel(void)
since the later is not available to userspace. This seems
to work in practice for existing implementations. */
asm("mrs %0, dczid_el0" : "=r"(t));
- if ((t & 15) * 16 >= 128) {
+ if (pow(2, (t & 0xf)) * 4 >= 128) {
buffer_zero_line_mask = 128 - 1;
buffer_zero_accel = buffer_zero_neon_128;
} else {
>
> Regards
> Vijay
>
>
> On Wed, Aug 24, 2016 at 9:47 AM, Richard Henderson <address@hidden> wrote:
>> Patches 1-3 remove the use of ifunc from the implementation.
>>
>> Patch 5 adjusts the x86 implementation a bit more to take
>> advantage of ptest (in sse4.1) and unaligned accesses (in avx1).
>>
>> Patches 2 and 6 are the result of my conversation with Vijaya
>> Kumar with respect to ThunderX.
>>
>> Patch 7 is the result of seeing some really really horrible code
>> produced for ppc64le (gcc 4.9 and mainline).
>>
>> This has had limited testing. What I don't know is the best way
>> to benchmark this -- the only way I know to trigger this is via
>> the console, by hand, which doesn't make for reasonable timing.
>>
>>
>> r~
>>
>>
>> Richard Henderson (7):
>> cutils: Remove SPLAT macro
>> cutils: Export only buffer_is_zero
>> cutils: Rearrange buffer_is_zero acceleration
>> cutils: Add generic prefetch
>> cutils: Rewrite x86 buffer zero checking
>> cutils: Rewrite aarch64 buffer zero checking
>> cutils: Rewrite ppc buffer zero checking
>>
>> configure | 21 +-
>> include/qemu/cutils.h | 2 -
>> migration/ram.c | 2 +-
>> migration/rdma.c | 5 +-
>> util/cutils.c | 526
>> +++++++++++++++++++++++++++++++++-----------------
>> 5 files changed, 352 insertions(+), 204 deletions(-)
>>
>> --
>> 2.7.4
>>
Re: [Qemu-devel] [PATCH 0/7] Improve buffer_is_zero, Vijay Kilari, 2016/08/25
- Re: [Qemu-devel] [PATCH 0/7] Improve buffer_is_zero,
Vijay Kilari <=