Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization
From: Li, Liang Z
Subject: Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization
Date: Thu, 12 Nov 2015 08:53:32 +0000
> On 12/11/2015 03:49, Li, Liang Z wrote:
> > I am very surprised by the live migration performance result when
> > I use your 'memeqzero4_paolo' instead of these SSE2 intrinsics to
> > check for zero pages.
>
> What code were you using? Remember I suggested using only unsigned long
> checks, like
>
>     unsigned long *p = ...
>     if (p[0] || p[1] || p[2] || p[3]
>         || memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
>         return BUFFER_NOT_ZERO;
>     else
>         return BUFFER_ZERO;
>
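For reference, the quoted fragment can be written out as a complete function. This is only a minimal sketch under stated assumptions: the name buffer_is_zero_ul is hypothetical (it is not a QEMU function), the buffer is assumed to be aligned for unsigned long, and size is assumed to be at least 4 * sizeof(unsigned long), as it is for a whole page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of QEMU.
 * Assumptions: buf is aligned for unsigned long and
 * size >= 4 * sizeof(unsigned long).
 */
static bool buffer_is_zero_ul(const void *buf, size_t size)
{
    const unsigned long *p = buf;

    /* Cheap initial check: most non-zero pages are rejected here. */
    if (p[0] || p[1] || p[2] || p[3]) {
        return false;
    }

    /*
     * The first four words are zero, so comparing the buffer with itself
     * shifted by four words verifies every remaining byte.
     */
    return memcmp(p + 4, p, size - 4 * sizeof(unsigned long)) == 0;
}
```

The early return is the part that matters for non-zero pages: it avoids the memcmp call entirely when any of the first four words is non-zero.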
I used the following code:

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

bool memeqzero4_paolo(const void *data, size_t length)
{
    const unsigned char *p = data;
    unsigned long word;

    if (!length)
        return true;

    /* Check one byte at a time until the remaining length is a
     * multiple of the word size. */
    while (__builtin_expect(length & (sizeof(word) - 1), 0)) {
        if (*p)
            return false;
        p++;
        length--;
        if (!length)
            return true;
    }

    /* Check a word at a time until the remaining length is a
     * multiple of 16. */
    for (;;) {
        memcpy(&word, p, sizeof(word));
        if (word)
            return false;
        p += sizeof(word);
        length -= sizeof(word);
        if (!length)
            return true;
        if (__builtin_expect(length & 15, 0) == 0)
            break;
    }

    /* The bytes before p are known to be zero, so comparing the buffer
     * with itself shifted by (p - data) checks the rest. */
    return memcmp(data, p, length) == 0;
}
> > The total live migration time increased by about
> > 8% instead of decreasing, although in the unit test your
> > 'memeqzero4_paolo' has better performance. Any idea?
>
> You only tested the case of zero pages. But real pages usually are not zero,
> even if they have a few zero bytes at the beginning. It's very important to
> optimize the initial check before the memcmp call.
>
In the unit test I also tested only zero pages, and there
'memeqzero4_paolo' performs better.
But once merged into QEMU, it caused a performance drop. Why?
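A unit test that also covers non-zero pages might look roughly like the sketch below. All names here (TEST_PAGE_SIZE, zero_check_fn, is_zero_reference, make_test_page, count_mismatches) are hypothetical and the 4096-byte page size is an assumption. It only checks correctness against a byte-by-byte reference; the same page set could then be timed to see how each implementation behaves when the first non-zero byte comes early.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TEST_PAGE_SIZE 4096  /* assumed page size */

/* Signature shared by the zero-check implementations under test. */
typedef bool (*zero_check_fn)(const void *data, size_t length);

/* Byte-by-byte reference check, used only to validate faster versions. */
static bool is_zero_reference(const void *data, size_t length)
{
    const unsigned char *p = data;
    size_t i;

    for (i = 0; i < length; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

/*
 * Make a page that is zero up to first_nonzero and has one non-zero byte
 * there. Passing first_nonzero >= TEST_PAGE_SIZE yields an all-zero page.
 */
static void make_test_page(unsigned char *page, size_t first_nonzero)
{
    memset(page, 0, TEST_PAGE_SIZE);
    if (first_nonzero < TEST_PAGE_SIZE) {
        page[first_nonzero] = 0xff;
    }
}

/*
 * Run check over one page per entry in offsets[] and return how many
 * results disagree with the reference implementation.
 */
static size_t count_mismatches(zero_check_fn check,
                               const size_t *offsets, size_t n)
{
    /* unsigned long storage keeps the page word-aligned. */
    static unsigned long storage[TEST_PAGE_SIZE / sizeof(unsigned long)];
    unsigned char *page = (unsigned char *)storage;
    size_t i, bad = 0;

    for (i = 0; i < n; i++) {
        make_test_page(page, offsets[i]);
        if (check(page, TEST_PAGE_SIZE) !=
            is_zero_reference(page, TEST_PAGE_SIZE)) {
            bad++;
        }
    }
    return bad;
}
```

Offsets such as 0, 1, 7, 8, 31, 32 and TEST_PAGE_SIZE - 1 exercise the byte loop, the word loop and the memcmp tail separately, and an offset of TEST_PAGE_SIZE gives the all-zero case.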
> Paolo