From: Pádraig Brady
Subject: Re: [Qemu-devel] [PATCH] copy, dd: simplify and optimize NUL bytes detection
Date: Fri, 23 Oct 2015 12:15:07 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 22/10/15 20:47, Paolo Bonzini wrote:
>
>
> On 22/10/2015 19:39, Radim Krčmář wrote:
>> 2015-10-22 18:14+0200, Paolo Bonzini:
>>> On 22/10/2015 18:02, Eric Blake wrote:
>>>> I see a bug in there:
>>>
>>> Of course. You shouldn't have told me what the bug was, I deserved
>>> to look for it myself. :)
>>
>> It rather seems that you don't want spoilers, :)
>>
>> I see two bugs now.
>
> Me too. :) But Rusty surely has some testcases in case he wants to
> adopt some of the ideas here. O:-)
For completeness, I think this should address the bugs:

bool memeqzero4_paolo(const void *data, size_t length)
{
    const unsigned char *p = data;
    unsigned long word;

    if (!length)
        return true;

    /* Check len bytes not aligned on a word. */
    while (__builtin_expect(length & (sizeof(word) - 1), 0)) {
        if (*p)
            return false;
        p++;
        length--;
        if (!length)
            return true;
    }

    /* Check up to 16 bytes a word at a time. */
    for (;;) {
        memcpy(&word, p, sizeof(word));
        if (word)
            return false;
        p += sizeof(word);
        length -= sizeof(word);
        if (!length)
            return true;
        if (__builtin_expect(length & 15, 0) == 0)
            break;
    }

    /* Now we know that's zero, memcmp with self. */
    return memcmp(data, p, length) == 0;
}

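Since the thread has already turned up bugs in earlier versions, a brute-force comparison against a plain byte loop seems worth having. The following is only a sketch: memeqzero_naive and memeqzero_selftest are names made up here, and the function from this mail is repeated so the block compiles on its own (it assumes GCC or Clang for __builtin_expect). It tries every length up to a limit, first all zero and then with one nonzero byte planted at each position. That also exercises the final memcmp-with-self step, which works because the bytes before p are already known to be zero, so comparing data against p propagates that check through the rest of the buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Reference implementation: a plain byte loop (hypothetical helper). */
static bool memeqzero_naive(const void *data, size_t length)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < length; i++)
        if (p[i])
            return false;
    return true;
}

/* The function from this mail, repeated so the block compiles alone. */
bool memeqzero4_paolo(const void *data, size_t length)
{
    const unsigned char *p = data;
    unsigned long word;

    if (!length)
        return true;

    /* Check len bytes not aligned on a word. */
    while (__builtin_expect(length & (sizeof(word) - 1), 0)) {
        if (*p)
            return false;
        p++;
        length--;
        if (!length)
            return true;
    }

    /* Check up to 16 bytes a word at a time. */
    for (;;) {
        memcpy(&word, p, sizeof(word));
        if (word)
            return false;
        p += sizeof(word);
        length -= sizeof(word);
        if (!length)
            return true;
        if (__builtin_expect(length & 15, 0) == 0)
            break;
    }

    /* Now we know that's zero, memcmp with self. */
    return memcmp(data, p, length) == 0;
}

/*
 * Hypothetical self-test: for each length in 0..max_len (capped at the
 * buffer size), compare against the byte loop for an all-zero buffer
 * and for a buffer with one nonzero byte at each position.
 * Returns true if the two implementations always agree.
 */
bool memeqzero_selftest(size_t max_len)
{
    unsigned char buf[512] = {0};

    if (max_len > sizeof(buf))
        max_len = sizeof(buf);

    for (size_t len = 0; len <= max_len; len++) {
        if (memeqzero4_paolo(buf, len) != memeqzero_naive(buf, len))
            return false;
        for (size_t pos = 0; pos < len; pos++) {
            buf[pos] = 0xff;            /* plant one nonzero byte */
            bool same = memeqzero4_paolo(buf, len) == memeqzero_naive(buf, len);
            buf[pos] = 0;               /* restore before the next round */
            if (!same)
                return false;
        }
    }
    return true;
}
```

Calling memeqzero_selftest(300) from a small test program covers the unaligned head, the word loop, and the memcmp tail for both 4-byte and 8-byte unsigned long.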
Compiled with gcc 5.1.1 -march=native -O2 on an i3-2310M,
we get these timings:

  bytes       1     8    16   512   65536
  ---------------------------------------
  Rusty:     10    28    59   114    6510
  Paolo:      9     9    12    75    6495

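The mail does not say how these timings were taken or what unit they are in. For anyone wanting rough numbers of their own, one possible harness is sketched below. It is not the benchmark used above: time_memeqzero, memeqzero_fn, and memeqzero_bytewise are names invented for this sketch, and it reports average CPU time per call using the standard clock() function, so results will not be directly comparable with the table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

typedef bool (*memeqzero_fn)(const void *, size_t);

/* Stand-in implementation so the block compiles alone (hypothetical). */
static bool memeqzero_bytewise(const void *data, size_t length)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < length; i++)
        if (p[i])
            return false;
    return true;
}

/*
 * Hypothetical helper: average CPU time in nanoseconds for one call of
 * fn on an all-zero buffer of the given length. Returns -1.0 if iters
 * is 0 or the buffer cannot be allocated.
 */
double time_memeqzero(memeqzero_fn fn, size_t length, unsigned long iters)
{
    unsigned char *buf;
    volatile bool sink = true;
    clock_t start, end;

    if (!iters)
        return -1.0;
    buf = calloc(length ? length : 1, 1);
    if (!buf)
        return -1.0;

    start = clock();
    for (unsigned long i = 0; i < iters; i++) {
#if defined(__GNUC__)
        /* Compiler barrier so the call is not hoisted out of the loop. */
        __asm__ __volatile__("" : : "r"(buf) : "memory");
#endif
        sink = fn(buf, length);
    }
    end = clock();

    (void)sink;
    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)iters;
}
```

Passing each candidate function in turn with lengths such as 1, 8, 16, 512 and 65536 would give one row per implementation. clock() is coarse, so iters needs to be large for the small sizes.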
It's also smaller, especially at -O3:

  $ nm -S a.out | grep memeqzero4
  ... 000000000000005b t memeqzero4_paolo
  ... 0000000000000063 t memeqzero4_rusty
  $ gcc -march=native -O3 memeqzero.c
  $ nm -S a.out | grep memeqzero4
  ... 000000000000005b t memeqzero4_paolo
  ... 0000000000000133 t memeqzero4_rusty

cheers,
Pádraig.