Re: [Qemu-devel] [PATCH v3 2/5] util: introduce threaded workqueue
From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH v3 2/5] util: introduce threaded workqueue
Date: Mon, 26 Nov 2018 13:49:19 -0500
User-agent: Mutt/1.9.4 (2018-02-28)
On Mon, Nov 26, 2018 at 16:06:37 +0800, Xiao Guangrong wrote:
> > > + /* after the user fills the request, the bit is flipped. */
> > > + uint64_t request_fill_bitmap QEMU_ALIGNED(SMP_CACHE_BYTES);
> > > + /* after handles the request, the thread flips the bit. */
> > > + uint64_t request_done_bitmap QEMU_ALIGNED(SMP_CACHE_BYTES);
> >
> > Use DECLARE_BITMAP, otherwise you'll get type errors as David
> > pointed out.
>
> If we do it, the field becomes a pointer... that complicates the
> thing.
Not necessarily, see below.
On Mon, Nov 26, 2018 at 16:18:24 +0800, Xiao Guangrong wrote:
> On 11/24/18 8:17 AM, Emilio G. Cota wrote:
> > On Thu, Nov 22, 2018 at 15:20:25 +0800, address@hidden wrote:
> > > +static uint64_t get_free_request_bitmap(Threads *threads, ThreadLocal *thread)
> > > +{
> > > + uint64_t request_fill_bitmap, request_done_bitmap, result_bitmap;
> > > +
> > > + request_fill_bitmap = atomic_rcu_read(&thread->request_fill_bitmap);
> > > + request_done_bitmap = atomic_rcu_read(&thread->request_done_bitmap);
> > > + bitmap_xor(&result_bitmap, &request_fill_bitmap, &request_done_bitmap,
> > > + threads->thread_requests_nr);
> >
> > This is not wrong, but it's a bit ugly. Instead, I would:
> >
> > - Introduce bitmap_xor_atomic in a previous patch
> > - Use bitmap_xor_atomic here, getting rid of the rcu reads
>
> Hmm, however, we do not need atomic xor operation here... that should be
> slower than just two READ_ONCE calls.
If you use DECLARE_BITMAP, you get an in-place array, not a pointer.
On a 64-bit host, that'd be

    unsigned long foo[1]; /* [2] on 32-bit */
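To make the "in-place array" point concrete, here is a minimal sketch. The macros are local stand-ins modelled on QEMU's bitmap helpers, and ThreadLocalSketch, MAX_THREAD_REQUESTS and thread_local_sketch_reset are hypothetical names used only for illustration; the cache-line alignment from the patch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-ins modelled on QEMU's bitmap helpers (assumption). */
#define BITS_PER_BYTE 8
#define BITS_PER_LONG (sizeof(unsigned long) * BITS_PER_BYTE)
#define BITS_TO_LONGS(nr) (((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]

/* Hypothetical limit, chosen only for this example. */
#define MAX_THREAD_REQUESTS 64

/* Hypothetical cut-down version of the per-thread state. */
typedef struct ThreadLocalSketch {
    /* Both fields are arrays stored inside the struct, not pointers. */
    DECLARE_BITMAP(request_fill_bitmap, MAX_THREAD_REQUESTS);
    DECLARE_BITMAP(request_done_bitmap, MAX_THREAD_REQUESTS);
} ThreadLocalSketch;

/* Clear both bitmaps; sizeof() sees the whole array, so no allocation. */
static void thread_local_sketch_reset(ThreadLocalSketch *t)
{
    assert(t != NULL);
    memset(t->request_fill_bitmap, 0, sizeof(t->request_fill_bitmap));
    memset(t->request_done_bitmap, 0, sizeof(t->request_done_bitmap));
}
```

Since the storage lives in the struct itself, nothing has to be allocated or freed separately; the field just decays to unsigned long * when passed to the bitmap helpers.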
Then again on 64-bit hosts, bitmap_xor_atomic would reduce
to 2 atomic reads:
static inline void bitmap_xor_atomic(unsigned long *dst,
        const unsigned long *src1, const unsigned long *src2, long nbits)
{
    if (small_nbits(nbits)) {
        *dst = atomic_read(src1) ^ atomic_read(src2);
    } else {
        slow_bitmap_xor_atomic(dst, src1, src2, nbits);
    }
}
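For reference, a self-contained sketch of the same idea that compiles outside the QEMU tree. Everything here is an assumption for illustration: atomic_read is approximated with the GCC/Clang relaxed-load builtin, small_nbits is defined locally, and slow_bitmap_xor_atomic is a hypothetical word-by-word fallback. Note that each word is read atomically, but the pair of reads is not a single atomic snapshot.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((long)(sizeof(unsigned long) * CHAR_BIT))
#define BITS_TO_LONGS(nr) (((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define small_nbits(nbits) ((nbits) <= BITS_PER_LONG)

/* Stand-in for QEMU's atomic_read(): a relaxed atomic load (assumption). */
#define atomic_read(ptr) __atomic_load_n(ptr, __ATOMIC_RELAXED)

/* Hypothetical slow path: one relaxed load per word of each source. */
static void slow_bitmap_xor_atomic(unsigned long *dst,
        const unsigned long *src1, const unsigned long *src2, long nbits)
{
    long i, nr = BITS_TO_LONGS(nbits);

    for (i = 0; i < nr; i++) {
        dst[i] = atomic_read(&src1[i]) ^ atomic_read(&src2[i]);
    }
}

static inline void bitmap_xor_atomic(unsigned long *dst,
        const unsigned long *src1, const unsigned long *src2, long nbits)
{
    assert(nbits > 0);
    if (small_nbits(nbits)) {
        /* Single-word case: exactly two atomic reads and one xor. */
        *dst = atomic_read(src1) ^ atomic_read(src2);
    } else {
        slow_bitmap_xor_atomic(dst, src1, src2, nbits);
    }
}
```

When nbits is a compile-time constant no larger than BITS_PER_LONG, the compiler can drop the slow branch entirely, which is why this should cost no more than the two plain reads.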
So you can either do the above, or just define an unsigned long
instead of a u64 and keep doing what you're doing in this series,
bearing in mind that the maximum number of requests per thread on
32-bit hosts will then be 32. But that's no big deal since those
machines won't have many cores anyway.
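A rough sketch of the second option, with plain unsigned long fields and two atomic reads. The struct and function names are hypothetical, atomic_read is again approximated with the relaxed-load builtin, and the reading of the xor result (a set bit means filled but not yet handled) is my assumption based on the comments in the patch.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((int)(sizeof(unsigned long) * CHAR_BIT))

/* Stand-in for QEMU's atomic_read(): a relaxed atomic load (assumption). */
#define atomic_read(ptr) __atomic_load_n(ptr, __ATOMIC_RELAXED)

/* Hypothetical cut-down per-thread state with word-sized bitmaps. */
typedef struct ThreadLocalSketch {
    unsigned long request_fill_bitmap;
    unsigned long request_done_bitmap;
} ThreadLocalSketch;

/*
 * Bits that differ between the two words. Assumption: the user flips a
 * fill bit when submitting and the thread flips the matching done bit
 * when finished, so a differing bit marks a request still in flight.
 */
static unsigned long pending_request_bitmap(const ThreadLocalSketch *t,
                                            int requests_nr)
{
    unsigned long fill, done, mask;

    /* The per-thread limit is BITS_PER_LONG: 64, or 32 on 32-bit hosts. */
    assert(requests_nr > 0 && requests_nr <= BITS_PER_LONG);

    fill = atomic_read(&t->request_fill_bitmap);
    done = atomic_read(&t->request_done_bitmap);

    /* Avoid shifting by the full word width, which is undefined in C. */
    mask = requests_nr == BITS_PER_LONG ? ~0UL : (1UL << requests_nr) - 1;
    return (fill ^ done) & mask;
}
```

The assert is where the 32-bit limit shows up: the series would need to cap thread_requests_nr at BITS_PER_LONG rather than at 64.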
Emilio