Re: [Qemu-block] [PATCH 14/17] block: optimize access to reqs_lock

From: Paolo Bonzini
Subject: Re: [Qemu-block] [PATCH 14/17] block: optimize access to reqs_lock
Date: Fri, 5 May 2017 12:45:38 +0200

On 05/05/2017 12:25, Stefan Hajnoczi wrote:
> On Thu, May 04, 2017 at 06:06:39PM +0200, Paolo Bonzini wrote:
>> On 04/05/2017 16:59, Stefan Hajnoczi wrote:
>>> On Thu, Apr 20, 2017 at 02:00:55PM +0200, Paolo Bonzini wrote:
>>>> Hot path reqs_lock critical sections are very small; the only large
>>>> critical sections happen when a request waits for serialising requests,
>>>> and these should never happen in usual circumstances.
>>>> We do not want these small critical sections to yield in any case,
>>>> which calls for using a spinlock while writing the list.
>>> Is this patch purely an optimization?
>> Yes, it is, and pretty much a no-op until we have true multiqueue.  But
>> I expect it to have a significant effect for multiqueue.
>>> I'm hesitant about using spinlocks in userspace.  There are cases where
>>> the thread is descheduled that are beyond our control.  Nested virt will
>>> probably make things worse.  People have been optimizing and trying
>>> paravirt approaches to kernel spinlocks for these reasons for years.
>> This is true, but here we're talking about a 5-10 instruction window for
>> preemption; it matches the usage of spinlocks in other parts of QEMU.
> Only util/qht.c uses spinlocks, it's not a widely used primitive.

Right, but the idea is the same---very short, heavy and
performance-critical cases use spinlocks.  (util/qht.c is used heavily
in TCG mode).
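
To make the "very short critical section" case concrete, here is a minimal sketch of a list insertion guarded by a spinlock. It is written with C11 atomics so that it stands alone; the names Req, ReqList, req_list_init and req_list_insert are hypothetical and are not the types or helpers used by the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types for illustration only, not QEMU's. */
typedef struct Req {
    struct Req *next;
} Req;

typedef struct ReqList {
    atomic_flag lock;   /* clear means unlocked */
    Req *head;
} ReqList;

static void req_list_init(ReqList *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->lock);
    l->head = NULL;
}

static void req_list_insert(ReqList *l, Req *r)
{
    assert(l != NULL && r != NULL);

    /* Busy-wait until the flag is acquired; the thread never sleeps here. */
    while (atomic_flag_test_and_set_explicit(&l->lock, memory_order_acquire)) {
        /* spin */
    }

    /* The whole critical section is two pointer stores. */
    r->next = l->head;
    l->head = r;

    atomic_flag_clear_explicit(&l->lock, memory_order_release);
}
```

The trade-off discussed in this thread is visible in the loop: a waiter burns CPU instead of yielding, which is only reasonable when the holder is expected to release the lock within a handful of instructions.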

>> It is efficient when there is no contention, but when there is, the
>> latency goes up by several orders of magnitude.
> Doesn't glibc spin for a while before waiting on the futex?  i.e. the
> best of both worlds.

You have to specify that manually with pthread_mutexattr_settype(...,
PTHREAD_MUTEX_ADAPTIVE_NP).  It is not enabled by default because IIUC
the adaptive one doesn't support pthread_mutex_timedlock.
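
For reference, a minimal sketch of requesting the adaptive type explicitly. It assumes glibc, since PTHREAD_MUTEX_ADAPTIVE_NP is a non-portable extension, and the helper name adaptive_mutex_init is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* assumption: glibc; the _NP type is an extension */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Hypothetical helper: initialise *m as an adaptive mutex.
 * Returns 0 on success or the error number from the failing pthread call.
 */
static int adaptive_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int ret;

    assert(m != NULL);

    ret = pthread_mutexattr_init(&attr);
    if (ret != 0) {
        return ret;
    }

    /* Ask for a short spin before the lock falls back to blocking. */
    ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);
    if (ret == 0) {
        ret = pthread_mutex_init(m, &attr);
    }

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return ret;
}
```

After initialisation the mutex is used with the ordinary pthread_mutex_lock and pthread_mutex_unlock calls; only the contended path differs.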


