qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] migration: optimize the downtime


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH] migration: optimize the downtime
Date: Mon, 24 Jul 2017 22:38:38 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 24/07/2017 21:03, Dr. David Alan Gilbert wrote:
>> I don't like having such a long-lived mutex (it seems like a recipe for
>> deadlocks with the BQL), plus memory_region_transaction_commit (the
>> expensive part of memory_global_dirty_log_stop) needs to be under the
>> BQL itself because it calls MemoryListeners.
>>
>> Maybe memory_global_dirty_log_stop can delay itself to the next vm_start
>> if it's called while runstate_running() returns false (which should be
>> always the case)?
>>
>> It could even be entirely enclosed within memory.c if you do it with a
>> VMChangeStateHandler.
> This still causes the BQL to be held for quite a while; albeit at a less
> critical point.
> 
> In this and the existing case we don't actually need efficiency - what we 
> need is just
> not to be holding onto the BQL for so long;  could we do a less
> efficient commit here, removing one region at a time, yielding the lock
> and retaking it?

I'm worried that if you release the lock and retake it you open a window
where someone else could start a commit, and
memory_global_dirty_log_stop's commit would use outdated information.

The problem is that a coarse lock, such as the BQL, is very hard to
break into another lock with long critical sections (a "memory map
commit lock" in this case).  It's obviously very easy to say "the new
lock nests inside the BQL"... but then, because you created the lock to
move things out of the BQL, any callback inside the new lock cannot take
the memory map commit lock, and you need to make entire subsystems
thread-safe.  If the new lock nests outside the BQL instead, it's
simpler to avoid deadlocks, but you need to be careful about releasing
the BQL and any reentrancy implications of doing that.

One strategy that worked well so far has been to use two locks, one of
them being the BQL, where you can take either lock for reads and both
for writes.

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]