qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC 0/6] monitor: allow per-monitor thread


From: Peter Xu
Subject: Re: [Qemu-devel] [RFC 0/6] monitor: allow per-monitor thread
Date: Tue, 22 Aug 2017 13:59:03 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Tue, Aug 22, 2017 at 12:15:19PM +0800, Fam Zheng wrote:
> On Tue, 08/22 10:56, Peter Xu wrote:
> > I haven't really encountered (c), but I think it's the migrate_cancel
> > command that matters, which should not need BQL as well.
> 
> There is bdrv_invalidate_cache_all() in migrate_cancel which clearly isn't 
> safe.
> Is that if block unreachable in this case? If so we should assert, otherwise
> this command is not okay to run without BQL.

Ah. I see.  Even if so, if that is the only usage of BQL, IMHO we can
still mark migrate_cancel as "without-bql=true", instead we take the
BQL before calling bdrv_invalidate_cache_all().  Then migrate_cancel
can be BQL-free at least when block migration is not active.

> 
> Generically, what guarantee the thread-safety of a qmp command when you decide
> BQL is not needed? In other words, how do you prove commands are safe without
> BQL? I think almost every command accesses global state, but lock-free data
> structures are rare AFAICT.

I would suggest we split the problem into at least three parts.  IMHO
we need to answer below questions one by one to know what we should do
next:

1. whether we can handle monitor commands outside iothread, or say, in
   an isolated thread?

   This is basically what patch 2 does, the "per-monitor threads".

   IMHO this is the very first question to ask.  So now I know that at
   least current code cannot do it.  We need to at least do something
   to remove/replace the assertion to make this happen.  Can we?  I
   don't really know the answer yet.  If this is undoable, we can skip
   question 2/3 below and may need to rethink on how to solve the
   problem that postcopy recovery encounters.

2. whether there is any monitor commands can run without BQL?

   This is basically what patch 3/5 does, one for QMP, one for HMP.

   If we can settle question 1, then we can possibly start consider
   this question.  This step does not really allow any command to run
   without BQL, but we need to know whether it's possible in general,
   and if possible, we provide a framework to allow QMP/HMP developers
   to specify that.  If you see patch 3/5, the default behavior is
   still taking the BQL for all commands.

   IMHO doing this whole thing is generally good in the sense that
   this is actually forcing ourselves to break the BQL into smaller
   locks.  Take the migration commands for example: migrate_incoming
   do not need BQL, and when we write codes around it we know that we
   don't need to think about thread-safety.  That's not good IMHO.  I
   think it's time we should start consider thread-safety always.
   Again, for migrate_incoming to do this, actually we'll possibly at
   least need a migration management lock (the smaller lock) to make
   sure e.g. the user is not running two migrate_incoming commands in
   parallel (after per-monitor threads, it can happen).  But it's
   better than BQL, because BQL is for sure too big, so even a guest
   page access (as long as it held the BQL) can block migration
   commands.

3. which monitor commands can be run without BQL?

   This is what patch 4/6 was doing.  It tries to move
   migrate_incoming command out as the first candidate BQL-free
   command.

   Yes it's hard to say which command can be run without BQL.  So we
   need to investigate, possibly modify existing codes to make sure
   it's thread-safe, prove validity, then we can add the new ones into
   the BQL-free list.

   If after evaluating the pros and cons, we found that one command
   can be put into BQL-free but not worth the time for working on it,
   we can also keep those commands under BQL.

I assume question 3 is the one you were asking, and I'd say we may
need to solve question 1/2 first.  If we are done with 1/2, we just
need to spend time on each command to prove whether it is doable to
let that command run without BQL, and whether it worths itself to move
the command out of BQL.  Then we decide.  Thanks,

-- 
Peter Xu



reply via email to

[Prev in Thread] Current Thread [Next in Thread]