[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] monitor/qmp: fix race with clients disconnecting early
From: |
Markus Armbruster |
Subject: |
Re: [PATCH] monitor/qmp: fix race with clients disconnecting early |
Date: |
Tue, 12 Oct 2021 11:27:26 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) |
Stefan, any thoughts on this?
Markus Armbruster <armbru@redhat.com> writes:
> I've thought a bit more.
>
> A monitor can serve a series of clients.
>
> Back when all of the monitor ran in the main thread, we completely
> finished serving the current client before we started serving the next
> one (I think). In other words, sessions did not overlap.
>
> Since we moved parts of the monitor to the monitor I/O thread (merge
> commit 4bdc24fa018), sessions can overlap, and this causes issues, as
> you demonstrated.
>
> Possible fixes:
>
> 1. Go back to "don't overlap" with suitable synchronization.
>
> I'm afraid this won't cut it, because exec-oob would have wait in
> line behind reconnect.
>
> It currently waits for other reasons, but that feels fixable. Going
> back to "don't overlap" would make it unfixable.
>
> 2. Make the lingering session not affect / be affected by the new
> session's state
>
> This is what your patch tries. Every access of session state needs
> to be guarded like
>
> if (conn_nr_before == qatomic_read(&mon->connection_nr)) {
> access session state
> } else {
> ???
> }
>
> Issues:
>
> * We have to find and guard all existing accesses. That's work.
>
> * We have to guard all future accesses. More work, and easy to
> forget, which makes this fix rather brittle.
>
> * The fix is spread over many places.
>
> * We may run into cases where the ??? part gets complicated.
> Consider file descriptors. The command in flight will have its
> monitor_get_fd() fail after disconnect. Probably okay, because it
> can also fail for other reasons. But we might run into cases where
> the ??? part creates new failure modes for us to handle.
>
> 3. Per-session state
>
> Move per-session state from monitor state into a separate object.
>
> Use reference counts to keep this object alive until both threads are
> done with the session.
>
> Monitor I/O thread executes monitor core and the out-of-band
> commands; its stops using the old session on disconnect, and starts
> using the new session on connect.
>
> Main thread executes in-band commands, and these use the session that
> submitted them.
>
> Do I make sense, or should I explain my idea in more detail?
- Re: [PATCH] monitor/qmp: fix race with clients disconnecting early,
Markus Armbruster <=