From: Eric Blake
Subject: Re: [Qemu-block] [PATCH v4 06/10] block/nbd-client: move from quit to state
Date: Wed, 16 Jan 2019 10:25:03 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

[adding Dan]

On 7/31/18 12:30 PM, Vladimir Sementsov-Ogievskiy wrote:
> To implement reconnect we need several states for the client:
> CONNECTED, QUIT and two CONNECTING states. CONNECTING states will
> be realized in the following patches. This patch implements CONNECTED
> and QUIT.
> 
> QUIT means that we should close the connection and fail all current
> and future requests (like the old quit = true).
> 
> CONNECTED means that the connection is OK and we can send requests
> (like the old quit = false).
> 
> For the receiving loop we compare the current state with QUIT,
> because reconnect will happen in the same loop, so it should keep
> looping until the end.
> 
> Conversely, for requests we compare the current state with
> CONNECTED, as we don't want to send requests in the CONNECTING
> states (which are unreachable now, but will be reachable after the
> following commits).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
> ---
>  block/nbd-client.h |  9 ++++++++-
>  block/nbd-client.c | 55 ++++++++++++++++++++++++++++++++----------------------
>  2 files changed, 41 insertions(+), 23 deletions(-)

Dan just recently proposed patches to SocketChardev in general to use a
state machine that distinguishes between connecting and connected:

https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg03339.html

I'm wondering how much of his work is related or can be reused to get
restartable connections on NBD sockets?

Remember, right now, the NBD code always starts in blocking mode, and
does single-threaded handshaking until it is ready for transmission,
then switches to non-blocking mode for all subsequent transmissions (so,
for example, servicing a read request can assume that the socket is
valid without further waiting).  But once we start allowing reconnects,
a read request will need to detect when one socket has gone down, and
wait for its replacement socket to come back up, in order to retry the
request; this retry happens in a non-blocking context, yet it must
establish a new socket, and possibly convert that socket into TLS mode,
all before being ready to retry the read request.
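
To make that concrete, here is a minimal, self-contained sketch (NOT
code from this series; every helper name below is invented for
illustration) of how a request path could consult the proposed state
enum to choose between sending, waiting for a reconnect, or failing
outright.  The real client would use coroutines and a CoQueue rather
than this simplified loop:

/*
 * Hypothetical sketch only.  nbd_try_reconnect() and
 * nbd_send_one_request() are stubs standing in for "re-establish the
 * socket (redoing the TLS handshake if needed)" and "send on the
 * current channel".
 */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

typedef enum NBDClientState {
    NBD_CLIENT_CONNECTING_WAIT,
    NBD_CLIENT_CONNECTING_NOWAIT,
    NBD_CLIENT_CONNECTED,
    NBD_CLIENT_QUIT
} NBDClientState;

typedef struct NBDClientSession {
    NBDClientState state;
} NBDClientSession;

static bool nbd_try_reconnect(NBDClientSession *s)
{
    (void)s;
    return true;    /* stub: pretend the new socket came up */
}

static int nbd_send_one_request(NBDClientSession *s, const void *req)
{
    (void)s;
    (void)req;
    return 0;       /* stub: pretend the request succeeded */
}

static int nbd_do_request(NBDClientSession *s, const void *req)
{
    for (;;) {
        switch (s->state) {
        case NBD_CLIENT_CONNECTED:
            return nbd_send_one_request(s, req);
        case NBD_CLIENT_QUIT:
            return -EIO;   /* fatal: fail this and all later requests */
        case NBD_CLIENT_CONNECTING_WAIT:
            /* Block this request until a replacement socket is up. */
            s->state = nbd_try_reconnect(s) ? NBD_CLIENT_CONNECTED
                                            : NBD_CLIENT_QUIT;
            break;
        case NBD_CLIENT_CONNECTING_NOWAIT:
            return -EIO;   /* caller asked not to wait; fail it now */
        }
    }
}

int main(void)
{
    NBDClientSession s = { .state = NBD_CLIENT_CONNECTING_WAIT };
    printf("request returned %d\n", nbd_do_request(&s, "read"));
    return 0;
}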

> 
> diff --git a/block/nbd-client.h b/block/nbd-client.h
> index 2f047ba614..5367425774 100644
> --- a/block/nbd-client.h
> +++ b/block/nbd-client.h
> @@ -23,6 +23,13 @@ typedef struct {
>      bool receiving;         /* waiting for read_reply_co? */
>  } NBDClientRequest;
>  
> +typedef enum NBDClientState {
> +    NBD_CLIENT_CONNECTING_WAIT,
> +    NBD_CLIENT_CONNECTING_NOWAIT,

Would we be better off adding these enum values in the later patch that
uses them?

> +    NBD_CLIENT_CONNECTED,
> +    NBD_CLIENT_QUIT
> +} NBDClientState;
> +
>  typedef struct NBDClientSession {
>      QIOChannelSocket *sioc; /* The master data channel */
>      QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
> @@ -32,10 +39,10 @@ typedef struct NBDClientSession {
>      CoQueue free_sema;
>      Coroutine *read_reply_co;
>      int in_flight;
> +    NBDClientState state;
>  
>      NBDClientRequest requests[MAX_NBD_REQUESTS];
>      NBDReply reply;
> -    bool quit;
>  } NBDClientSession;
>  
>  NBDClientSession *nbd_get_client_session(BlockDriverState *bs);
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 7eaf0149f0..a91fd3ea3e 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -34,6 +34,12 @@
>  #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ (uint64_t)(intptr_t)(bs))
>  #define INDEX_TO_HANDLE(bs, index)  ((index)  ^ (uint64_t)(intptr_t)(bs))
>  
> +/* @ret would be used for reconnect in future */

s/would/will/

> +static void nbd_channel_error(NBDClientSession *s, int ret)
> +{
> +    s->state = NBD_CLIENT_QUIT;
> +}
> +
>  static void nbd_recv_coroutines_wake_all(NBDClientSession *s)
>  {
>      int i;
> @@ -73,14 +79,15 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
>      int ret = 0;
>      Error *local_err = NULL;
>  
> -    while (!s->quit) {
> +    while (s->state != NBD_CLIENT_QUIT) {
>          assert(s->reply.handle == 0);
>          ret = nbd_receive_reply(s->ioc, &s->reply, &local_err);
>          if (local_err) {
>              error_report_err(local_err);
>          }
>          if (ret <= 0) {
> -            break;
> +            nbd_channel_error(s, ret ? ret : -EIO);
> +            continue;

I guess the continue instead of the break is pre-supposing that
nbd_channel_error() might be able to recover in later patches?  But for
this patch, there is no change in control flow: the loop condition now
fails on the next check, so no further iterations occur, the same as a
break would have done.
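
In other words, once nbd_channel_error() has set s->state to
NBD_CLIENT_QUIT, the 'continue' merely re-tests a condition that is
now false.  A tiny standalone demonstration (names invented, not the
patch's code):

#include <stdio.h>

typedef enum { CONNECTED, QUIT } State;

int main(void)
{
    State state = CONNECTED;
    int iterations = 0;

    while (state != QUIT) {
        iterations++;
        state = QUIT;   /* what nbd_channel_error() does to s->state */
        continue;       /* condition re-tested and fails: loop exits,
                         * exactly as 'break' would have done */
    }
    printf("iterations: %d\n", iterations);   /* prints: iterations: 1 */
    return 0;
}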

The rest of the patch looks sane, but fails to apply easily for me
(there's enough rebase churn that it's getting harder to tell whether
it is accurate against the latest git master).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
