qemu-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-commits] [qemu/qemu] bbc35f: nbd: silence maybe-uninitialized warn


From: Peter Maydell
Subject: [Qemu-commits] [qemu/qemu] bbc35f: nbd: silence maybe-uninitialized warnings
Date: Sun, 11 Oct 2020 06:45:27 -0700

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: bbc35fc20e6efcb9f177668c04ee05f25a0a2e65
      
https://github.com/qemu/qemu/commit/bbc35fc20e6efcb9f177668c04ee05f25a0a2e65
  Author: Christian Borntraeger <borntraeger@de.ibm.com>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M nbd/server.c

  Log Message:
  -----------
  nbd: silence maybe-uninitialized warnings

gcc 10 from Fedora 32 gives me:

Compiling C object libblock.fa.p/nbd_server.c.o
../nbd/server.c: In function ‘nbd_co_client_start’:
../nbd/server.c:625:14: error: ‘namelen’ may be used uninitialized in this 
function [-Werror=maybe-uninitialized]
  625 |         rc = nbd_negotiate_send_info(client, NBD_INFO_NAME, namelen, 
name,
      |              
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  626 |                                      errp);
      |                                      ~~~~~
../nbd/server.c:564:14: note: ‘namelen’ was declared here
  564 |     uint32_t namelen;
      |              ^~~~~~~
cc1: all warnings being treated as errors

As I cannot see how this can happen, let uns silence the warning.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20200930155859.303148-3-borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 8c517de24a8a1dcbeb54e7e12b5b0fda42a90ace
      
https://github.com/qemu/qemu/commit/8c517de24a8a1dcbeb54e7e12b5b0fda42a90ace
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M block/nbd.c

  Log Message:
  -----------
  block/nbd: fix drain dead-lock because of nbd reconnect-delay

We pause reconnect process during drained section. So, if we have some
requests, waiting for reconnect we should cancel them, otherwise they
deadlock the drained section.

How to reproduce:

1. Create an image:
   qemu-img create -f qcow2 xx 100M

2. Start NBD server:
   qemu-nbd xx

3. Start vm with second nbd disk on node2, like this:

  ./build/x86_64-softmmu/qemu-system-x86_64 -nodefaults -drive \
     file=/work/images/cent7.qcow2 -drive \
     
driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=60
 \
     -vnc :0 -m 2G -enable-kvm -vga std

4. Access the vm through vnc (or some other way?), and check that NBD
   drive works:

   dd if=/dev/sdb of=/dev/null bs=1M count=10

   - the command should succeed.

5. Now, kill the nbd server, and run dd in the guest again:

   dd if=/dev/sdb of=/dev/null bs=1M count=10

Now Qemu is trying to reconnect, and dd-generated requests are waiting
for the connection (they will wait up to 60 seconds (see
reconnect-delay option above) and than fail). But suddenly, vm may
totally hang in the deadlock. You may need to increase reconnect-delay
period to catch the dead-lock.

VM doesn't respond because drain dead-lock happens in cpu thread with
global mutex taken. That's not good thing by itself and is not fixed
by this commit (true way is using iothreads). Still this commit fixes
drain dead-lock itself.

Note: probably, we can instead continue to reconnect during drained
section. To achieve this, we may move negotiation to the connect thread
to make it independent of bs aio context. But expanding drained section
doesn't seem good anyway. So, let's now fix the bug the simplest way.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 8a509afd724671ffc066235e368ba7d81c9a6dd7
      
https://github.com/qemu/qemu/commit/8a509afd724671ffc066235e368ba7d81c9a6dd7
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M block/nbd.c

  Log Message:
  -----------
  block/nbd: correctly use qio_channel_detach_aio_context when needed

Don't use nbd_client_detach_aio_context() driver handler where we want
to finalize the connection. We should directly use
qio_channel_detach_aio_context() in such cases. Driver handler may (and
will) contain another things, unrelated to the qio channel.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 46f56631b5d18ae744782dbfe6fd17c3ebe15f7a
      
https://github.com/qemu/qemu/commit/46f56631b5d18ae744782dbfe6fd17c3ebe15f7a
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M block/nbd.c

  Log Message:
  -----------
  block/nbd: fix reconnect-delay

reconnect-delay has a design flaw: we handle it in the same loop where
we do connection attempt. So, reconnect-delay may be exceeded by
unpredictable time of connection attempt.

Let's instead use separate timer.

How to reproduce the bug:

1. Create an image on node1:
   qemu-img create -f qcow2 xx 100M

2. Start NBD server on node1:
   qemu-nbd xx

3. On node2 start qemu-io:

./build/qemu-io --image-opts \
driver=nbd,server.type=inet,server.host=192.168.100.5,server.port=10809,reconnect-delay=15

4. Type 'read 0 512' in qemu-io interface to check that connection
   works

Be careful: you should make steps 5-7 in a short time, less than 15
seconds.

5. Kill nbd server on node1

6. Run 'read 0 512' in qemu-io interface again, to be sure that nbd
client goes to reconnect loop.

7. On node1 run the following command

   sudo iptables -A INPUT -p tcp --dport 10809 -j DROP

This will make the connect() call of qemu-io at node2 take a long time.

And you'll see that read command in qemu-io will hang for a long time,
more than 15 seconds specified by reconnect-delay parameter. It's the
bug.

8. Don't forget to drop iptables rule on node1:

   sudo iptables -D INPUT -p tcp --dport 10809 -j DROP

Important note: Step [5] is necessary to reproduce _this_ bug. If we
miss step [5], the read command (step 6) will hang for a long time and
this commit doesn't help, because there will be not long connect() to
unreachable host, but long sendmsg() to unreachable host, which should
be fixed by enabling and adjusting keep-alive on the socket, which is a
thing for further patch set.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 99d72dba1c96c3a498d935a54081e226b262641a
      
https://github.com/qemu/qemu/commit/99d72dba1c96c3a498d935a54081e226b262641a
  Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M block/nbd.c

  Log Message:
  -----------
  block/nbd: nbd_co_reconnect_loop(): don't connect if drained

In a recent commit 12c75e20a269ac we've improved
nbd_co_reconnect_loop() to not make drain wait for additional sleep.
Similarly, we shouldn't try to connect, if previous sleep was
interrupted by drain begin, otherwise drain_begin will have to wait for
the whole connection attempt.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200903190301.367620-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: 029a88c9a7e3210ba565c081471bd44ba8d5e397
      
https://github.com/qemu/qemu/commit/029a88c9a7e3210ba565c081471bd44ba8d5e397
  Author: Eric Blake <eblake@redhat.com>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M qemu-nbd.c

  Log Message:
  -----------
  qemu-nbd: Honor SIGINT and SIGHUP

Honoring just SIGTERM on Linux is too weak; we also want to handle
other common signals, and do so even on BSD.  Why?  Because at least
'qemu-nbd -B bitmap' needs a chance to clean up the in-use bit on
bitmaps when the server is shut down via a signal.

See also: http://bugzilla.redhat.com/1883608

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: apply comment tweak suggested by Vladimir; fix ifdef around
termsig_handler]
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: d1e2c3e7bd22a99660b0c254fc05c020d0239ca0
      
https://github.com/qemu/qemu/commit/d1e2c3e7bd22a99660b0c254fc05c020d0239ca0
  Author: Eric Blake <eblake@redhat.com>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M nbd/server.c

  Log Message:
  -----------
  nbd/server: Reject embedded NUL in NBD strings

The NBD spec is clear that any string sent from the client must not
contain embedded NUL characters.  If the client passes "a\0", we
should reject that option request rather than act on "a".

Testing this is not possible with a compliant client, but I was able
to use gdb to coerce libnbd into temporarily behaving as such a
client.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: ebd57062a1957307a175a810441af87259d7dbe9
      
https://github.com/qemu/qemu/commit/ebd57062a1957307a175a810441af87259d7dbe9
  Author: Eric Blake <eblake@redhat.com>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M nbd/server.c

  Log Message:
  -----------
  nbd: Simplify meta-context parsing

We had a premature optimization of trying to read as little from the
wire as possible while handling NBD_OPT_SET_META_CONTEXT in phases.
But in reality, we HAVE to read the entire string from the client
before we can get to the next command, and it is easier to just read
it all at once than it is to read it in pieces.  And once we do that,
several functions end up no longer performing I/O, so they can drop
length and errp parameters, and just return a bool instead of
modifying through a pointer.

Our iotests still pass; I also checked that libnbd's testsuite (which
covers more corner cases of odd meta context requests) still passes.
There are cases where the sequence of trace messages produced differs
(for example, when no bitmap is exported, a query for "qemu:" now
produces two trace lines instead of one), but trace points are for
debug and have no effect on what the client sees.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200930121105.667049-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: enhance commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>


  Commit: b433f2cb0115b11f74205a0cf75565976d4b2517
      
https://github.com/qemu/qemu/commit/b433f2cb0115b11f74205a0cf75565976d4b2517
  Author: Peter Maydell <peter.maydell@linaro.org>
  Date:   2020-10-09 (Fri, 09 Oct 2020)

  Changed paths:
    M block/nbd.c
    M nbd/server.c
    M qemu-nbd.c

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2020-10-08-v3' into 
staging

nbd patches for 2020-10-08

- silence compilation warnings
- more fixes to prevent reconnect hangs
- improve 'qemu-nbd' termination behavior
- cleaner NBD protocol compliance on string handling

# gpg: Signature made Fri 09 Oct 2020 21:05:30 BST
# gpg:                using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) 
<ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2020-10-08-v3:
  nbd: Simplify meta-context parsing
  nbd/server: Reject embedded NUL in NBD strings
  qemu-nbd: Honor SIGINT and SIGHUP
  block/nbd: nbd_co_reconnect_loop(): don't connect if drained
  block/nbd: fix reconnect-delay
  block/nbd: correctly use qio_channel_detach_aio_context when needed
  block/nbd: fix drain dead-lock because of nbd reconnect-delay
  nbd: silence maybe-uninitialized warnings

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


Compare: https://github.com/qemu/qemu/compare/4a7c0bd9dcb0...b433f2cb0115



reply via email to

[Prev in Thread] Current Thread [Next in Thread]