[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 1/1] qemu-nbd: fix regression with qemu-nbd --fork run over s
From: |
Eric Blake |
Subject: |
Re: [PATCH 1/1] qemu-nbd: fix regression with qemu-nbd --fork run over ssh |
Date: |
Fri, 14 Jul 2023 10:35:06 -0500 |
User-agent: |
NeoMutt/20230517 |
On Thu, Jul 06, 2023 at 09:15:45PM +0200, Denis V. Lunev wrote:
> Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
> Author: Hanna Reitz <hreitz@redhat.com>
> Date: Wed May 8 23:18:18 2019 +0200
> qemu-nbd: Do not close stderr
> has introduced an interesting regression. Original behavior of
> ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
> was the following:
> * qemu-nbd was started as a daemon
> * the command execution is done and ssh exited with success
>
> The patch has changed this behavior and 'ssh' command now hangs forever.
>
> According to the normal specification of the daemon() call, we should
> endup with STDERR pointing to /dev/null. That should be done at the
> very end of the successful startup sequence when the pipe to the
> bootstrap process (used for diagnostics) is no longer needed.
>
> This could be achived in the same way as done for 'qemu-nbd -c' case.
That was commit 0eaf453e, also fixing up e6df58a5.
> STDOUT copying to STDERR does the trick.
It took me a while to see how this worked (the double-negative of
passing 0 to the 'noclose' parameter of daemon() doesn't make it easy
to track what is supposed to happen), but I agree that our goal is
that after daemon()izing, we temporarily set stderr to the inherited
pipe for passing back any startup error messages, then usually want to
restore it to /dev/null for the remainder of the process.
>
> This also leads to proper 'ssh' connection closing which fixes my
> original problem.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Eric Blake <eblake@redhat.com>
> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> CC: Hanna Reitz <hreitz@redhat.com>
> ---
> qemu-nbd.c | 9 +--------
> 1 file changed, 1 insertion(+), 8 deletions(-)
I indeed see how this fixes a regression under 'fork_process'.
However, the code that calls qemu_daemon() is also reachable under the
condition 'device && !verbose'. Does it make sense for either:
qemu-nbd -v -c /dev/nbd0
qemu-nbd -f -v -c /dev/nbd0
as a way to connect to the kernel device, but explicitly ask to remain
verbose, as a way to debug issues in connecting to the kernel via
stderr?
Going back to the emails at the time of Hanna's original commit...
https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg01872.html
I don't see any consideration about that case; capturing the original
stderr was done to fix what looked like an easy bug fix to a botched
implementation of old_stderr in commit ffb31e1d without considering
the ramifications.
But seeing as how pre-existing code DID want at least 'qemu-nbd -v -c
/dev/nbd0' to log indefinitely, I think we need to squash in:
diff --git i/qemu-nbd.c w/qemu-nbd.c
index 4276163564b..4d037798b9b 100644
--- i/qemu-nbd.c
+++ w/qemu-nbd.c
@@ -313,7 +313,7 @@ static void *nbd_client_thread(void *arg)
/* update partition table */
pthread_create(&show_parts_thread, NULL, show_parts, device);
- if (verbose) {
+ if (verbose && !fork_process) {
fprintf(stderr, "NBD device %s is now connected to %s\n",
device, srcpath);
} else {
With that tweak, I'm fine with adding:
Reviewed-by: Eric Blake <eblake@redhat.com>
Do you agree with my tweak? If so, I can queue this through my NBD
tree without needing to see a v2 post. However...
>
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 4276163564..e9e118dfdb 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -575,7 +575,6 @@ int main(int argc, char **argv)
> bool writethrough = false; /* Client will flush as needed. */
> bool fork_process = false;
> bool list = false;
> - int old_stderr = -1;
> unsigned socket_activation;
> const char *pid_file_name = NULL;
> const char *selinux_label = NULL;
> @@ -930,11 +929,6 @@ int main(int argc, char **argv)
> } else if (pid == 0) {
> close(stderr_fd[0]);
>
> - /* Remember parent's stderr if we will be restoring it. */
> - if (fork_process) {
> - old_stderr = dup(STDERR_FILENO);
> - }
> -
> ret = qemu_daemon(1, 0);
>
> /* Temporarily redirect stderr to the parent's pipe... */
> @@ -1152,8 +1146,7 @@ int main(int argc, char **argv)
Pre-existing, but your patch made me notice nearby (minor corner-case)
bugs:
ret = qemu_daemon(1, 0);
/* Temporarily redirect stderr to the parent's pipe... */
dup2(stderr_fd[1], STDERR_FILENO);
We aren't checking for dup2() failure. But even if it succeeds (which
we expect), it could have modified the contents of errno compared to
what they were after qemu_daemon()...
if (ret < 0) {
error_report("Failed to daemonize: %s", strerror(errno));
...so this could be reporting garbage. I wouldn't mind additional
patches to make the code more robust against unlikely failure
scenarios.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org