qemu-devel

Re: [PATCH] coroutine-sigaltstack: Add SIGUSR2 mutex


From: Max Reitz
Subject: Re: [PATCH] coroutine-sigaltstack: Add SIGUSR2 mutex
Date: Tue, 26 Jan 2021 14:16:01 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.5.0

On 26.01.21 13:44, Vladimir Sementsov-Ogievskiy wrote:
25.01.2021 15:03, Max Reitz wrote:
Disposition (action) for any given signal is global for the process.
When two threads run coroutine-sigaltstack's qemu_coroutine_new()
concurrently, they may interfere with each other: One of them may revert
the SIGUSR2 handler to SIG_DFL, between the other thread (a) setting up
coroutine_trampoline() as the handler and (b) raising SIGUSR2.  That
SIGUSR2 will then terminate the QEMU process abnormally.

We have to ensure that only one thread at a time can modify the
process-global SIGUSR2 handler.  To do so, wrap the whole section where
that is done in a mutex.

Alternatively, we could for example have the SIGUSR2 handler always be
coroutine_trampoline(), so there would be no need to invoke sigaction()
in qemu_coroutine_new().  Laszlo has posted a patch to do so here:

   https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg05962.html

However, given that coroutine-sigaltstack is more of a fallback
implementation for platforms that do not support ucontext, that change
may be a bit too invasive to be comfortable with.  The mutex proposed
here may negatively impact performance, but the change is much simpler.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
  util/coroutine-sigaltstack.c | 9 +++++++++
  1 file changed, 9 insertions(+)

diff --git a/util/coroutine-sigaltstack.c b/util/coroutine-sigaltstack.c
index aade82afb8..e99b8a4f9c 100644
--- a/util/coroutine-sigaltstack.c
+++ b/util/coroutine-sigaltstack.c
@@ -157,6 +157,7 @@ Coroutine *qemu_coroutine_new(void)
      sigset_t sigs;
      sigset_t osigs;
      sigjmp_buf old_env;
+    static pthread_mutex_t sigusr2_mutex = PTHREAD_MUTEX_INITIALIZER;
      /* The way to manipulate stack is with the sigaltstack function. We
       * prepare a stack, with it delivering a signal to ourselves and then
@@ -186,6 +187,12 @@ Coroutine *qemu_coroutine_new(void)
      sa.sa_handler = coroutine_trampoline;
      sigfillset(&sa.sa_mask);
      sa.sa_flags = SA_ONSTACK;
+
+    /*
+     * sigaction() is a process-global operation.  We must not run
+     * this code in multiple threads at once.
+     */
+    pthread_mutex_lock(&sigusr2_mutex);
      if (sigaction(SIGUSR2, &sa, &osa) != 0) {
          abort();
      }
@@ -234,6 +241,8 @@ Coroutine *qemu_coroutine_new(void)
       * Restore the old SIGUSR2 signal handler and mask
       */
      sigaction(SIGUSR2, &osa, NULL);
+    pthread_mutex_unlock(&sigusr2_mutex);
+
      pthread_sigmask(SIG_SETMASK, &osigs, NULL);
      /*


weak:
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Side thought: so the sigaltstack coroutine implementation is not thread-safe. Is that the only bug?

It would be great if I could tell you for sure whether there’s no bug in some piece of code. :)

Or actually, should the whole implementation be revisited to check whether it can be used with iothreads at all?

Judging from the discussion I had with Laszlo, I’m definitely not the right person to do so, because for example I don’t know the ins and outs of signal handling.

I can only tell you it’s the only issue I’ve seen, and that there’s just not much more code in coroutine-sigaltstack.c than the code around qemu_coroutine_new().

Shouldn't we just state that the sigaltstack coroutine implementation doesn't support iothreads? And error out on iothread creation if sigaltstack coroutines are in use?

I’m not sure whether that would be better than potentially having a bug in it. What you’re proposing is effectively breaking all iothreads usage on macOS. If I were a macOS user, I’d rather risk encountering bugs than that.

(And it isn’t like we know it’s unstable with iothreads; I haven’t seen it breaking with this patch applied yet, and I don’t think there’s reason to believe it would. qemu_coroutine_new() together with coroutine_trampoline() sets up a coroutine environment, and the rest of the code just consists of sigsetjmp() and siglongjmp(). I believe Laszlo had some open questions about signal masking done by those functions, but I don’t think that has anything to do with multithreading.)

Max



