Re: comparison of coroutine backends
From: Stefan Hajnoczi
Subject: Re: comparison of coroutine backends
Date: Mon, 21 Mar 2022 10:40:04 +0000

On Fri, Mar 18, 2022 at 09:48:37AM +0100, Paolo Bonzini wrote:
> Hi all,
>
> based on the previous discussions here is a comparison of the various
> possibilities for implementing coroutine backends in QEMU and the
> respective advantages and disadvantages.
>
> I'm adding a third possibility for stackless coroutines, which is to
> use the LLVM/clang builtins. I believe that would still require a
> source-to-source translator, but it would offload to the compiler the
> complicated bits such as liveness analysis.
>
> 1) Stackful coroutines:
> Advantages:
> - no changes to current code
>
> Disadvantages:
> - portability issues regarding shadow stacks (SafeStack, CET)
> - portability/nonconformance issues regarding TLS
>
> Another possible advantage is that it allows using the same function for
> both coroutine and non-coroutine context. I'm listing this separately
> because I'm not sure that's desirable, as it prevents compile-time
> checking of calls to coroutine_fn. Compile-time checking would be
> possible using clang -fthread-safety if we forgo the ability to use the
> same function in both scenarios.
>
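To make the shadow stack and TLS concerns concrete, here is a minimal
sketch of what a stackful coroutine does under the hood, assuming a POSIX
system that provides <ucontext.h>. The names co_entry, co_yield and
run_stackful_demo are made up for illustration; this is not QEMU's
coroutine code.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t caller_ctx;     /* whoever resumed the coroutine */
static ucontext_t co_ctx;         /* the coroutine itself */
static char co_stack[64 * 1024];  /* the coroutine's private stack */
static int co_value;              /* value handed over on each yield */
static int co_done;

/* Save the coroutine's registers and stack pointer, switch to the caller. */
static void co_yield(void)
{
    swapcontext(&co_ctx, &caller_ctx);
}

/* An ordinary C function: 'i' lives on co_stack across the yields. */
static void co_entry(void)
{
    for (int i = 1; i <= 3; i++) {
        co_value = i * 10;
        co_yield();
    }
    co_done = 1;
    /* Returning here switches to uc_link, i.e. caller_ctx. */
}

/* Hypothetical driver: resume until completion, summing yielded values. */
static int run_stackful_demo(void)
{
    int sum = 0;
    int rc = getcontext(&co_ctx);

    assert(rc == 0);
    (void)rc;
    co_done = 0;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof(co_stack);
    co_ctx.uc_link = &caller_ctx;
    makecontext(&co_ctx, co_entry, 0);

    for (;;) {
        swapcontext(&caller_ctx, &co_ctx);   /* resume the coroutine */
        if (co_done) {
            break;
        }
        sum += co_value;
    }
    return sum;
}
```

The point of the sketch is that co_entry() needs no changes to become a
coroutine, but every swapcontext() replaces the stack pointer behind the
compiler's back. As I understand it, that is where the concerns about
schemes that maintain a second stack come from.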
>
> 2) "Duff's device" stackless coroutines
> Advantages:
- Supports gcc and clang
> - no portability issues regarding both shadow stacks and TLS
> - compiles to good old C code
> - compile-time checking of "coroutine-only" but not awaitable functions
> - debuggability: stack frames should be easy to inspect
The user needs to understand how the coroutine runtime works in order to
get a backtrace of a suspended coroutine. More likely a GDB Python
script will be needed for this.
> Disadvantages:
> - complex source-to-source translator
> - more complex build process
>
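For anyone not familiar with the technique, here is a minimal hand-written
sketch of the kind of code a "Duff's device" translator would emit. It is
only an illustration: the names (CounterFrame, counter_resume, counter_sum)
are hypothetical and this is not the output of the PoC translator.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Everything that is live across a yield moves from the C stack into
 * an explicit frame. Working out which variables need this is the
 * liveness analysis mentioned above.
 */
typedef struct CounterFrame {
    int state;   /* resume point: 0 = start, 1 = after yield, -1 = done */
    int i;       /* former local variable, live across the yield */
    int limit;
    int value;   /* value handed to the caller on each yield */
} CounterFrame;

/*
 * Returns true if the coroutine yielded (result in f->value),
 * false once it has run to completion.
 */
static bool counter_resume(CounterFrame *f)
{
    switch (f->state) {
    case 0:
        for (f->i = 0; f->i < f->limit; f->i++) {
            f->value = f->i * f->i;
            f->state = 1;
            return true;     /* yield: a plain return, no stack switch */
    case 1:;                 /* resume jumps back into the loop body */
        }
        f->state = -1;
        /* fall through */
    default:
        return false;
    }
}

/* Drive the coroutine to completion and sum what it yields. */
static int counter_sum(int limit)
{
    CounterFrame f = { .state = 0, .limit = limit };
    int sum = 0;

    while (counter_resume(&f)) {
        sum += f.value;
    }
    assert(f.state == -1);
    return sum;
}
```

A suspended coroutine in this style is just a CounterFrame sitting in
memory, with no native stack frames behind it. Inspecting one means
decoding the state field and following whatever links the runtime keeps
between frames.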
>
> 3) C++20 stackless coroutines
> Advantages:
> - no portability issues regarding both shadow stacks and TLS
> - no code to write outside QEMU
> - simpler build process
>
> Disadvantages:
> - requires a new compiler
> - it's C++
- raises questions about C++ usage in QEMU, which seem to be
controversial
> - no compile-time checking of "coroutine-only" but not awaitable functions
>
>
> 4) LLVM stackless coroutines
> Advantages:
> - no portability issues regarding both shadow stacks and TLS
> - no code to write outside QEMU
>
> Disadvantages:
> - relatively simple source-to-source translator
> - more complex build process
> - requires a new compiler and doesn't support GCC
>
>
> Note that (2) would still have a build dependency on libclang.
> However the code generation could still be done with GCC and with
> any compiler version.
>
> I'll also put it in a table, though I understand that some choices
> here might be debatable:
>
>                            stackful   Duff's device   C++20   LLVM
> ===================================================================
> Code to write/maintain     ++ [1]     ---             +++     - [2]
> Changes to existing code   ++ [3]     -               --      -
> Community acceptance       ++         ++              --      ?
> Code or PoC exists         ++         +               -       --
> ===================================================================
> Portability                --         ++              +       -
> Debuggability              -          ++              ?       ?
> Performance                -          ++ [4]          ++      ++
>
> [1] I'm penalizing stackful coroutines here because the worse portability
> has an impact on future maintainability too.
>
> [2] This is an educated guess.
>
> [3] If we decide to remove the possibility of using the same function for
> both coroutine and non-coroutine context, the changes to existing code
> would be the same as for Duff's device and LLVM coroutines.
>
> [4] Slightly worse than C++20 coroutines for the PoC, but that is mostly due
> to implementation choices that are easy to change.
>
>
> Stackful coroutines are obviously pretty good, or we wouldn't have used them.
> They might be a local optimum though, as shown by the negative points in terms
> of portability, debuggability and performance.
>
> Both Duff's device and LLVM would be more or less transparent to the part of
> the community that doesn't care about the coroutines. The translator would
> probably be write-and-forget (though I'm not sure about the API stability of
> libclang, which would be a major factor), but it would still be a substantial
> amount of work to commit to.

I don't see a clear winner, but here is my order of preference:

1. Stackful - the devil we know
2. Duff's device - a temporary (wasteful) step before native compiler support?
3. LLVM - actually not bad but requires dropping gcc support
4. C++20 - I worry adding C++ into the codebase will cause friction

Ideally gcc and clang would support C coroutines natively, making the
choice simple. Is it worth treating this as a long-term project and
working with LLVM/clang and gcc to add native C coroutine support to
compilers? We still have stackful coroutines in the short term.

Stefan