qemu-devel

Re: comparison of coroutine backends


From: Stefan Hajnoczi
Subject: Re: comparison of coroutine backends
Date: Mon, 21 Mar 2022 10:40:04 +0000

On Fri, Mar 18, 2022 at 09:48:37AM +0100, Paolo Bonzini wrote:
> Hi all,
> 
> based on the previous discussions here is a comparison of the various
> possibilities for implementing coroutine backends in QEMU and the
> respective advantages and disadvantages.
> 
> I'm adding a third possibility for stackless coroutines, which is to
> use the LLVM/clang builtins.  I believe that would still require a
> source-to-source translator, but it would offload to the compiler the
> complicated bits such as liveness analysis.
> 
> 1) Stackful coroutines:
> Advantages:
> - no changes to current code
> 
> Disadvantages:
> - portability issues regarding shadow stacks (SafeStack, CET)
> - portability/nonconformance issues regarding TLS
> 
> Another possible advantage is that it allows using the same function for
> both coroutine and non-coroutine context.  I'm listing this separately
> because I'm not sure that's desirable, as it prevents compile-time
> checking of calls to coroutine_fn.  Compile-time checking would be
> possible using clang -Wthread-safety if we forgo the ability to use the
> same function in both scenarios.
> 
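For illustration, one way such compile-time checking could be expressed
with clang's thread-safety attributes is sketched below. This is an
assumption about how the idea might look, not code from QEMU or from any
posted patch: the names TSA, CoContext, co_ctx, co_enter, co_leave,
co_double and run_in_coroutine are all hypothetical, and coroutine_fn is
redefined here only for the sketch. "Being in coroutine context" is
modelled as a capability that every coroutine_fn requires. On compilers
other than clang the annotations expand to nothing.

```c
#include <assert.h>

/* The attributes are clang-specific; other compilers see empty macros. */
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

/* Hypothetical capability meaning "running inside a coroutine". */
typedef struct TSA(capability("coroutine")) CoContext {
    int depth;                  /* runtime counterpart, used by the asserts */
} CoContext;

static CoContext co_ctx;

/* Sketch-only definition: a coroutine_fn may be called only while the
 * capability is held, i.e. from another coroutine_fn or between
 * co_enter() and co_leave(). */
#define coroutine_fn TSA(requires_capability(co_ctx))

/* The entry and exit points are the only places that acquire or release
 * the capability, so their bodies are excluded from the analysis. */
static void co_enter(void)
    TSA(acquire_capability(co_ctx)) TSA(no_thread_safety_analysis);
static void co_leave(void)
    TSA(release_capability(co_ctx)) TSA(no_thread_safety_analysis);

static void co_enter(void)
{
    co_ctx.depth++;
}

static void co_leave(void)
{
    assert(co_ctx.depth > 0);
    co_ctx.depth--;
}

/* A function that must only run in coroutine context. */
static int coroutine_fn co_double(int x)
{
    assert(co_ctx.depth > 0);   /* runtime check mirroring the annotation */
    return 2 * x;
}

/* Non-coroutine code has to enter coroutine context before the call. */
static int run_in_coroutine(int x)
{
    int r;

    co_enter();
    r = co_double(x);
    co_leave();
    return r;
}
```

With clang and -Wthread-safety, calling co_double() from a plain function
without co_enter() should be diagnosed at compile time. The cost is the
one described above: a function annotated this way can no longer be
shared between coroutine and non-coroutine callers.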
> 
> 2) "Duff's device" stackless coroutines
> Advantages:

- Supports gcc and clang

> - no portability issues regarding both shadow stacks and TLS
> - compiles to good old C code
> - compile-time checking of "coroutine-only" but not awaitable functions
> - debuggability: stack frames should be easy to inspect

The user needs to understand how the coroutine runtime works in order to
get a backtrace of a suspended coroutine. More likely a GDB Python
script will be needed for this.
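
To illustrate the point, the sketch below assumes a hypothetical runtime
in which every suspended stackless frame lives on the heap and records
its function name, its resume state and a pointer to the frame that
awaits it. None of these names (Frame, format_backtrace) or fields come
from QEMU or the PoC. A "backtrace" of a suspended coroutine is then a
walk over that chain rather than over the C stack, which is what a
debugger helper script would have to do as well.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical heap-allocated frame of a suspended stackless coroutine. */
typedef struct Frame {
    const char *func;       /* name of the coroutine function */
    int state;              /* resume point inside that function */
    struct Frame *caller;   /* frame awaiting this one, NULL at the top */
} Frame;

/*
 * Format the chain of frames starting at the innermost one, one line
 * per frame.  Returns the number of frames written in full; the walk
 * stops early if the buffer is too small.
 */
static int format_backtrace(const Frame *innermost, char *buf, size_t len)
{
    size_t used = 0;
    int depth = 0;

    assert(len > 0);
    buf[0] = '\0';
    for (const Frame *f = innermost; f != NULL; f = f->caller) {
        int n = snprintf(buf + used, len - used, "#%d %s (state %d)\n",
                         depth, f->func, f->state);
        if (n < 0 || (size_t)n >= len - used) {
            break;          /* output error or truncated line */
        }
        used += (size_t)n;
        depth++;
    }
    return depth;
}
```

The regular C backtrace at that moment only shows whatever the thread is
currently executing, typically the event loop, so the chain has to be
reached through the runtime's own data structures.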

> Disadvantages:
> - complex source-to-source translator
> - more complex build process
> 
> 
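
To make the "Duff's device" approach concrete, here is a minimal
hand-written sketch of the kind of C that a source-to-source translator
might emit for a function that yields inside a loop. It is an
illustration only, not the output of the PoC: CounterFrame,
counter_resume and counter_sum are hypothetical names. Locals that are
live across a yield move from the C stack into a frame struct, and a
switch on the saved state jumps back into the middle of the loop on
resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Frame holding everything that must survive a yield. */
typedef struct {
    int state;   /* resume point: 0 = not started, 1 = inside the loop */
    int i;       /* loop counter, live across the yield */
    int limit;   /* argument of the original function */
    int value;   /* value produced by the most recent yield */
} CounterFrame;

/*
 * Resume the coroutine.  Returns true if it yielded a value in
 * f->value, false once it has run to completion.
 */
static bool counter_resume(CounterFrame *f)
{
    assert(f != NULL);

    switch (f->state) {
    case 0:                         /* first entry: start at the top */
        for (f->i = 0; f->i < f->limit; f->i++) {
            f->value = f->i * f->i;
            f->state = 1;
            return true;            /* yield */
    case 1:;                        /* resume lands here, inside the loop */
        }
    }

    f->state = -1;                  /* finished; later resumes return false */
    return false;
}

/* Drive the coroutine to completion and add up the yielded values. */
static int counter_sum(int limit)
{
    CounterFrame f = { .state = 0, .limit = limit };
    int sum = 0;

    while (counter_resume(&f)) {
        sum += f.value;
    }
    return sum;
}
```

The frame is an ordinary struct, so it needs no separate stack and does
not interact with shadow stacks or TLS. The translator's job is to
decide which locals belong in it, which is the liveness analysis
mentioned earlier.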
> 3) C++20 stackless coroutines
> Advantages:
> - no portability issues regarding both shadow stacks and TLS
> - no code to write outside QEMU
> - simpler build process
> 
> Disadvantages:
> - requires a new compiler
> - it's C++

- raises questions about C++ usage in QEMU, which seem to be
  controversial

> - no compile-time checking of "coroutine-only" but not awaitable functions
> 
> 
> 4) LLVM stackless coroutines
> Advantages:
> - no portability issues regarding both shadow stacks and TLS
> - no code to write outside QEMU
> 
> Disadvantages:
> - requires a source-to-source translator, albeit a relatively simple one
> - more complex build process
> - requires a new compiler and doesn't support GCC
> 
> 
> Note that (2) would still have a build dependency on libclang.
> However the code generation could still be done with GCC and with
> any compiler version.
> 
> I'll also put it in a table, though I understand that some choices
> here might be debatable:
> 
>                           stackful   Duff's device   C++20   LLVM
> ==================================================================
> Code to write/maintain    ++ [1]     ---             +++     - [2]
> Changes to existing code  ++ [3]     -               --      -
> Community acceptance      ++         ++              --      ?
> Code or PoC exists        ++         +               -       --
> ==================================================================
> Portability               --         ++              +       -
> Debuggability             -          ++              ?       ?
> Performance               -          ++ [4]          ++      ++
> 
> [1] I'm penalizing stackful coroutines here because the worse portability
> has an impact on future maintainability too.
> 
> [2] This is an educated guess.
> 
> [3] If we decide to remove the possibility of using the same function for
> both coroutine and non-coroutine context, the changes to existing code
> would be the same as for Duff's device and LLVM coroutines.
> 
> [4] Slightly worse than C++20 coroutines for the PoC, but that is mostly due
> to implementation choices that are easy to change.
> 
> 
> Stackful coroutines are obviously pretty good, or we wouldn't have used them.
> They might be a local optimum though, as shown by the negative points in terms
> of portability, debuggability and performance.
> 
> Both Duff's device and LLVM would be more or less transparent to the part of
> the community that doesn't care about the coroutines.  The translator would
> probably be write-and-forget (though I'm not sure about the API stability of
> libclang, which would be a major factor), but it would still be a substantial
> amount of work to commit to.

I don't see a clear winner but here is my order of preference:
1. Stackful - the devil we know
2. Duff's device - a temporary (wasteful) step before native compiler support?
3. LLVM - actually not bad but requires dropping gcc support
4. C++20 - I worry adding C++ into the codebase will cause friction

Ideally gcc and clang would support C coroutines natively, making the
choice simple. Is it worth treating this as a long term project and
working with LLVM/clang and gcc to add native C coroutine support to
compilers? We still have stackful coroutines in the short term.

Stefan
