Re: Disabling TLS address caching to help QEMU on GNU/Linux
From: Iain Sandoe
Subject: Re: Disabling TLS address caching to help QEMU on GNU/Linux
Date: Tue, 20 Jul 2021 16:31:36 +0100
Hi Florian,
This also affects fibre implementations (both the C++ and D ones, at least
going by discussions with both communities).
> On 20 Jul 2021, at 15:52, Florian Weimer via Gcc <gcc@gcc.gnu.org> wrote:
>
> Currently, the GNU/Linux ABI does not really specify whether the thread
> pointer (the address of the TCB) may change at a function boundary.
>
> Traditionally, GCC assumes that the ABI allows caching addresses of
> thread-local variables across function calls. Such caching varies in
> aggressiveness between targets, probably due to differences in the
> choice of -mtls-dialect=gnu and -mtls-dialect=gnu2 as the default for
> the targets. (Caching with -mtls-dialect=gnu2 appears to be more
> aggressive.)
>
> In addition to that, glibc defines errno as this:
>
> extern int *__errno_location (void) __attribute__ ((__const__));
> #define errno (*__errno_location ())
>
> And the const attribute has the side effect of caching the address of
> errno within the same stack frame.
>
> With stackful coroutines, such address caching is only valid if
> coroutines are only ever resumed on the same thread on which they were
> suspended. (The C++ coroutine implementation is not stackful and is not
> affected by this at the ABI level.)
There are C++20 coroutine library writers who want to switch threads in
symmetric transfers. [ I am not entirely convinced about this at present, and
it would certainly be suspect with TLS address caching enabled, since a TLS
pointer could equally be cached in the coroutine frame. ]
The C++20 coroutine ABI is silent on such matters (it only describes the
visible part of the coroutine frame and the builtins used by the std library).
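As a minimal sketch of why a cached TLS address goes wrong after a thread
switch (illustrative only; tls_var, record_tls_address and
tls_addresses_differ are hypothetical names, not taken from QEMU, GCC or
glibc): each thread has its own copy of a thread-local variable, so an address
taken on one thread names a different object from the same expression
evaluated on another thread.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical thread-local variable; every thread gets its own copy. */
static _Thread_local int tls_var;

/* Store the address of the calling thread's copy of tls_var in *out. */
static void *record_tls_address(void *out)
{
    *(uintptr_t *)out = (uintptr_t)&tls_var;
    return NULL;
}

/* Returns 1 if a second thread sees a different address for tls_var than
   the calling thread does, 0 if the addresses match, -1 on a pthread error. */
int tls_addresses_differ(void)
{
    uintptr_t here = 0, there = 0;
    pthread_t t;

    record_tls_address(&here);            /* address on the calling thread */
    if (pthread_create(&t, NULL, record_tls_address, &there) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return here != there;                 /* both copies were live at once */
}
```

A frame (stackful or C++20) that saved the first address and was then resumed
on the second thread would keep reading and writing the first thread's copy.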
> Historically, I think we took the
> position that cross-thread resumption is undefined. But the ABIs aren't
> crystal-clear on this matter.
>
> One important piece of software for GNU is QEMU (not just for GNU/Linux,
> Hurd development also benefits from virtualization). QEMU uses stackful
> coroutines extensively. There are some hard-to-change code areas where
> resumption happens across threads unfortunately. These increasingly
> cause problems with more inlining, inter-procedural analysis, and a
> general push towards LTO (which is also needed for some security
> hardening features).
>
> Should the GNU toolchain offer something to help out the QEMU
> developers? Maybe GCC could offer an option to disable the caching for
> all TLS models. glibc could detect that mode based on a new
> preprocessor macro and adjust its __errno_location declaration and
> similar function declarations. There will be a performance impact of
> this, of course, but it would make the QEMU usage well-defined (at the
> lowest levels).
>
> If this is a programming model that should be supported, then restoring
> some of the optimizations would be possible, by annotating
> context-switching functions and TLS-address-dependent functions. But I
> think QEMU would immediately benefit from just the simple approach that
> disables address caching of TLS variables.
IMO the general cases you note above are enough reason to want some
mechanism to control this.
thanks
Iain
>
> Thanks,
> Florian
>