qemu-devel

Re: Disabling TLS address caching to help QEMU on GNU/Linux


From: Richard Biener
Subject: Re: Disabling TLS address caching to help QEMU on GNU/Linux
Date: Thu, 22 Jul 2021 14:12:07 +0200

On Tue, Jul 20, 2021 at 4:54 PM Florian Weimer via Gcc <gcc@gcc.gnu.org> wrote:
>
> Currently, the GNU/Linux ABI does not really specify whether the thread
> pointer (the address of the TCB) may change at a function boundary.
>
> Traditionally, GCC assumes that the ABI allows caching addresses of
> thread-local variables across function calls.  Such caching varies in
> aggressiveness between targets, probably due to differences in the
> choice of -mtls-dialect=gnu and -mtls-dialect=gnu2 as the default for
> the targets.  (Caching with -mtls-dialect=gnu2 appears to be more
> aggressive.)
>
> In addition to that, glibc defines errno as this:
>
> extern int *__errno_location (void) __attribute__ ((__const__));
> #define errno (*__errno_location ())
>
> And the const attribute has the side effect of caching the address of
> errno within the same stack frame.
>
> With stackful coroutines, such address caching is only valid if
> coroutines are only ever resumed on the same thread on which they were
> suspended.  (The C++ coroutine implementation is not stackful and is not
> affected by this at the ABI level.)  Historically, I think we took the
> position that cross-thread resumption is undefined.  But the ABIs aren't
> crystal-clear on this matter.
>
> One important piece of software for GNU is QEMU (not just for GNU/Linux,
> Hurd development also benefits from virtualization).  QEMU uses stackful
> coroutines extensively.  There are some hard-to-change code areas where
> resumption happens across threads unfortunately.  These increasingly
> cause problems with more inlining, inter-procedural analysis, and a
> general push towards LTO (which is also needed for some security
> hardening features).
>
> Should the GNU toolchain offer something to help out the QEMU
> developers?  Maybe GCC could offer an option to disable the caching for
> all TLS models.  glibc could detect that mode based on a new
> preprocessor macro and adjust its __errno_location declaration and
> similar function declarations.  There will be a performance impact of
> this, of course, but it would make the QEMU usage well-defined (at the
> lowest levels).
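
For reference, a minimal sketch of why a cached TLS address becomes wrong
once execution continues on another thread: each thread has its own copy
of a __thread variable, so the same expression yields a different address
there.  The struct and function names below are made up for illustration,
and the sketch assumes POSIX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static __thread int x[2];   /* one copy of x per thread */

/* Hypothetical helper type: carries the caller's view of &x[0] to the worker. */
struct tls_probe {
    int *caller_addr;   /* address of x[0] as seen by the calling thread */
    int differs;        /* set to 1 if the worker sees a different address */
};

/* Runs on the worker thread and compares its own &x[0] with the caller's. */
static void *compare_tls_address(void *arg)
{
    struct tls_probe *probe = arg;
    probe->differs = (probe->caller_addr != &x[0]);
    return NULL;
}

/* Returns 1 if a second thread sees x at a different address. */
int tls_address_differs_across_threads(void)
{
    struct tls_probe probe = { &x[0], 0 };
    pthread_t worker;

    int rc = pthread_create(&worker, NULL, compare_tls_address, &probe);
    assert(rc == 0);
    rc = pthread_join(worker, NULL);
    assert(rc == 0);

    return probe.differs;
}
```

A stackful coroutine that computed &x[0] on the first thread and is then
resumed on the second would keep using the first thread's copy if the
compiler reuses the earlier address.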

But how does TLS usage transfer between threads?  At the GIMPLE
level the TLS pointer is not visible, and thus we'd happily CSE its address:

__thread int x[2];

void bar (int *);

int *foo(int i)
{
  int *p = &x[i];
  bar (p);
  return &x[i];
}

results in

int * foo (int i)
{
  int * p;
  sizetype _5;
  sizetype _6;

  <bb 2> [local count: 1073741824]:
  _5 = (sizetype) i_1(D);
  _6 = _5 * 4;
  p_2 = &x + _6;
  bar (p_2);
  return p_2;
}

To make this work as expected, one would need to expose the TLS pointer
access.
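
Short of a compiler change, one source-level way to approximate that is to
hide the address computation behind an opaque accessor.  The following is
only a sketch: get_ptr_x() and last_seen are invented names, bar() is a
stand-in definition so the example is complete, and it relies on the GCC
noinline attribute plus an empty asm statement to keep the call from being
inlined or treated as free of side effects.  I have not checked how robust
this is under LTO.

```c
#include <assert.h>
#include <stddef.h>

static __thread int x[2];

/* Hypothetical accessor: the TLS address is recomputed on every call.
   noinline keeps the computation out of the caller, and the empty asm
   statement keeps the compiler from treating the call as const/pure and
   merging two calls into one. */
__attribute__((noinline)) int *get_ptr_x(void)
{
    __asm__ volatile("");
    return x;
}

/* Stand-in for the external bar() in the example above; in the real
   scenario this is where a coroutine could switch threads. */
static int *last_seen;

void bar(int *p)
{
    last_seen = p;
}

int *foo(int i)
{
    assert(i >= 0 && i < 2);
    int *p = &get_ptr_x()[i];
    bar(p);
    /* Ask again instead of reusing p: the address is looked up after bar(). */
    return &get_ptr_x()[i];
}
```

With this shape the second address is derived from a fresh call result, so
the CSE shown in the dump above has nothing to merge as long as the
accessor stays opaque.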

> If this is a programming model that should be supported, then restoring
> some of the optimizations would be possible, by annotating
> context-switching functions and TLS-address-dependent functions.  But I
> think QEMU would immediately benefit from just the simple approach that
> disables address caching of TLS variables.
>
> Thanks,
> Florian
>
