qemu-devel

Re: [PATCH v5 6/7] tcg: implement JIT for iOS and Apple Silicon


From: Joelle van Dyne
Subject: Re: [PATCH v5 6/7] tcg: implement JIT for iOS and Apple Silicon
Date: Fri, 20 Nov 2020 09:58:30 -0600

On Fri, Nov 20, 2020 at 3:08 AM Alexander Graf <agraf@csgraf.de> wrote:
>
>
> On 09.11.20 00:24, Joelle van Dyne wrote:
> > When entitlements are available (macOS or jailbroken iOS), a hardware
> > feature called APRR exists on newer Apple Silicon that can cheaply mark JIT
> > pages as either RX or RW. Reverse engineered functions from
> > libsystem_pthread.dylib are implemented to handle this.
> >
> > The following rules apply for JIT write protect:
> >    * JIT write-protect is enabled before tcg_qemu_tb_exec()
> >    * JIT write-protect is disabled after tcg_qemu_tb_exec() returns
> >    * JIT write-protect is disabled inside do_tb_phys_invalidate() but if it
> >      is called inside of tcg_qemu_tb_exec() then write-protect will be
> >      enabled again before returning.
> >    * JIT write-protect is disabled by cpu_loop_exit() for interrupt handling.
> >    * JIT write-protect is disabled everywhere else.
> >
> > See https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon
> >
> > Signed-off-by: Joelle van Dyne <j@getutm.app>
> > ---
> >   include/exec/exec-all.h     |  2 +
> >   include/tcg/tcg-apple-jit.h | 86 +++++++++++++++++++++++++++++++++++++
> >   include/tcg/tcg.h           |  3 ++
> >   accel/tcg/cpu-exec-common.c |  2 +
> >   accel/tcg/cpu-exec.c        |  2 +
> >   accel/tcg/translate-all.c   | 46 ++++++++++++++++++++
> >   tcg/tcg.c                   |  4 ++
> >   7 files changed, 145 insertions(+)
> >   create mode 100644 include/tcg/tcg-apple-jit.h
> >
> > diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> > index aa65103702..3829f3d470 100644
> > --- a/include/exec/exec-all.h
> > +++ b/include/exec/exec-all.h
> > @@ -549,6 +549,8 @@ TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
> >                                      target_ulong cs_base, uint32_t flags,
> >                                      uint32_t cf_mask);
> >   void tb_set_jmp_target(TranslationBlock *tb, int n, uintptr_t addr);
> > +void tb_exec_lock(void);
> > +void tb_exec_unlock(void);
> >
> >   /* GETPC is the true target of the return instruction that we'll execute. */
> >   #if defined(CONFIG_TCG_INTERPRETER)
> > diff --git a/include/tcg/tcg-apple-jit.h b/include/tcg/tcg-apple-jit.h
> > new file mode 100644
> > index 0000000000..9efdb2000d
> > --- /dev/null
> > +++ b/include/tcg/tcg-apple-jit.h
> > @@ -0,0 +1,86 @@
> > +/*
> > + * Apple Silicon functions for JIT handling
> > + *
> > + * Copyright (c) 2020 osy
> > + *
> > + * This library is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * This library is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with this library; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#ifndef TCG_APPLE_JIT_H
> > +#define TCG_APPLE_JIT_H
> > +
> > +/*
> > + * APRR handling
> > + * Credits to: https://siguza.github.io/APRR/
> > + * Reversed from /usr/lib/system/libsystem_pthread.dylib
> > + */
> > +
> > +#if defined(__aarch64__) && defined(CONFIG_DARWIN)
> > +
> > +#define _COMM_PAGE_START_ADDRESS        (0x0000000FFFFFC000ULL) /* In TTBR0 */
> > +#define _COMM_PAGE_APRR_SUPPORT         (_COMM_PAGE_START_ADDRESS + 0x10C)
> > +#define _COMM_PAGE_APPR_WRITE_ENABLE    (_COMM_PAGE_START_ADDRESS + 0x110)
> > +#define _COMM_PAGE_APRR_WRITE_DISABLE   (_COMM_PAGE_START_ADDRESS + 0x118)
> > +
> > +static __attribute__((__always_inline__)) bool jit_write_protect_supported(void)
> > +{
> > +    /* Access shared kernel page at fixed memory location. */
> > +    uint8_t aprr_support = *(volatile uint8_t *)_COMM_PAGE_APRR_SUPPORT;
> > +    return aprr_support > 0;
> > +}
> > +
> > +/* write protect enable = write disable */
> > +static __attribute__((__always_inline__)) void jit_write_protect(int enabled)
> > +{
> > +    /* Access shared kernel page at fixed memory location. */
> > +    uint8_t aprr_support = *(volatile uint8_t *)_COMM_PAGE_APRR_SUPPORT;
> > +    if (aprr_support == 0 || aprr_support > 3) {
> > +        return;
> > +    } else if (aprr_support == 1) {
> > +        __asm__ __volatile__ (
> > +            "mov x0, %0\n"
> > +            "ldr x0, [x0]\n"
> > +            "msr S3_4_c15_c2_7, x0\n"
> > +            "isb sy\n"
> > +            :: "r" (enabled ? _COMM_PAGE_APRR_WRITE_DISABLE
> > +                            : _COMM_PAGE_APPR_WRITE_ENABLE)
> > +            : "memory", "x0"
> > +        );
> > +    } else {
> > +        __asm__ __volatile__ (
> > +            "mov x0, %0\n"
> > +            "ldr x0, [x0]\n"
> > +            "msr S3_6_c15_c1_5, x0\n"
> > +            "isb sy\n"
> > +            :: "r" (enabled ? _COMM_PAGE_APRR_WRITE_DISABLE
> > +                            : _COMM_PAGE_APPR_WRITE_ENABLE)
> > +            : "memory", "x0"
> > +        );
> > +    }
> > +}
>
>
> Is there a particular reason you're not just calling
> pthread_jit_write_protect_np()? That would remove the dependency on
> anything reverse engineered.
Those APIs are not available on iOS 13 or below, which have the same
APRR requirements. If for legal reasons we cannot include this
reverse-engineered code, then it is fine to remove this file and
replace the calls with the official APIs, but we would lose support
on older iOS versions.

>
>
> > +
> > +#else /* defined(__aarch64__) && defined(CONFIG_DARWIN) */
> > +
> > +static __attribute__((__always_inline__)) bool jit_write_protect_supported(void)
> > +{
> > +    return false;
> > +}
> > +
> > +static __attribute__((__always_inline__)) void jit_write_protect(int enabled)
> > +{
> > +}
> > +
> > +#endif
> > +
> > +#endif /* define TCG_APPLE_JIT_H */
> > diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
> > index 477919aeb6..b16b687d0b 100644
> > --- a/include/tcg/tcg.h
> > +++ b/include/tcg/tcg.h
> > @@ -625,6 +625,9 @@ struct TCGContext {
> >       size_t code_gen_buffer_size;
> >       void *code_gen_ptr;
> >       void *data_gen_ptr;
> > +#if defined(CONFIG_DARWIN) && !defined(CONFIG_TCG_INTERPRETER)
> > +    bool code_gen_locked; /* on Darwin each thread tracks W^X flags */
>
>
> I don't quite understand why you need to keep track of whether you're in
> locked state or not. If you just always keep in locked state and unlock
> around the few parts that modify the code gen region, you should be
> fine, no?
I thought so at first, but do_tb_phys_invalidate() can be called in
either state, and even after looking at all the callers it's not
possible to easily derive the lock state without storing it
somewhere. If someone knows of a way, then this flag can be removed.

>
>
> > +#endif
> >
> >       /* Threshold to flush the translated code buffer.  */
> >       void *code_gen_highwater;
> > diff --git a/accel/tcg/cpu-exec-common.c b/accel/tcg/cpu-exec-common.c
> > index 12c1e3e974..f1eb767b02 100644
> > --- a/accel/tcg/cpu-exec-common.c
> > +++ b/accel/tcg/cpu-exec-common.c
> > @@ -64,6 +64,8 @@ void cpu_reloading_memory_map(void)
> >
> >   void cpu_loop_exit(CPUState *cpu)
> >   {
> > +    /* Unlock JIT write protect if applicable. */
> > +    tb_exec_unlock();
>
>
> Why do you need to unlock here? I think in general this patch is trying
> to keep the state RW always and only flip to RX when actually executing
> code, right?
Yes, this is the point where execution exits due to an interrupt or
some other asynchronous event. Otherwise, the unlock is the one
matched after tcg_qemu_tb_exec() returns.

-j

>
> I think it would be much easier and cleaner to do it reverse: Keep it in
> RX always and flip to RW when you need to modify.
>
> Also, shouldn't the code gen buffer be allocated with MAP_JIT according
> to the porting guide?
>
> Alex
>


