qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v6 00/12] cpu-exec: Safe work in quiescent state


From: Alex Bennée
Subject: Re: [Qemu-devel] [PATCH v6 00/12] cpu-exec: Safe work in quiescent state
Date: Mon, 05 Sep 2016 16:51:43 +0100
User-agent: mu4e 0.9.17; emacs 25.1.11

Paolo Bonzini <address@hidden> writes:

> This is not the code I promised at the beginning of the week.
> It's better, and not only in that this one works. :)  Instead of
> reinventing the wheel and using the new wheel for linux-user's
> start_exclusive/end_exclusive, the linux-user/ code is moved to
> cpus-common.c and reused as the synchronization mechanism behind
> async_safe_run_on_cpu.
>
> The code works and actually satisfies our needs very well.  The only
> disadvantage is that safe work items will run with a mutex taken; the
> mutex fits decently in QEMU's hierarchy however, sitting between the
> BQL and the tb_lock.

The series is looking good. I only had a few comments on comments.

So for testing I applied:

1 file changed, 16 insertions(+), 5 deletions(-)
translate-all.c | 21 ++++++++++++++++-----

modified   translate-all.c
@@ -20,6 +20,9 @@
 #include <windows.h>
 #endif
 #include "qemu/osdep.h"
+#include <sys/types.h>
+#include <unistd.h>
+#include <sys/syscall.h>


 #include "qemu-common.h"
@@ -57,7 +60,7 @@
 #include "exec/log.h"

 //#define DEBUG_TB_INVALIDATE
-//#define DEBUG_FLUSH
+#define DEBUG_FLUSH
 /* make various TB consistency checks */
 //#define DEBUG_TB_CHECK

@@ -456,7 +459,8 @@ static inline PageDesc *page_find(tb_page_addr_t index)
    indicated, this is constrained by the range of direct branches on the
    host cpu, as used by the TCG implementation of goto_tb.  */
 #if defined(__x86_64__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2ul * 1024 * 1024 * 1024)
+/* # define MAX_CODE_GEN_BUFFER_SIZE  (2ul * 1024 * 1024 * 1024) */
+# define MAX_CODE_GEN_BUFFER_SIZE  (256 * 1024)
 #elif defined(__sparc__)
 # define MAX_CODE_GEN_BUFFER_SIZE  (2ul * 1024 * 1024 * 1024)
 #elif defined(__powerpc64__)
@@ -835,18 +839,25 @@ static void page_flush_tb(void)
 static void do_tb_flush(CPUState *cpu, void *data)
 {
     unsigned tb_flush_req = (unsigned) (uintptr_t) data;
-
+    unsigned tb_flush_count;
     tb_lock();

     /* If it's already been done on request of another CPU,
      * just retry.
      */
-    if (atomic_read(&tcg_ctx.tb_ctx.tb_flush_count) != tb_flush_req) {
+    tb_flush_count = atomic_read(&tcg_ctx.tb_ctx.tb_flush_count);
+    if (tb_flush_count != tb_flush_req) {
+#if defined(DEBUG_FLUSH)
+        fprintf(stderr,"%s: (%d/%ld) skipping racing flush %x/%x\n", __func__,
+                getpid(), syscall(SYS_gettid), tb_flush_req, tb_flush_count);
+#endif
         goto done;
     }

 #if defined(DEBUG_FLUSH)
-    printf("qemu: flush code_size=%ld nb_tbs=%d avg_tb_size=%ld\n",
+    fprintf(stderr, "%s: (%d/%ld, %x/%x) code_size=%ld nb_tbs=%d 
avg_tb_size=%ld\n", __func__,
+            getpid(), syscall(SYS_gettid),
+            tb_flush_req, tb_flush_count,
            (unsigned long)(tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer),
            tcg_ctx.tb_ctx.nb_tbs, tcg_ctx.tb_ctx.nb_tbs > 0 ?
            ((unsigned long)(tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer)) /

And then ran the test in my pigz testing image:

    ./tests/docker/docker.py update pigz-test:armhf ./arm-linux-user/qemu-arm
    retry.py -n 20 -c -- docker run -i -t --rm pigz-test:armhf pigz data.tar

An observed the results of flushing including racing flushes which would
ordinarily take out the system.

It would be nice to resurrect some of the tests/tcg stuff and see if we
can stress linux-user with creating and destroying threads. I guess the
work here also replaces Emilo's RCU cpu list.

>
> For performance, the last patch changes it to avoid condition variables
> in the fast path.  (There are still two memory barriers; if desired they
> could be merged with the ones in rcu_read_lock/rcu_read_unlock).  I am
> including a formal model of the algorithm; together with new documentation
> in include/qom/cpu.h, it accounts for most of the added lines of code.
> Still, it is completely optional.
>
> Paolo
>
> Alex Bennée (1):
>   cpus: pass CPUState to run_on_cpu helpers
>
> Paolo Bonzini (5):
>   cpus-common: move CPU list management to common code
>   cpus-common: move exclusive work infrastructure from linux-user
>   cpus-common: always defer async_run_on_cpu work items
>   cpus-common: Introduce async_safe_run_on_cpu()
>   cpus-common: lock-free fast path for cpu_exec_start/end
>
> Sergey Fedorov (6):
>   cpus: Move common code out of {async_, }run_on_cpu()
>   cpus: Rename flush_queued_work()
>   linux-user: Use QemuMutex and QemuCond
>   linux-user: Add qemu_cpu_is_self() and qemu_cpu_kick()
>   cpus-common: move CPU work item management to common code
>   tcg: Make tb_flush() thread safe
>
>  Makefile.target            |   2 +-
>  bsd-user/main.c            |  30 +----
>  cpu-exec.c                 |  12 +-
>  cpus-common.c              | 314 
> +++++++++++++++++++++++++++++++++++++++++++++
>  cpus.c                     |  99 +-------------
>  docs/tcg-exclusive.promela | 224 ++++++++++++++++++++++++++++++++
>  exec.c                     |  30 +----
>  hw/i386/kvm/apic.c         |   3 +-
>  hw/i386/kvmvapic.c         |   6 +-
>  hw/ppc/ppce500_spin.c      |  31 ++---
>  hw/ppc/spapr.c             |   6 +-
>  hw/ppc/spapr_hcall.c       |  17 +--
>  include/exec/cpu-all.h     |   4 +
>  include/exec/cpu-common.h  |   2 +
>  include/exec/exec-all.h    |  11 --
>  include/exec/tb-context.h  |   2 +-
>  include/qom/cpu.h          |  95 +++++++++++++-
>  kvm-all.c                  |  21 +--
>  linux-user/main.c          | 130 ++++++-------------
>  target-i386/helper.c       |  19 ++-
>  target-i386/kvm.c          |   6 +-
>  target-s390x/cpu.c         |   4 +-
>  target-s390x/cpu.h         |   7 +-
>  target-s390x/kvm.c         |  98 +++++++-------
>  target-s390x/misc_helper.c |   4 +-
>  translate-all.c            |  38 ++++--
>  vl.c                       |   1 +
>  27 files changed, 814 insertions(+), 402 deletions(-)
>  create mode 100644 cpus-common.c
>  create mode 100644 docs/tcg-exclusive.promela


--
Alex Bennée



reply via email to

[Prev in Thread] Current Thread [Next in Thread]