While 32MiB is certainly usable, a full system boot ends up flushing the
codegen buffer nearly 100 times. Increase the default on 64-bit hosts
to take advantage of all that spare memory. After this change I can
boot my test system without any TB flushes.
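For anyone wanting to reproduce the measurement, one way to check the flush count is the HMP monitor's `info jit` command, which reports translation-buffer statistics including the TB flush count (the machine type and kernel below are just placeholders, not taken from this patch):

```shell
# Boot a guest with an HMP monitor on stdio, then inspect JIT statistics.
# (machine type and kernel path are illustrative placeholders)
qemu-system-arm -M cubieboard -kernel ./zImage -monitor stdio
# At the (qemu) prompt:
#   (qemu) info jit
# A "TB flush count" of 0 after boot indicates the codegen buffer
# was never exhausted during the run.
```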
That's great. With this change I'm seeing a performance improvement when running the avocado tests for cubieboard:
it runs about 4-5 seconds faster. My host is 64-bit Ubuntu 18.04.
I don't know much about the internals of TCG or how it actually uses the cache,
but it seems logical that increasing the cache size would improve performance.
What I'm wondering is: will this also result in TCG translating larger chunks in one go,
potentially taking more time per translation? If so, could it affect more latency-sensitive code?
Signed-off-by: Alex Bennée <address@hidden>
accel/tcg/translate-all.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 4ce5d1b3931..f7baa512059 100644
@@ -929,7 +929,11 @@ static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
 # define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
+#if TCG_TARGET_REG_BITS == 32
 #define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
+#else
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (2 * GiB)
+#endif
The QEMU process now takes up more virtual memory, about ~2.5GiB in my test, which is to be expected with this change.
How likely is it that the TCG cache will be filled quickly and completely? I'm asking because I also use QEMU for automated testing,
where the nodes are 64-bit but each has only 2GiB of physical RAM.
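If the larger default is a concern on memory-constrained hosts, it should still be possible to cap the buffer per invocation via the `tb-size` accelerator property, which takes a size in MiB (the machine type and value below are illustrative, not part of this patch):

```shell
# Cap the TCG translation buffer at 256 MiB instead of relying on the default.
# (machine type and size are illustrative placeholders)
qemu-system-arm -M cubieboard -accel tcg,tb-size=256 ...
```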
 #define DEFAULT_CODE_GEN_BUFFER_SIZE \
 (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \