From: Emilio G. Cota
Subject: Re: [Qemu-devel] ideas for improving TLB performance (help with TCG backend wanted)
Date: Mon, 1 Oct 2018 21:54:49 -0400
User-agent: Mutt/1.9.4 (2018-02-28)

On Mon, Oct 01, 2018 at 15:40:37 -0500, Richard Henderson wrote:
> On 10/1/18 1:34 PM, Emilio G. Cota wrote:
> > On Thu, Sep 20, 2018 at 01:19:51 +0100, Alex Bennée wrote:
> >> If we are going to have an indirection then we can also drop the
> >> requirement to scale the TLB according to the number of MMU indexes we
> >> have to support. It's fairly wasteful when a bunch of them are almost
> >> never used unless you are running stuff that uses them.
> >
> > So with dynamic TLB sizing, what you're suggesting here is to resize
> > each MMU array independently (depending on their use rate) instead
> > of using a single "TLB size" for all MMU indexes. Am I understanding
> > your point correctly?
>
> You cannot do that without flushing the TBs (and with out-of-line memory ops,
> the prologue as well) and regenerating. The TLB size is baked into the code.
> And we really don't have any extra registers free to vary that.

Can you please elaborate on this? I can't see where this is
baked into the generated code, other than the TLB lookup.
Grepping for CPU_TLB_SIZE and CPU_TLB_BITS only shows a few
places.

Today I wrote a prototype of dynamic TLB sizing. It
uses no extra registers because mmu_idx is known at generation time.
I haven't done any extensive testing yet, but at least it boots
aarch64 and x86_64 guests on an x86_64 host.

The code (some messy WIP commits in there, sorry) is at:
https://github.com/cota/qemu/tree/tlb2
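
For illustration only, the "no extra registers" point can be sketched as
below. This is a standalone toy, not code from that branch: every name in
it (Env, TLBEntry, index_static, index_dynamic, tlb_resize) and every
constant is hypothetical. The assumption is that, since mmu_idx is a
constant when a memory access is translated, the location of that index's
mask is presumably a fixed offset from env, so only the mask value has to
be loaded at run time.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_BITS       12   /* assumed: 4 KiB guest pages */
#define NB_MMU_MODES    3    /* assumed number of MMU indexes */
#define STATIC_TLB_BITS 8    /* assumed fixed size: 256 entries */

typedef struct {
    uint64_t tag;            /* guest page address; UINT64_MAX means empty */
} TLBEntry;

typedef struct {
    uint64_t  mask[NB_MMU_MODES];   /* n_entries - 1, one per MMU index */
    TLBEntry *table[NB_MMU_MODES];  /* one table per MMU index */
} Env;

/* Static sizing: the size is a compile-time constant, so the mask is
 * effectively an immediate in whatever code performs the lookup. */
static inline size_t index_static(uint64_t addr)
{
    return (addr >> PAGE_BITS) & ((1u << STATIC_TLB_BITS) - 1);
}

/* Dynamic sizing: the mask is read from env at a slot selected by
 * mmu_idx. With mmu_idx fixed per access, that slot is a fixed offset. */
static inline size_t index_dynamic(const Env *env, int mmu_idx, uint64_t addr)
{
    return (addr >> PAGE_BITS) & env->mask[mmu_idx];
}

/* Resize one MMU index's table. Old entries are discarded, i.e. the
 * resize doubles as a flush of that index. n_entries must be a power
 * of two so that (n_entries - 1) is a valid mask. */
static void tlb_resize(Env *env, int mmu_idx, size_t n_entries)
{
    assert(n_entries != 0 && (n_entries & (n_entries - 1)) == 0);
    free(env->table[mmu_idx]);
    env->table[mmu_idx] = malloc(n_entries * sizeof(TLBEntry));
    assert(env->table[mmu_idx] != NULL);
    for (size_t i = 0; i < n_entries; i++) {
        env->table[mmu_idx][i].tag = UINT64_MAX;
    }
    env->mask[mmu_idx] = n_entries - 1;
}
```

In this toy, a rarely used MMU index can be kept at a handful of entries
while a hot one grows, and the lookup differs from the static case only in
where the mask comes from.
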
Please take a look -- am I doing anything horribly wrong there?

Thanks,
Emilio