From: Shin-ichiro KAWASAKI
Subject: Re: [Qemu-devel] sh : performance problem
Date: Thu, 05 Mar 2009 00:22:21 +0900
User-agent: Thunderbird 2.0.0.19 (Windows/20081209)

Paul Brook wrote:
>> Great :) But we're still far from arm :(
>>
>> By the way, does someone know why there is some kind of "tlb management
>> code" in exec.c? Does the SH4 architecture have special features that
>> can't be handled in generic code? Or are we just rewriting some code
>> that is already there...?
>
> I think you're missing the most important difference: SH uses a software
> managed TLB, whereas ARM uses a hardware managed TLB.
>
> The main consequence of this is that we don't have to model the actual
> ARM TLB at all; it is never directly visible. We effectively implement an
> infinitely large TLB.
>
> For SH the TLB is programmed directly, so we end up having to maintain
> two TLBs: the qemu TLB and the architectural SH TLB. For correct
> operation, pages must be removed from the qemu TLB when they are
> evicted/replaced in the SH TLB. The SH TLB is quite small, and flushing
> qemu TLB entries is quite expensive, so this results in fairly poor
> performance.
>
> MIPS has a similar problem. However, in that case the most common TLB
> operations do not directly expose the TLB state. In particular, when
> setting a new TLB entry it is unspecified which TLB entry is replaced. At
> that point the OS can't know which entry was evicted, so we can lie, and
> not evict pages until the guest does something that allows it to
> determine the exact TLB state. In practice this is sufficient to make
> mips-linux work reasonably well.
>
> I'm not sure if the same is possible for SH. It probably depends on
> whether URC is visible to/used by the guest.

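The lazy-eviction idea described above for MIPS can be sketched in C as follows. This is only an illustration under stated assumptions, not QEMU code: the `Mmu` struct, `guest_tlb_write`, `guest_tlb_observe`, `mmu_resync`, and both size constants are hypothetical names invented for this example, and the guest TLB is shrunk to 4 entries so the effect is easy to see. The model keeps an evicted page in the emulator-side cache when the guest overwrites a TLB slot, and pays for a flush only when the guest does something that would reveal the exact TLB state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GUEST_TLB_SIZE 4   /* assumption: tiny on purpose for the example */
#define SHADOW_MAX     64  /* assumption: cap on the emulator-side cache  */

/* Hypothetical model: one architectural (guest) TLB plus one larger
 * emulator-side "shadow" cache of pages that are currently mapped. */
typedef struct {
    uint32_t guest_vpn[GUEST_TLB_SIZE];
    bool     guest_valid[GUEST_TLB_SIZE];
    uint32_t shadow_vpn[SHADOW_MAX];
    int      shadow_count;
    int      flushes;      /* number of (expensive) resynchronisations */
} Mmu;

void mmu_init(Mmu *m) { memset(m, 0, sizeof *m); }

/* True if some valid guest TLB slot currently maps this page. */
static bool guest_maps(const Mmu *m, uint32_t vpn) {
    for (int i = 0; i < GUEST_TLB_SIZE; i++)
        if (m->guest_valid[i] && m->guest_vpn[i] == vpn) return true;
    return false;
}

bool shadow_contains(const Mmu *m, uint32_t vpn) {
    for (int i = 0; i < m->shadow_count; i++)
        if (m->shadow_vpn[i] == vpn) return true;
    return false;
}

/* Drop every shadow entry that the guest TLB no longer backs. */
void mmu_resync(Mmu *m) {
    int kept = 0;
    for (int i = 0; i < m->shadow_count; i++)
        if (guest_maps(m, m->shadow_vpn[i]))
            m->shadow_vpn[kept++] = m->shadow_vpn[i];
    m->shadow_count = kept;
    m->flushes++;
}

/* Guest installs a mapping; the page previously in `slot` is NOT removed
 * from the shadow cache. That is the "lie", and it is only safe while the
 * guest cannot tell which entry was replaced. */
void guest_tlb_write(Mmu *m, int slot, uint32_t vpn) {
    assert(slot >= 0 && slot < GUEST_TLB_SIZE);
    m->guest_vpn[slot] = vpn;
    m->guest_valid[slot] = true;
    if (shadow_contains(m, vpn)) return;
    if (m->shadow_count == SHADOW_MAX) mmu_resync(m); /* make room */
    m->shadow_vpn[m->shadow_count++] = vpn;
}

/* Guest does something that exposes the exact TLB state, so the shadow
 * cache has to be brought back in line with the architectural TLB. */
void guest_tlb_observe(Mmu *m) { mmu_resync(m); }
```

In this sketch, six writes into the four guest slots leave six pages cached and cost no flush at all; only the call to `guest_tlb_observe` shrinks the cache back to the four pages the guest TLB really holds. An eager model would instead flush on every write that replaces a valid entry, which corresponds to the expensive case described for SH.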
Thank you, Paul, for your clear explanation. It confirms my guess and also answers my question.

As I posted in another mail, I tried increasing the number of TLB entries from 64 to 256 and got a performance improvement. This approach deviates from the real hardware specification, including URC, but the same SH-Linux kernel works fine. This implies that SH-Linux does not refer to URC, and that the MIPS approach might be a solution for SH too.

Regards,
Shin-ichiro KAWASAKI