From: Richard Henderson
Subject: Re: [PATCH] x86: Implement Linear Address Masking support
Date: Fri, 8 Apr 2022 07:39:31 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0
On 4/7/22 08:27, Kirill A. Shutemov wrote:
>> The fast path does not clear the bits, so you enter the slow path
>> before you get to clearing the bits.  You've lost most of the
>> advantage of the tlb already.
>
> Sorry for my ignorance, but what do you mean by fast path here? My
> understanding is that it is the case when tlb_hit() is true and you
> don't need to get into tlb_fill(). Are we talking about the same
> scheme?
We are not.  Paolo already mentioned the JIT.  One example is tcg_out_tlb_load in tcg/i386/tcg-target.c.inc.  Obviously, there's an implementation of that for each host architecture in the other tcg/arch/ subdirectories.
>> I've just now had a browse through the Intel docs, and I see that
>> you're not performing the required modified canonicality check.
>
> Modified is effectively done by clearing (and sign-extending) the
> address before the check.

>> While a proper tagged address will have the tag removed in CR2
>> during a page fault, an improper tagged address (with bit 63 !=
>> {47,56}) should have the original address reported to CR2.
>
> Hm. I don't see it in the spec. It rather points in the other
> direction:
>
>   Page faults report the faulting linear address in CR2. Because LAM
>   masking (by sign-extension) applies before paging, the faulting
>   linear address recorded in CR2 does not contain the masked
>   metadata.
# Regardless of the paging mode, the processor performs a modified
# canonicality check that enforces that bit 47 of the pointer matches
# bit 63.  As illustrated in Figure 14-1, bits 62:48 are not checked
# and are thus available for software metadata.  After this modified
# canonicality check is performed, bits 62:48 are masked by
# sign-extending the value of bit 47.

Note especially that the sign-extension happens after the canonicality check.
> But what other options do you see? Clearing the bits before TLB
> lookup matches the architectural spec and makes INVLPG match the
> described behaviour without special handling.
We have special handling for INVLPG: tlb_flush_page_bits_by_mmuidx. That's how we handle TBI for ARM. You'd supply 48 or 57 here.
r~