Re: [PATCH] x86: Implement Linear Address Masking support
From: Richard Henderson
Subject: Re: [PATCH] x86: Implement Linear Address Masking support
Date: Thu, 7 Apr 2022 07:28:54 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0
On 4/7/22 06:18, Kirill A. Shutemov wrote:
>> The new hook is incorrect, in that it doesn't apply to addresses along
>> the tlb fast path.
>
> I'm not sure what you mean by that. tlb_hit() mechanics work. We strip
> the tag bits before tlb lookup.
>
> Could you elaborate?
The fast path does not clear the bits, so you enter the slow path before you get to
clearing the bits. You've lost most of the advantage of the tlb already.
> To be honest I don't fully understand how TBI emulation works.
In get_phys_addr_lpae:

    addrsize = 64 - 8 * param.tbi;
    ...
    target_ulong top_bits = sextract64(address, inputsize,
                                       addrsize - inputsize);
    if (-top_bits != param.select) {
        /* The gap between the two regions is a Translation fault */
        fault_type = ARMFault_Translation;
        goto do_fault;
    }
which does not include TBI bits in the validation of the sign-extended address.
> Consider store_helper(). I failed to find where tag bits get stripped
> before getting there for !CONFIG_USER_ONLY. clean_data_tbi() only covers
> the user-only case.
>
> And if we get there with tags, I don't see how we will ever get to the
> fast path: tlb_hit() should never return true there if any bit in the
> top byte is set, as the cached tlb_addr has them stripped.
>
> tlb_fill() will get it handled correctly, but it is wasteful to go
> through a pagewalk on every tagged pointer dereference.
We won't do a pagewalk for every tagged pointer dereference; only dereferences
with differing tags, past the limit of the victim cache (CPU_VTLB_SIZE), will
need one. And one tag will get to use the fast path, e.g. on the store
following a load.
I've just now had a browse through the Intel docs, and I see that you're not
performing the required modified canonicality check. While a properly tagged
address will have the tag removed in CR2 during a page fault, an improperly
tagged address (with bit 63 != bit {47,56}) should have the original address
reported in CR2.
I could imagine a hook that could aid the victim cache in ignoring the tag, so
that we need to go through tlb_fill fewer times. But I wouldn't want to include
that in the base version of this feature, and I'd want to take more than a
moment in the design so that it could be used by ARM and RISC-V as well.
r~