Re: [Qemu-devel] [PATCH v4 12/21] target-arm: A64: Implement DC ZVA
From: Peter Maydell
Subject: Re: [Qemu-devel] [PATCH v4 12/21] target-arm: A64: Implement DC ZVA
Date: Fri, 7 Mar 2014 15:11:37 +0000
On 7 March 2014 14:51, Richard Henderson <address@hidden> wrote:
> On 03/06/2014 11:32 AM, Peter Maydell wrote:
>> +/**
>> + * tlb_vaddr_to_host:
>> + * @env: CPUArchState
>> + * @addr: guest virtual address to look up
>> + * @mmu_idx: MMU index to use for lookup
>> + *
>> + * Look up the specified guest virtual address in the TCG softmmu TLB.
>> + * If the TLB contains a host virtual address suitable for direct RAM
>> + * access, then return it. Otherwise (TLB miss, TLB entry is for an
>> + * I/O access, etc) return NULL.
>> + *
>> + * This is the equivalent of the initial fast-path code used by
>> + * TCG backends for guest load and store accesses.
>> + */
>> +static inline void *tlb_vaddr_to_host(CPUArchState *env, target_ulong addr,
>> + int mmu_idx)
>> +{
>> + int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
>> + target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
>
> Somewhere I think the function name or at least the block comment should
> indicate that this lookup is for writing, since we hard-code addr_write here.
Doh, yes. I forgot that when I was shifting the code into
a more general function. Is it worth parameterising this for
read vs write lookups?
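[Editorial aside: a parameterised version might look like the standalone sketch below. The stripped-down CPUTLBEntry/CPUArchState types, the constants, and the is_write flag are illustrative stand-ins so the sketch compiles on its own; they are not QEMU's real definitions, and the real fast path also checks I/O flag bits that are omitted here.]

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for QEMU's real definitions */
#define TARGET_PAGE_BITS 10
#define TARGET_PAGE_MASK (~(target_ulong)((1 << TARGET_PAGE_BITS) - 1))
#define CPU_TLB_SIZE 256
#define NB_MMU_MODES 2
typedef uint64_t target_ulong;

typedef struct CPUTLBEntry {
    target_ulong addr_read;
    target_ulong addr_write;
    uintptr_t addend;     /* host minus guest address delta for RAM pages */
} CPUTLBEntry;

typedef struct CPUArchState {
    CPUTLBEntry tlb_table[NB_MMU_MODES][CPU_TLB_SIZE];
} CPUArchState;

/* Hypothetical parameterised lookup: is_write selects which comparator
 * field is checked, so one helper can serve both the load and the store
 * fast paths. Returns a host pointer on a RAM hit, NULL otherwise.
 */
static inline void *tlb_vaddr_to_host(CPUArchState *env, target_ulong addr,
                                      int is_write, int mmu_idx)
{
    int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
    CPUTLBEntry *e = &env->tlb_table[mmu_idx][index];
    target_ulong tlb_addr = is_write ? e->addr_write : e->addr_read;

    if ((addr & TARGET_PAGE_MASK) != (tlb_addr & TARGET_PAGE_MASK)) {
        return NULL;      /* TLB miss (or flag bits set in the entry) */
    }
    return (void *)(uintptr_t)(addr + e->addend);
}
```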
>> +void HELPER(dc_zva)(CPUARMState *env, uint64_t vaddr_in)
>> +{
>> + /* Implement DC ZVA, which zeroes a fixed-length block of memory.
>> + * Note that we do not implement the (architecturally mandated)
>> + * alignment fault for attempts to use this on Device memory
>> + * (which matches the usual QEMU behaviour of not implementing either
>> + * alignment faults or any memory attribute handling).
>> + */
>> +
>> + ARMCPU *cpu = arm_env_get_cpu(env);
>> + uint64_t blocklen = 4 << cpu->dcz_blocksize;
>> + uint64_t vaddr = vaddr_in & ~(blocklen - 1);
>> +
>> +#ifndef CONFIG_USER_ONLY
>> + {
>> + /* Slightly awkwardly, QEMU's TARGET_PAGE_SIZE may be less than
>> + * the block size so we might have to do more than one TLB lookup.
>> + * We know that in fact for any v8 CPU the page size is at least 4K
>> + * and the block size must be 2K or less, but TARGET_PAGE_SIZE is only
>> + * 1K as an artefact of legacy v5 subpage support being present in the
>> + * same QEMU executable.
>> + */
>> + int maxidx = DIV_ROUND_UP(blocklen, TARGET_PAGE_SIZE);
>> + void *hostaddr[maxidx];
>
> What's the maximum blocksize? Did you really need dynamic allocation here?
Max blocksize architecturally currently is 2K. Dynamic
allocation seemed cleaner code than hardcoding "this will
always have either 1 or 2 elements", though. (The field
in the "how big is a block?" register would allow more than
2K, since that is encoded as 0b1001 in a 4 bit field.)
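[Editorial aside: the arithmetic above can be checked standalone. With blocklen = 4 << dcz_blocksize, a 2K block is field value 9 (0b1001), the 4-bit field tops out at 4 << 15 = 128K, and with QEMU's 1K TARGET_PAGE_SIZE a 2K block needs exactly 2 hostaddr[] elements. The field values below are worked examples, not a claim about what any real CPU reports:]

```c
#include <stdint.h>

#define TARGET_PAGE_SIZE 1024   /* QEMU's legacy 1K value, per the discussion */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Block length in bytes from a DCZID_EL0.BS-style encoding */
static uint64_t dc_zva_blocklen(unsigned dcz_blocksize)
{
    return (uint64_t)4 << dcz_blocksize;
}

/* Number of TLB lookups (hostaddr[] elements) a block can span */
static unsigned dc_zva_maxidx(unsigned dcz_blocksize)
{
    return DIV_ROUND_UP(dc_zva_blocklen(dcz_blocksize), TARGET_PAGE_SIZE);
}
```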
>
>> + int try, i;
>> +
>> + for (try = 0; try < 2; try++) {
>> +
>> + for (i = 0; i < maxidx; i++) {
>> + hostaddr[i] = tlb_vaddr_to_host(env, vaddr + TARGET_PAGE_SIZE * i,
>> + cpu_mmu_index(env));
>> + if (!hostaddr[i]) {
>> + break;
>> + }
>> + }
>> + if (i == maxidx) {
>> + /* If it's all in the TLB it's fair game for just writing to;
>> + * we know we don't need to update dirty status, etc.
>> + */
>> + for (i = 0; i < maxidx - 1; i++) {
>> + memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
>> + }
>> + memset(hostaddr[i], 0, blocklen - (i * TARGET_PAGE_SIZE));
>> + return;
>> + }
>> + /* OK, try a store and see if we can populate the tlb. This
>> + * might cause an exception if the memory isn't writable,
>> + * in which case we will longjmp out of here. We must for
>> + * this purpose use the actual register value passed to us
>> + * so that we get the fault address right.
>> + */
>> + cpu_stb_data(env, vaddr_in, 0);
>> + /* Now we can populate the other TLB entries, if any */
>> + for (i = 0; i < maxidx; i++) {
>> + uint64_t va = vaddr + TARGET_PAGE_SIZE * i;
>> + if (va != (vaddr_in & TARGET_PAGE_MASK)) {
>> + cpu_stb_data(env, va, 0);
>> + }
>> + }
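[Editorial aside: the page-splitting fast path above can be modelled standalone. The sketch below uses hypothetical buffer-backed pages in place of real TLB-resolved host pointers; it shows that every page but the last is zeroed in full, and the final chunk is blocklen - i * TARGET_PAGE_SIZE bytes, which is the whole block when maxidx == 1:]

```c
#include <stdint.h>
#include <string.h>

#define TARGET_PAGE_SIZE 1024

/* Model of the fast path: zero a blocklen-byte block whose pages have
 * already been resolved to host pointers in hostaddr[0..maxidx-1].
 */
static void zero_block(void **hostaddr, int maxidx, uint64_t blocklen)
{
    int i;
    for (i = 0; i < maxidx - 1; i++) {
        memset(hostaddr[i], 0, TARGET_PAGE_SIZE);
    }
    /* i == maxidx - 1 here: the last (possibly partial) chunk */
    memset(hostaddr[i], 0, blocklen - (uint64_t)i * TARGET_PAGE_SIZE);
}
```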
>
> cpu_stb_data doesn't take into account user vs kernel mode accesses.
...so what does it use for the mmu index?
> Maybe
> better off using helper_ret_stb_mmu, and passing along GETRA().
OK.
> As a bonus, you'll have accurate exceptions should the access throw, so you
> don't need to force the save of PC before calling the helper. Which... I
> don't see you doing, so perhaps there's a bug here at the moment.
Mmm. (In system mode we'll save PC as a side effect of having
an accessfn defined for the DC_ZVA reginfo.)
thanks
-- PMM