
Re: [PATCH v4 2/2] target/riscv: Enable Zicbo[m,z,p] instructions


From: Frank Chang
Subject: Re: [PATCH v4 2/2] target/riscv: Enable Zicbo[m,z,p] instructions
Date: Mon, 7 Mar 2022 15:30:40 +0800

Christoph Müllner <cmuellner@linux.com> wrote on Thursday, February 17, 2022 at 12:00 PM:


On Thu, Feb 17, 2022 at 3:15 AM Weiwei Li <liweiwei@iscas.ac.cn> wrote:

On 2022/2/16 at 11:48 PM, Christoph Muellner wrote:
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 39ffb883fc..04500fe352 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -764,6 +764,10 @@ static Property riscv_cpu_properties[] = {
>       DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
>       DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
>       DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true),
> +    DEFINE_PROP_BOOL("zicbom", RISCVCPU, cfg.ext_icbom, true),
> +    DEFINE_PROP_BOOL("zicboz", RISCVCPU, cfg.ext_icboz, true),
> +    DEFINE_PROP_UINT16("cbom_blocksize", RISCVCPU, cfg.cbom_blocksize, 64),
> +    DEFINE_PROP_UINT16("cboz_blocksize", RISCVCPU, cfg.cboz_blocksize, 64),
Why use two different cache block sizes here? Is there any new spec
update for this?

No, we are talking about the same specification.

Section 2.7 states the following:
"""
The initial set of CMO extensions requires the following information to be discovered by software:
* The size of the cache block for management and prefetch instructions
* The size of the cache block for zero instructions
* CBIE support at each privilege level
"""

So at least the spec authors did differentiate between the two block sizes as well.
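
For what it's worth, the two properties can then be tuned independently
on the QEMU command line; a hypothetical invocation, assuming the
property names from the hunk above are accepted as-is:

    qemu-system-riscv64 -cpu rv64,zicbom=true,zicboz=true,\
        cbom_blocksize=64,cboz_blocksize=128 ...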
 
>       DEFINE_PROP_BOOL("Zfh", RISCVCPU, cfg.ext_zfh, false),
>       DEFINE_PROP_BOOL("Zfhmin", RISCVCPU, cfg.ext_zfhmin, false),
>       DEFINE_PROP_BOOL("Zve32f", RISCVCPU, cfg.ext_zve32f, false),
> +
> +/* helper_zicbom_access
> + *
> + * Check access permissions (LOAD, STORE or FETCH as specified in section
> + * 2.5.2 of the CMO specification) for Zicbom, raising either store
> + * page-fault (non-virtualised) or store guest-page fault (virtualised).
> + */
> +static void helper_zicbom_access(CPURISCVState *env, target_ulong address,
> +                                 uintptr_t ra)
> +{
> +    int ret;
> +    void* phost;
> +    int mmu_idx = cpu_mmu_index(env, false);
> +
> +    /* Get the size of the cache block for management instructions. */
> +    RISCVCPU *cpu = env_archcpu(env);
> +    uint16_t cbomlen = cpu->cfg.cbom_blocksize;
> +
> +    /* Mask off low-bits to align-down to the cache-block. */
> +    address &= ~(cbomlen - 1);
> +
> +    /* A cache-block management instruction is permitted to access
> +     * the specified cache block whenever a load instruction, store
> +     * instruction, or instruction fetch is permitted to access the
> +     * corresponding physical addresses.
> +     */
> +    ret = probe_access_range_flags(env, address, cbomlen, MMU_DATA_LOAD,
> +                                   mmu_idx, true, &phost, ra);
> +    if (ret == TLB_INVALID_MASK)
> +        ret = probe_access_range_flags(env, address, cbomlen, MMU_INST_FETCH,
> +                                       mmu_idx, true, &phost, ra);
> +    if (ret == TLB_INVALID_MASK)
> +        probe_access_range_flags(env, address, cbomlen, MMU_DATA_STORE,
> +                                 mmu_idx, false, &phost, ra);
> +}
> +
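For context: QEMU does not model a cache, so once this permission check
passes there is nothing left to do, and the instruction helpers can
simply funnel into the check above. A minimal sketch (the helper name
here is illustrative, not necessarily what the patch defines):

    void HELPER(cbom_clean_flush)(CPURISCVState *env, target_ulong address)
    {
        /* Raise the appropriate store fault if the block is inaccessible. */
        helper_zicbom_access(env, address, GETPC());

        /* No cache is modeled, so a permitted clean/flush is a no-op. */
    }
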


I think it's a little different here. probe_access_range_flags() may
trigger different exceptions for different access types. For example:
if the page for the address is executable and readable but not
writable, and the access cannot pass the PMP check for any access type,
it may trigger an access fault for the load/fetch accesses and a page
fault for the store access.

Just to be clear:
The patch does not trigger any fault for LOAD or FETCH, because nonfault is set
to true (the 6th argument of probe_access_range_flags()); a failed probe then
simply returns TLB_INVALID_MASK instead of raising an exception.
Only the last call to probe_access_range_flags(), with nonfault set to false,
can raise an exception.

Section 2.5.2 states the following:
"""
If access to the cache block is not permitted, a cache-block management
instruction raises a store page fault or store guest-page fault exception
if address translation does not permit any access or raises a store access
fault exception otherwise.
"""

In your scenario we have (1 ... allowed; 0 ... not allowed):
* read: perm:1, pmp:0
* fetch: perm:1, pmp:0
* write: perm:0, pmp:0

Address translation would allow read and fetch access, but PMP blocks that.
So the "does not permit any access" part is not satisfied, therefore we should raise a store page fault.

In fact, I can't predict what will happen, because the code in target/riscv/cpu_helper.c does
not really prioritize page faults over PMP faults: it returns whichever one is encountered first.

Hi Christoph,

May I ask what "page faults or PMP faults are not prioritized" means here?

In target/riscv/cpu_helper.c, if the "pmp_violation" flag is not set to true, a page fault will be picked.
So as long as TRANSLATE_PMP_FAIL is returned, the exception will be reported as a PMP fault.
(The only exception I can see is that TRANSLATE_PMP_FAIL may be converted to TRANSLATE_G_STAGE_FAIL
if it's the second-stage translation and the PMP check fails on a PTE's physical address.)
As the final PA is checked only after the page table has been walked,
shouldn't the "pmp_violation" flag be set only after all the translation accesses have been checked and granted?
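
For reference, the selection described above boils down to something
like this (a simplified paraphrase of raise_mmu_exception() in
target/riscv/cpu_helper.c, not the verbatim source):

    switch (access_type) {
    case MMU_DATA_STORE:
        cs->exception_index = pmp_violation ?
            RISCV_EXCP_STORE_AMO_ACCESS_FAULT :
            RISCV_EXCP_STORE_PAGE_FAULT;
        break;
    /* ... MMU_DATA_LOAD and MMU_INST_FETCH are handled likewise ... */
    }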

Regards,
Frank Chang
 

In order to model this properly, we would have to refactor cpu_helper.c to separate page permissions
from PMP. However, that seems a bit out of scope for a Zicbo* support patchset.
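
One conceivable shape for that refactoring, sketched with invented
names just to illustrate the idea: get_physical_address() could report
the two outcomes separately, so the CBO helper could evaluate the page
permissions for all three access types first and only afterwards
consult PMP, matching the spec's "does not permit any access" wording.

    /* Hypothetical split result -- not an existing QEMU API. */
    typedef struct {
        int page_ret; /* TRANSLATE_SUCCESS or TRANSLATE_FAIL     */
        int pmp_ret;  /* TRANSLATE_SUCCESS or TRANSLATE_PMP_FAIL */
    } TranslateResult;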

 

I think the final exception should be an access fault instead of the page
fault caused by probe_access_range_flags() with MMU_DATA_STORE.

Regards,

Weiwei Li

