From: Conor.Dooley
Subject: Re: [PATCH 0/3] Fix dt-validate issues on qemu dtbdumps due to dt-bindings
Date: Mon, 15 Aug 2022 19:18:02 +0000

Any takers on trashing my regex? Otherwise I'll just submit
a v2 with the regex and it can be shat on there instead :)

On 09/08/2022 19:36, Conor Dooley wrote:
> On 09/08/2022 15:14, Rob Herring wrote:
>> On Mon, Aug 08, 2022 at 10:01:11PM +0000, Conor.Dooley@microchip.com wrote:
>>> On 08/08/2022 22:34, Jessica Clarke wrote:
>>>> On Fri, Aug 05, 2022 at 05:28:42PM +0100, Conor Dooley wrote:
>>>>> From: Conor Dooley <conor.dooley@microchip.com>
>>>>> The final patch adds some new ISA strings
>>>>> which need scrutiny from someone with more knowledge about what ISA
>>>>> extension strings should be reported in a dt than I have.
>>>> Listing every possible ISA string supported by the Linux kernel really
>>>> is not going to scale...
>> How does the kernel scale? (No need to answer)
>>> Yeah, totally correct there. Case for adding a regex I suppose, but I
>>> am not sure how to go about handling the multi-letter extensions or
>>> if parsing them is required from a binding compliance point of view.
>>> Hoping for some input from Palmer really.
>> Yeah, looks like a regex pattern is needed.
> I started pottering away at this but I have arrived at:
> rv64imaf?d?c?h?(_z[imafdqcbvkh]([a-z])*)*$
> I suspect that before "h?" there should be more single letter
> extensions added for completeness' sake. So then it'd bloat out to:
> rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
> I checked a couple different "bad" isa strings against it and
> nothing went up in flames but my regex skills are far from great
> so I'm sure there are better ways to represent this.
> Anyways, this pattern is based on my understanding that:
> - the single letter order is fixed & we don't care about things that
>   can't even do "ima"
> - the multi letter extensions are all in a "_z<foo>" format where the
>   first letter of <foo> is a valid single letter extension
> - we don't care about the e extension from an OS PoV (this could be a
>   very flawed take...)
> - after the first two chars, the extension name could be an English
>   word (ifencei anyone?) so it's not worth restricting the charset
> - that attempting to validate the contents of the multiletter extensions
>   with dt-validate beyond the formatting is a futile, massively verbose
>   or unwieldy exercise at best
> Some or all of those assumptions could be very very wrong so if {someone,
> anyone} wants to correct me - feel ***more*** than free.
> Thanks,
> Conor.
> The patch would then look like:
> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> index d632ac76532e..1e54e7746190 100644
> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> @@ -74,9 +74,7 @@ properties:
>        insensitive, letters in the riscv,isa string must be all
>        lowercase to simplify parsing.
>      $ref: "/schemas/types.yaml#/definitions/string"
> -    enum:
> -      - rv64imac
> -      - rv64imafdc
> +    pattern: rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$
>    # RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
>    timebase-frequency: false
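
For anyone who wants to poke at the pattern outside of dt-validate, here is a minimal Python sketch. It assumes the validator applies "pattern" as an unanchored search, which is what JSON Schema specifies. The helper name isa_matches, the sample strings and ANCHORED_PATTERN are made up for illustration and are not part of the patch. The last sample shows that, with no leading "^", a string with junk in front of "rv64" still matches.

```python
import re

# Pattern copied verbatim from the proposed cpus.yaml change above.
ISA_PATTERN = r"rv64imaf?d?q?c?b?v?k?h?(_z[imafdqcbvkh]([a-z])*)*$"

# Hypothetical variant with a leading anchor; an assumption for comparison,
# not something the patch in this thread contains.
ANCHORED_PATTERN = "^" + ISA_PATTERN


def isa_matches(isa: str, pattern: str = ISA_PATTERN) -> bool:
    """Return True if `isa` satisfies `pattern` under search semantics.

    JSON Schema's "pattern" keyword is not implicitly anchored, so
    re.search (rather than re.fullmatch) is the closest equivalent.
    """
    return re.search(pattern, isa) is not None


# Illustrative inputs only; none of these come from a real dtb dump.
samples = [
    "rv64imac",                    # previously allowed by the enum
    "rv64imafdc",                  # previously allowed by the enum
    "rv64imafdch_zicsr_zifencei",  # single letters plus "_z<foo>" extensions
    "rv64imacfd",                  # single letters out of order
    "rv64ima_zxyz",                # first letter after "_z" is not a single-letter extension
    "xxrv64imac",                  # leading junk before "rv64"
]

for isa in samples:
    print(f"{isa:30} unanchored={isa_matches(isa)} "
          f"anchored={isa_matches(isa, ANCHORED_PATTERN)}")
```

If the leading junk case matters for the binding, prefixing the pattern with "^" is one way to reject it, but whether that is wanted is a question for the binding maintainers.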

