I've been hoping this sort of target-normalizing effort would occur at some point, so glad to see your email! I largely rewrote config.sub in recent years to make its parsing more systematic, and have tried to organize the platforms for my distro (NixOS) too, so the friction between various software here has bugged me for a while. I would try to rope in LLVM too, to really get this done once and for all. (Maybe we can make some structured, e.g. JSON, normalizations too :)).
That would be nice, though potentially a significant effort. I found the problem specifically with rustc: x86_64-pc-linux-gnu, the canonical host target for my local system according to config.guess, was not valid as-is when passed to rustc --target, so I wrote an m4 macro (however many lines it is now) to handle this. I have also noted issues in LLVM (cough x86_64-pc-elf -> x86_64-pc-unknown-elf).
I think the biggest sticking point will be how the 3rd and 4th components are handled. config.sub conventionally treats the 4th as the OS, and the 3rd as extra kernel info. LLVM treats the 3rd as the OS, and the 4th as extra ABI info. (I think this confusion arose due to different interpretations of "..linux-gnu"!)
The LLVM way appears to be winning, and thus config.sub now has some special cases to support it, but nothing systematic yet.
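To make the two readings concrete, here is a minimal sketch in Python. It only labels the fields of a full four-component name as described above; it is not how config.sub or LLVM actually parse targets, and the function names and dictionary keys are hypothetical, chosen purely for illustration.

```python
def split_four(target: str) -> list[str]:
    """Split a target name into exactly four dash-separated components."""
    parts = target.split("-")
    if len(parts) != 4:
        # Shorter or longer forms need real normalization, which this
        # sketch deliberately does not attempt.
        raise ValueError(f"expected four components, got {len(parts)}: {target!r}")
    return parts


def read_config_sub_style(target: str) -> dict[str, str]:
    """Conventional config.sub reading: 3rd = kernel info, 4th = OS."""
    cpu, vendor, third, fourth = split_four(target)
    return {"cpu": cpu, "vendor": vendor, "kernel": third, "os": fourth}


def read_llvm_style(target: str) -> dict[str, str]:
    """LLVM reading: 3rd = OS, 4th = extra ABI info."""
    cpu, vendor, third, fourth = split_four(target)
    return {"cpu": cpu, "vendor": vendor, "os": third, "abi": fourth}


if __name__ == "__main__":
    name = "x86_64-pc-linux-gnu"
    print(read_config_sub_style(name))
    print(read_llvm_style(name))
```

The same string yields a different "os" under each reading ("gnu" versus "linux"), which is exactly the ambiguity a shared normalization would have to settle.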
I always interpreted the second part of the system component as the environment, which is a mix of ABI, platform library, and sometimes the object format. This may be a reasonable middle ground between LLVM and config.sub. However, that may be something to debate in the normalization effort.
On Thu, Jul 22, 2021, at 12:08 PM, connor horman wrote:
Of the 167 targets that rustc accepts, 52 are not accepted by config.sub. Attached is the list of the unaccepted targets (unsupported). I have also attached the output from config.sub in the failing cases (comprehensive), as well as the script used to generate the list (test-rustc-targets).