Re: llvm on aarch64 builds very slowly


From: Christopher Baines
Subject: Re: llvm on aarch64 builds very slowly
Date: Wed, 23 Feb 2022 17:49:27 +0000
User-agent: mu4e 1.6.10; emacs 27.2

Ricardo Wurmus <rekado@elephly.net> writes:

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> Hi Guix,
>>
>> I had to manually run the build of llvm 11 on aarch64, because it would
>> keep timing out:
>>
>>     time guix build 
>> /gnu/store/0hc7inxqcczb8mq2wcwrcw0vd3i2agkv-llvm-11.0.0.drv --timeout=999999 
>> --max-silent-time=999999
>>
>> After more than two days it finally built.  This seems a little
>> excessive.  Towards the end of the build I saw a 1% point progress
>> increase for every hour that passed.
>>
>> Is there something wrong with the build nodes, are we building llvm 11
>> wrong, or is this just the way it is on aarch64 systems?
>
> I now see that gfortran 10 also takes a very long time to build.  It’s
> on kreuzberg (10.0.0.9) and I see that out of the 16 cores only *one* is
> really busy.  Other cores sometimes come in with a tiny bit of work, but
> you might miss it if you blink.
>
> Guix ran “make -j 16” at the top level, but the other make processes
> that have been spawned as children do not have “-j 16”.  There are
> probably 16 or so invocations of cc1plus, but only CPU0 seems to be busy
> at 100% while the others are at 0.
>
> What’s up with that?
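
One way to check what those child make processes were actually
invoked with is to read their command lines out of /proc; a quick
sketch, assuming /proc is mounted as usual:

  # Print the full command line of each running make process;
  # /proc/PID/cmdline is NUL-separated, so turn NULs into spaces.
  for pid in $(pgrep -x make); do
      tr '\0' ' ' < "/proc/$pid/cmdline"; echo
  done

If the child makes show up without -j, that would explain a single
busy core.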

Regarding the llvm derivation you mentioned [1], it looks like the
build for bordeaux.guix.gnu.org completed in around a couple of hours,
though that was on the 4-core Overdrive machine.

1: https://data.guix.gnu.org/gnu/store/0hc7inxqcczb8mq2wcwrcw0vd3i2agkv-llvm-11.0.0.drv

On the subject of the HoneyComb machines, I haven't noticed anything
like you describe with the one (hatysa) running behind
bordeaux.guix.gnu.org. Most cores are fully occupied most of the time,
with the 15m load average sitting around 16.

Some things to check, though: what does the load average look like when
you think the system should be using all its cores? If it's high but
there's not much CPU utilisation, that suggests the bottleneck is
somewhere else.
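
For example, something like this should show both (assuming procps is
installed; mpstat comes from the sysstat package):

  uptime             # 1, 5 and 15 minute load averages
  mpstat -P ALL 5 3  # per-core utilisation, three 5-second samples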

Also, what does the memory and swap usage look like? Hatysa has 32GB of
memory plus swap; ideally it would have 64GB of memory, since that
would mean it had to swap less often.
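
Something like this should show it (both tools come from procps):

  free -h   # memory and swap totals and usage, human readable
  vmstat 5  # watch the si/so columns for ongoing swap activity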

One problem I have observed with hatysa is storage
instability/performance issues. Looking in /var/log/messages, I see
entries like the following; maybe check yours for anything similar?

  nvme nvme0: I/O 0 QID 6 timeout, aborting
  nvme nvme0: I/O 1 QID 6 timeout, aborting
  nvme nvme0: I/O 2 QID 6 timeout, aborting
  nvme nvme0: I/O 3 QID 6 timeout, aborting
  nvme nvme0: Abort status: 0x0
  nvme nvme0: Abort status: 0x0
  nvme nvme0: Abort status: 0x0
  nvme nvme0: Abort status: 0x0
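
A quick way to look for similar entries (assuming the log is at that
path, as it is on Guix System):

  # Show the most recent NVMe-related kernel messages, if any.
  grep -i nvme /var/log/messages | tail -n 20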

Lastly, I'm not quite sure what thermal problems look like on ARM, but
maybe check the CPU temperatures. I see between 60 and 70 degrees
Celsius as reported by the sensors command, though that's with a
different CPU cooler.
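
Something like this should show them (sensors is from the lm-sensors
package; the sysfs fallback assumes the board exposes thermal zones,
which varies):

  sensors
  # Fallback: raw readings in millidegrees Celsius.
  cat /sys/class/thermal/thermal_zone*/temp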

Chris


