bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#49439: grafts cause “guix environment” to get killed with OOM


From: Ludovic Courtès
Subject: bug#49439: grafts cause “guix environment” to get killed with OOM
Date: Mon, 09 Aug 2021 23:26:03 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Hi!

Sarah Morgensen <iskarian@mgsn.dev> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> The seemingly endless CPU usage and unbound memory use comes from the
>> ‘build-accumulator’ build handler.  Here, the graph of ‘pigx-scrnaseq’
>> has many nodes, and many of them depend on, say, ‘r-rlang’.  Thus, when
>> ‘r-rlang’ is not in the store, the grafting code keeps asking for it by
>> calling ‘build-derivations’, which aborts to the build handler; the
>> build handler saves the .drv file name and the continuation and keeps
>> going.  But since the next package also depends on ‘r-langr’, we abort
>> again to the build handler, and so on.
>>
>> The end result is a very long list of <unresolved> nodes, probably of
>> this order in this case:
>>
>>   $ guix graph -t reverse-package r-rlang |grep 'label = "'|wc -l
>>   594
>>
>> Presumably, the captured continuations occupy quite a bit of memory,
>> hence the quick growth.
>>
>> I suppose one solution is to fire suspended builds when the build
>> handler sees a build request for a given derivation for the second time.
>> It needs more thought and testing…
>
> I wonder if instead it's possible to reduce the memory taken by the
> continuations? As someone who has absolutely no experience with the
> build/derivation system, it seems like all we *should* need to save is
> the derivation file name.

Continuations may take a bit of room but the main problem is that we’re
storing too many of them.  :-)

Roughly, I think we keep one <unresolved> node per incoming edge into a
package not in the store (‘r-rlang’ in the example above).  In a case
like ‘pigx-scrnaseq’, that’s a lot of items.

I added counters and ‘pk’ calls in ‘map/accumulate-builds’ & co. to see
what happens.  AFAICS, a cutoff as in the attached patch does the job.
It’s a basic heuristic to avoid unbounded growth, at the expense of
possibly reduced accumulation.  For example, here’s the effect on my
user profile with 300+ packages:

--8<---------------cut here---------------start------------->8---
$ # with cutoff
$ time GUIX_PROFILING=gc ./pre-inst-env guix upgrade -n

[...]


2,926.7 MB would be downloaded
Garbage collection statistics:
  heap size:        119.37 MiB
  allocated:        849.56 MiB
  GC times:         30
  time spent in GC: 2.88 seconds (30% of user time)

real    0m8.987s
user    0m9.482s
sys     0m0.186s
$ 
$ # without cutoff
$ time GUIX_PROFILING=gc ./pre-inst-env guix upgrade -n

[...]

3,402.5 MB would be downloaded
Garbage collection statistics:
  heap size:        127.37 MiB
  allocated:        1504.59 MiB
  GC times:         46
  time spent in GC: 5.31 seconds (29% of user time)

real    0m21.616s
user    0m18.082s
sys     0m0.255s
--8<---------------cut here---------------end--------------->8---

You can tell that, without the cutoff (2nd run), the command more
accurately finds out what it’s going to have to download since the
“would be downloaded” figure is higher; with the cutoff (1st run), it
“sees less” of what it’s going to end up downloading.  This could be
annoying from a UI viewpoint in “extreme” cases like this one, but most
likely the difference would be invisible in many cases.

More importantly, having this cutoff halves the execution time, which is
nice.

The command initially given in this bug report now behaves like this:

--8<---------------cut here---------------start------------->8---
$ time GUIX_PROFILING=gc  ./pre-inst-env guix environment pigx-scrnaseq 
--search-paths -n -v2
41.8 MB would be downloaded:
   /gnu/store/difgj9gf8l7y5bsilcl42x2vzgh493c6-r-utf8-1.2.2
   /gnu/store/z4rpk1sn3jhy8brsr30zccln8mira3z5-r-purrr-0.3.4
   /gnu/store/nkiga9rfd8a9gdfsp2rjcqz39s8p2hg3-r-magrittr-2.0.1
   /gnu/store/c5y0xlw1fbcwa5p5qnk4xji286hd7cqk-r-vctrs-0.3.8
   /gnu/store/86vz257dqy4vm6s7axq7zl2jqxacgngf-r-ellipsis-0.3.2
   /gnu/store/m2m7h497g5n6vdrp5pxsnr2v8hsriq5f-r-glue-1.4.2
   /gnu/store/0zv1sl91fz3ddq8namdfvf6yr0j1p2vx-texlive-bin-20190410
   /gnu/store/7ks38sv5cpg7x76g2fdp9cra3x7dpbyq-texlive-union-51265
   /gnu/store/vkdjhkl5s025jsjy7jrr9q7kxlsi2sc4-r-minimal-4.1.0
   /gnu/store/pysc6ixb5fj1m2n50dac6aq5lgd5bnv4-r-rlang-0.4.11
Garbage collection statistics:
  heap size:        127.31 MiB
  allocated:        621.97 MiB
  GC times:         24
  time spent in GC: 2.82 seconds (37% of user time)

real    0m6.493s
user    0m7.584s
sys     0m0.117s
--8<---------------cut here---------------end--------------->8---

This time it completes, which is already an improvement ;-).  The
41.8 MB download reported is underestimated, but again that’s the
tradeoff we’re making.

Attached is the patch.  I’ll go ahead with that if there are no
objections.

> For those interested, I've compiled a list of the top 60
> highest-dependency packages, as reported by package-closure:
>
> pigx                  1630
> nextcloud-client      1539
> akregator             1486
> kmail                 1484
> korganizer            1481
> kdepim-runtime        1480
> kincidenceeditor      1478
> keventviews           1477
> kmailcommon           1476
> kcalendarsupport      1475

Nice!  I really need to stop taking ‘libreoffice’ as an example.

Thanks,
Ludo’.

diff --git a/guix/store.scm b/guix/store.scm
index 1ab2b08b47..340ad8bc10 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -1358,11 +1358,27 @@ on the build output of a previous derivation."
 (define (map/accumulate-builds store proc lst)
   "Apply PROC over each element of LST, accumulating 'build-things' calls and
 coalescing them into a single call."
-  (define result
-    (map (lambda (obj)
-           (with-build-handler build-accumulator
-             (proc obj)))
-         lst))
+  (define accumulation-cutoff
+    ;; Threshold above which we stop accumulating unresolved nodes to avoid
+    ;; pessimal behavior.  See <https://bugs.gnu.org/49439>.
+    30)
+
+  (define-values (result rest)
+    (let loop ((lst lst)
+               (result '())
+               (unresolved 0))
+      (match lst
+        ((head . tail)
+         (match (with-build-handler build-accumulator
+                  (proc head))
+           ((? unresolved? obj)
+            (if (> unresolved accumulation-cutoff)
+                (values (reverse (cons obj result)) tail)
+                (loop tail (cons obj result) (+ 1 unresolved))))
+           (obj
+            (loop tail (cons obj result) unresolved))))
+        (()
+         (values (reverse result) lst)))))
 
   (match (append-map (lambda (obj)
                        (if (unresolved? obj)
@@ -1370,6 +1386,7 @@ coalescing them into a single call."
                            '()))
                      result)
     (()
+     ;; REST is necessarily empty.
      result)
     (to-build
      ;; We've accumulated things TO-BUILD.  Actually build them and resume the
@@ -1382,7 +1399,7 @@ coalescing them into a single call."
                                   ;; unnecessary.
                                   ((unresolved-continuation obj) #f)
                                   obj))
-                            result))))
+                            (append result rest)))))
 
 (define build-things
   (let ((build (operation (build-things (string-list things)

reply via email to

[Prev in Thread] Current Thread [Next in Thread]