[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: CPU and GC cost of bignums
From: |
Ludovic Courtès |
Subject: |
Re: CPU and GC cost of bignums |
Date: |
Thu, 06 Feb 2020 14:37:52 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) |
Hi!
Andy Wingo <address@hidden> skribis:
> Nice investigation! Perhaps slot-allocation should track live variables
> using something that's not bigints, but who knows.
Yeah I wondered; it’s not clear whether bitvectors would be more
efficient, for instance, although we could make it perhaps locally
imperative.
> On Wed 05 Feb 2020 17:29, Ludovic Courtès <address@hidden> writes:
>
>> /* The next three functions (custom_libgmp_*) are passed to
>> mp_set_memory_functions (in GMP) so that memory used by the digits
>> themselves is known to the garbage collector. This is needed so
>> @@ -237,19 +227,20 @@ finalize_bignum (void *ptr, void *data)
>> static void *
>> custom_gmp_malloc (size_t alloc_size)
>> {
>> - return scm_malloc (alloc_size);
>> + return scm_gc_malloc (alloc_size, "GMP");
>> }
>>
>> static void *
>> custom_gmp_realloc (void *old_ptr, size_t old_size, size_t new_size)
>> {
>> - return scm_realloc (old_ptr, new_size);
>> + return scm_gc_realloc (old_ptr, old_size, new_size, "GMP");
>> }
>>
>> static void
>> custom_gmp_free (void *ptr, size_t size)
>> {
>> - free (ptr);
>> + /* Do nothing: all memory allocated by GMP is under GC control and
>> + will be freed when needed. */
>> }
>
> I think this makes sense to me as a short-term fix. The down-side is
> that limbs can alias Scheme objects.
Yes.
To my surprise, on a pure bignum microbenchmark, this is
counterproductive:
--8<---------------cut here---------------start------------->8---
$ guile ~/src/guile-debugging/bignum-finalizers.scm # 3.0.0
clock utime stime cutime cstime gctime
2.42 6.20 0.17 0.00 0.00 5.62
heap size: 2.0 MiB
$ /data/src/guile-3.0/meta/guile ~/src/guile-debugging/bignum-finalizers.scm
clock utime stime cutime cstime gctime
3.97 10.91 0.15 0.00 0.00 10.60
heap size: 3.0 MiB
$ cat ~/src/guile-debugging/bignum-finalizers.scm
(use-modules (ice-9 time))
(time
(let loop ((n (expt 2 18))
(i 1))
(unless (zero? n)
;; (display ".")
(loop (- n 1)
(logior 0 (ash i 1))))))
(format #t "heap size: ~a MiB~%"
(round
(/ (assoc-ref (gc-stats) 'heap-size) (expt 2. 20))))
--8<---------------cut here---------------end--------------->8---
(Here we’re creating ~24 bignums, no more.)
I wonder if there’s another part of the story that I’m missing here.
Perf report for 3.0.0:
--8<---------------cut here---------------start------------->8---
46.93% guile libgc.so.1.3.6 [.] GC_mark_from
17.61% guile libgc.so.1.3.6 [.] GC_header_cache_miss
9.96% guile libgc.so.1.3.6 [.] GC_add_to_black_list_normal
5.20% guile libgmp.so.10.3.2 [.] __gmpn_lshift_coreisbr
4.13% guile libgc.so.1.3.6 [.] GC_find_header
2.28% guile libgc.so.1.3.6 [.] GC_finalize
2.09% guile libgc.so.1.3.6 [.] GC_base
--8<---------------cut here---------------end--------------->8---
With the patch:
--8<---------------cut here---------------start------------->8---
48.40% guile libgc.so.1.3.6 [.] GC_mark_from
17.74% guile libgc.so.1.3.6 [.] GC_header_cache_miss
11.90% guile libgc.so.1.3.6 [.]
GC_add_to_black_list_normal
4.45% guile libgc.so.1.3.6 [.] GC_find_header
2.31% guile libgmp.so.10.3.2 [.] __gmpn_lshift_coreisbr
2.30% guile libgc.so.1.3.6 [.] GC_base
1.73% guile libgc.so.1.3.6 [.] GC_finalize
--8<---------------cut here---------------end--------------->8---
IOW, the relative part of computations drops from 5% to 2%.
Thoughts?
> In the long-term I think we should be representing bignums as
> pointerless objects whose first word is the tag and a word count,
> followed by inline "limbs" (in the sense of
> https://gmplib.org/manual/Nomenclature-and-Types.html#Nomenclature-and-Types).
> Generally we can use the low-level API to work on these
> (https://gmplib.org/manual/Low_002dlevel-Functions.html#Low_002dlevel-Functions),
> and if we need to use mpz_t, we can easily create an mpz_t that points
> to these values.
Yes, that sounds like the right approach longer-term. Note that ‘mpz_t’
is exposed through “numbers.h”, which I guess means we cannot change
that in 3.0.x.
Thanks,
Ludo’.