[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: master 669aeaf: Prefer make_nil_vector to make-vector with nil
From: |
Paul Eggert |
Subject: |
Re: master 669aeaf: Prefer make_nil_vector to make-vector with nil |
Date: |
Sat, 15 Aug 2020 11:48:09 -0700 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 |
On 8/12/20 6:05 AM, Pip Cet wrote:
bugs like the one we currently have in fns.c (Fdelete allocates an
uninitialized vector, then calls Fequal, which can call quit, leaving
a half-initialized vector on the stack and potentially marked by
conservative GC) are bound to bite us one day
Good catch; I hadn't noticed this GC-related bug. I looked through the Emacs
source code for similar instances of the bug, and fixed the ones I found with
the first attached patch.
Particularly for
small-ish fixed-size vectors, it seems to me uninitialized vectors are
more trouble than they're worth (in fact, I could see make_vector (n,
Qnil) being faster on many CPUs, because the cache lines are written
to completely and don't have to be brought in from RAM.
I don't quite follow. If Fmake_vector (n, Qnil) invokes memset to clear the
memory, then it should work the same way as (make_uninit_vector followed by
immediate initialization) as far as the hardware cache is concerned. And if
Fmake_vector (n, Qnil) doesn't invoke memset but instead relies on calloc (which
in turn just mmaps /dev/zero or whatever), then Fmake_vector should be slower
than uninitialized vectors due to the cache effects that you mention. (If this
is a significant performance issue with Fmake_vector then we should fix it, but
that issue is independent of this discussion.)
Of the 40 places in *.c that use make_uninit_vector, only three look
like there might be a tangible benefit: alloc.c, fns.c, and pdumper.c
(but looking over that last one, I don't understand how
hash_table_contents is functionally different from Fcopy_sequence
(h->key_and_value) at this point, with the obvious mutation in
hash_table_rehash).
If we changed hash_table_contents to use Fcopy_sequence, wouldn't
hash_table_rehash become a bit slower? hash_table_rehash would need to look at
the entire sequence instead of just its first first h->count elements.
That being said, you're right that the Emacs code was using make_uninit_vector
too often, even aside from the GC bugs noted above. Inspired by your comment, I
went through the Emacs source and replaced all calls to make_uninit_vector (and
allocate_vector) that were easy to replace and didn't seem to make any
performance difference, by installing the second attached patch.
This patch isn't as aggressive as your comment suggested, though, as it doesn't
replace code like this from charset.c:
val = make_uninit_vector (8);
for (i = 0; i < 8; i++)
ASET (val, i, make_fixnum (code_space[i]));
Here, changing make_uninit_vector to make_nil_vector initializes unnecessarily,
there's no chance of GC before the vector is initialized, and readability is not
significantly improved by changing the code to something like the following:
val = CALLN (Fvector,
make_fixnum (code_space[0]), make_fixnum (code_space[1]),
make_fixnum (code_space[2]), make_fixnum (code_space[3]),
make_fixnum (code_space[4]), make_fixnum (code_space[5]),
make_fixnum (code_space[6]), make_fixnum (code_space[7]));
0001-Fix-GC-bugs-related-to-uninitialized-vectors.patch
Description: Text Data
0002-Prefer-Fvector-to-make_uninit_vector.patch
Description: Text Data