[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Proposal: block-based vector allocator

From: Stefan Monnier
Subject: Re: Proposal: block-based vector allocator
Date: Fri, 09 Dec 2011 16:04:26 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux)

>> Your test with the *compilation* seems to indicate otherwise, or am
>> I missing something?
> "Not too much" for byte-compile test. Since *compilation* buffer test
> allocates mostly an intervals, mmap() helps it regardless of the method
> used for vector allocation.

Thanks for the clarification.

> I don't understand your question.

It's really me who didn't understand your code.  For some reason
I thought you didn't just have segregated free lists, but also
segregated blocks (I guess the reason was that segregated blocks were
partly designed to reduce problems of fragmentation, and since we were
talking earlier about fragmentation...).

> It's _your_ proposal to divide objects on classes by rounding the size
> up to nearest magic value ('standard malloc algorithm').

Yours is also very standard, indeed (probably something like "best-fit
allocation with segregated free list").

> My proposal assumes that vector block can contains objects
> of any size, and I'm using segregated free lists for ..., 600, 608, 616, 624,
> 632, ... bytes free vectors.

BTW, your code seems to compute the free list index by dividing by
word_size (= 4 on 32bit systems) rather than by 8, so half of the slots
will always be unused, IIUC.

> Did you ever tried to obtain a histogram of vector sizes? Of course, it
> highly depends on the benchmark; but I'm pretty sure that 2..6 - Lisp_Objects
> vectors are very popular for the wide range of real workloads.  That's
> another reason why throwing away even 32 bytes is not so reasonable.

Using segregated blocks will waste some bytes because of rounding and
padding, but you can choose the percentage you're willing to waste.
So you may sometimes waste a 32B area, but it's only once per 4KB block,
so it is of no consequence.

Both segregated and non-segregated blocks have their advantages, both in
terms of CPU and memory use (e.g. the segregated blocks will waste some space
because of rounding, but they save space by removing the need for the
"next" (aka nbytes/buffer/vector) field).

> I'm scheduling the global benchmark for the night (32 vs. 64-bit
> binaries, GCPROs vs.stack marking, block-based vector allocation vs.
> current code, in all possible variants, 8 executables in total,
> 16 runs for each :-)).

Let us how it turns out.

> +#ifndef PAGE_SIZE
> +#define PAGE_SIZE 4096
> +#endif

If it's an arbitrary choice, it should not be called "page_size" but
"vector_block_size" or some such.  Otherwise, we should fetch its value
from the OS.

> +/* Vector blocks are aligned to page size, so it's easy to
> +   get vector block pointer from the vector pointer.  */

Where do you make use of this alignment?

> +  /* Although lisp_align_malloc should be used, the default
> +     1K granularity looks too small for vector allocation.  */

Yes, it would be nice to reuse it, indeed.

>  live_vector_p (struct mem_node *m, void *p)
>  {
> -  return (p == m->start && m->type == MEM_TYPE_VECTORLIKE);
> +  if (m->type == MEM_TYPE_VECTORLIKE)
> +    {
> +      if (m->end - m->start == VECTOR_BLOCK_BYTES)

This test looks dangerous (we if we allocate a large vector of this
size?).  I guess your code currently can't allocate such a "small large
vector", but if so, we should at least have a comment here.  And we
could trivially remove this assumption by adding a new

> +       /* Scan the whole vector block and see whether P points to
> +          the start of some vector which is not on a free list.  */
> +       while ((unsigned char *) vector < block->bump)
> +         {
> +           if ((vector->header.size & VECTOR_FREE_LIST_SIZE)
> +               == VECTOR_FREE_LIST_SIZE)
> +             vector = ADVANCE
> +               (vector, (vector->header.size & (PAGE_SIZE - 1)));
> +           else if (vector == p)
> +             return 1;
> +           else
> +             vector = ADVANCE (vector, vector->header.next.nbytes);
> +         }
> +     }

Hmm... I didn't think of it, but this is a disadvantage of
non-segregated blocks: during conservative scanning, you need to scan
the vector block to determine if we really have a live_vector_p whereas
segregated blocks would just divide the offset by the block's
vector size.

For small enough vector blocks, it's probably not a big deal, but for
10-slot vectors on 32bit systems with 4KB blocks, that's about
100 iterations.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]