

Re: volk and alignment

From: Johannes Demel
Subject: Re: volk and alignment
Date: Thu, 9 Jul 2020 14:53:56 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.8.0

Hi Thomas,

With AVX512, the largest alignment we need is 64 bytes. That is currently the largest value `volk_get_alignment` can return. Of course, that will change at some point in the future. So, for now, we could define this value as a constant and hope we update it as soon as we introduce our first kernel that uses AVX1024 or something NEON-related. The return values are defined in [0]. At the moment I'd favor `aligned_alloc` for aligned allocation, since all other implementations have shown issues at some point. Then again, relying on standardized functions breaks on non-compliant compilers.
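
To make the `aligned_alloc` option concrete, here is a minimal sketch. The constant `ASSUMED_MAX_ALIGNMENT` and the helpers `alloc_aligned` and `is_aligned` are hypothetical names for illustration, not part of VOLK, and the value 64 is simply the AVX512 figure mentioned above. It assumes a C11 library that actually provides `aligned_alloc`, which is exactly the caveat about non-compliant toolchains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 64 bytes, the AVX512 figure discussed in this thread. */
#define ASSUMED_MAX_ALIGNMENT 64

/* Hypothetical helper: allocate at least `bytes` bytes aligned to
 * `alignment`, which must be a power of two. Release with free(). */
static void* alloc_aligned(size_t bytes, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* C11 as published requires the size to be a multiple of the
     * alignment, so round up (no overflow check in this sketch). */
    size_t rounded = (bytes + alignment - 1) / alignment * alignment;
    if (rounded == 0) {
        rounded = alignment;
    }
    return aligned_alloc(alignment, rounded);
}

/* Returns nonzero if `p` is a multiple of `alignment`. */
static int is_aligned(const void* p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

A caller would use `alloc_aligned(n * sizeof(float), ASSUMED_MAX_ALIGNMENT)` and later `free()` the result.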

> ```
> char buf[alignment+bytes_needed];
> size_t adjust = (alignment - ((uintptr_t)buf % alignment)) % alignment;
> char* p = buf + adjust;
> ```
We had a similar implementation for allocating aligned buffers as a fallback when none of the other methods were available. Of course, this broke on some machines, although those issues came up with heap allocation.
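
For readers unfamiliar with this kind of fallback, the usual heap variant over-allocates with `malloc`, rounds the pointer up, and stashes the raw pointer just in front of the aligned block. The sketch below shows that general technique only. It is not the code VOLK had, and `fallback_aligned_malloc` and `fallback_aligned_free` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fallback: over-allocate, align by hand, remember the raw
 * pointer in the slot directly before the aligned block. */
static void* fallback_aligned_malloc(size_t bytes, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* Keep the stashed pointer itself properly aligned. */
    if (alignment < sizeof(void*)) {
        alignment = sizeof(void*);
    }
    /* Payload + worst-case padding + room for the raw pointer. */
    void* raw = malloc(bytes + alignment - 1 + sizeof(void*));
    if (raw == NULL) {
        return NULL;
    }
    uintptr_t start = (uintptr_t)raw + sizeof(void*);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    ((void**)aligned)[-1] = raw;
    return (void*)aligned;
}

/* Must be used instead of free() for pointers from the function above. */
static void fallback_aligned_free(void* p)
{
    if (p != NULL) {
        free(((void**)p)[-1]);
    }
}
```

The weak spot is that the returned pointer must never be passed to plain `free()`, which is one way such hand-rolled schemes tend to break.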

`alignas()` would solve this issue if we always used this maximum alignment. Do all compilers we target support `alignas`?
What are the pros and cons?
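
As a rough illustration of the `alignas` route, the sketch below uses C11 `<stdalign.h>` with a compile-time constant. `ASSUMED_MAX_ALIGNMENT`, `scratch_is_aligned`, and `constant_covers` are hypothetical names, and 64 is an assumption taken from the AVX512 figure above. A real program would presumably compare the constant against `volk_get_alignment()` once at startup, which the second helper stands in for by taking the runtime value as a parameter.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: worst-case alignment as an integer constant expression. */
#define ASSUMED_MAX_ALIGNMENT 64
#define SCRATCH_FLOATS 16

/* Declares an over-aligned local buffer and reports whether its address
 * really is a multiple of the requested alignment. */
static int scratch_is_aligned(void)
{
    alignas(ASSUMED_MAX_ALIGNMENT) float scratch[SCRATCH_FLOATS] = { 0 };
    return ((uintptr_t)scratch % ASSUMED_MAX_ALIGNMENT) == 0;
}

/* Sanity check: does the hard-coded constant still cover the alignment
 * reported at run time (passed in by the caller)? */
static int constant_covers(size_t runtime_alignment)
{
    return runtime_alignment <= ASSUMED_MAX_ALIGNMENT;
}
```

The pro is zero allocation cost. The con is that the constant silently becomes too small once a wider architecture is added, hence the runtime check.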


[0] https://github.com/gnuradio/volk/blob/91e5d073532ea6c516d9984e8d4dfcc645fddac8/gen/archs.xml#L291

On 08.07.20 21:48, thomas@habets.se wrote:
On Wed, 8 Jul 2020 18:09:30 +0100, "Marcus Müller" <mueller@kit.edu> said:
  > Is there a maximum size that volk_get_alignment could return, a size
  > that's reasonable?

I'd go with "realistically, yes, but isn't relying on that a bad idea?".

Yes, it does sound like a bad idea. :-)
Really I'm looking to solve the problem, not a specific solution.

I'm thinking back and forth about how to address that problem.
Basically, what we'd need is a "worst case of all available machines"
alignment, that is present in an integer constant expression, so you can
put it into alignas(), right?


It's not my field, but surely 4 KiB will align everything? On the other
hand, of course, stepping into a new page may incur a page fault,
which could cost more than even using `volk::vector`, which may incur
an allocation but usually won't incur a context switch.

I suppose a mere dynamic stack alloc would do just fine:

char buf[alignment+bytes_needed];
size_t adjust = (alignment - ((uintptr_t)buf % alignment)) % alignment;
char* p = buf + adjust;

(except making sure that the pointer arithmetic doesn't cause UB. Off
the top of my head I don't know the right types to use)

Not very nice with two mod ops each time this is needed, though. For
the linked PR, this would happen for every sample.
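
If the alignment is always a power of two (an assumption, though it holds for the SIMD alignments discussed here), the two `%` operations can be replaced by a negate and a mask. The sketch below puts both forms side by side. `adjust_mod` and `adjust_mask` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The two-modulo form: bytes to add to `addr` to reach the next
 * multiple of `alignment` (0 if already aligned). */
static size_t adjust_mod(uintptr_t addr, size_t alignment)
{
    return (size_t)((alignment - (addr % alignment)) % alignment);
}

/* Same result without division, valid only for power-of-two alignments:
 * (-addr) mod alignment, computed with unsigned wraparound and a mask. */
static size_t adjust_mask(uintptr_t addr, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size_t)(((uintptr_t)0 - addr) & (uintptr_t)(alignment - 1));
}
```

Compilers often perform this strength reduction themselves when the alignment is a compile-time constant, but not when it comes from a runtime call.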

Another option is a thread-local stack, which would make
allocs/deallocs very cheap, assuming all use cases of this would be
for local variables.
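
A thread-local stack along these lines could look roughly like the following. This is only a sketch under assumptions: the pool size of 4096 bytes and the alignment of 64 are arbitrary, and `scratch_push`, `scratch_mark`, and `scratch_release` are hypothetical names. It relies on C11 `_Thread_local` and `alignas`.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define SCRATCH_ALIGNMENT 64   /* assumption: worst-case alignment */
#define SCRATCH_CAPACITY 4096  /* assumption: arbitrary pool size */

/* One over-aligned pool and one stack pointer per thread. */
static _Thread_local alignas(SCRATCH_ALIGNMENT)
    unsigned char scratch_pool[SCRATCH_CAPACITY];
static _Thread_local size_t scratch_top = 0;

/* Current stack position; pass it to scratch_release() later. */
static size_t scratch_mark(void)
{
    return scratch_top;
}

/* Bump-allocate `bytes` bytes; every block starts on an aligned
 * boundary. Returns NULL when the pool is exhausted. */
static void* scratch_push(size_t bytes)
{
    if (bytes > SCRATCH_CAPACITY) {
        return NULL;
    }
    size_t rounded = (bytes + SCRATCH_ALIGNMENT - 1)
                     / SCRATCH_ALIGNMENT * SCRATCH_ALIGNMENT;
    if (rounded > SCRATCH_CAPACITY - scratch_top) {
        return NULL;
    }
    void* p = scratch_pool + scratch_top;
    scratch_top += rounded;
    return p;
}

/* Pop everything allocated since `mark` was taken. */
static void scratch_release(size_t mark)
{
    assert(mark <= scratch_top);
    scratch_top = mark;
}
```

Allocation and release are each a few integer operations, but blocks must be released in strict LIFO order, which matches the local-variable use case.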

typedef struct me_s {
   char name[]      = { "Thomas Habets" };
   char email[]     = { "thomas@habets.se" };
   char kernel[]    = { "Linux" };
   char *pgpKey[]   = { "http://www.habets.pp.se/pubkey.txt"; };
   char pgp[] = { "9907 8698 8A24 F52F 1C2E  87F6 39A4 9EEA 460A 0169" };
   char coolcmd[]   = { "echo '. ./_&. ./_'>_;. ./_" };
} me_t;
