
Re: [Discuss-gnuradio] gr::buffer::allocate_buffer: warning

From: Michael Dickens
Subject: Re: [Discuss-gnuradio] gr::buffer::allocate_buffer: warning
Date: Tue, 21 Apr 2015 15:36:30 -0400

On Tue, Apr 21, 2015, at 03:12 PM, Marcus Müller wrote:
> By the way: This currently *is* getting more interesting: Because you
> typically don't want to copy memory needlessly in a
> performance-critical application, it's bad that blocks that wrap some
> kind of accelerator (GPU, FPGA card, DSP core...) can't define where
> their buffers are -- so there's work going on in the coprocessors
> working group (Doug Geiger is the person to ask, I guess) to allow
> single blocks to define their own special buffers.

Doug Geiger has led the CoProc working-group (WG) effort for a while,
but he has been short on time recently, as have most of the usual
candidates for this work. I -might- pick up the torch in May, if/as my
time allows; we'll see. Visible demand for this work would help.

The CoProc work is basically to create egress and ingress base blocks
that provide their own specialized buffers. They would be able to use
the current double-buffered type, or a single buffer if that's all
that's available.

*** The concepts of the double buffer that we currently use
include (assuming the request was for a buffer of N items):

+ good: we can always guarantee that N items are available for R/W, no
  matter where the R/W pointers are, by allocating the buffers somewhat
  larger (2x) than the request (and, rounded up to the nearest
  pagesize() boundary);
+ good: buffer wrap -- when the R/W pointers are moved -- is a
  simple remainder computation; no memcpy or even branching required;
- bad: not all OSs / hardware easily provide these buffer types, or
  getting them requires root access, or whatnot.
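
To illustrate the two "good" points above, here is a minimal Python
sketch. It is only a model: a real double buffer relies on the OS
mapping the same pages twice, back to back, and this code merely
imitates that mirroring. The names round_up_to_page, advance and
MirroredBufferModel are made up for this example and are not GNU Radio
API.

```python
import mmap


def round_up_to_page(nbytes, page=mmap.PAGESIZE):
    """Round a byte count up to the next multiple of the page size."""
    return -(-nbytes // page) * page


def advance(index, n, size):
    """Move a read or write index forward by n; the wrap is one remainder."""
    return (index + n) % size


class MirroredBufferModel:
    """Toy model (hypothetical) of a double-mapped circular buffer.

    In a real double-mapped buffer, offset i and offset i + size refer
    to the same memory, so an access that starts near the end simply
    runs on into the mirror. The modulo below stands in for that
    mapping; real code would not need it on the data path.
    """

    def __init__(self, size):
        self.size = size
        self._store = bytearray(size)

    def write(self, start, data):
        # One logically contiguous write, even if it crosses the end.
        for k, byte in enumerate(data):
            self._store[(start + k) % self.size] = byte

    def read(self, start, n):
        # One logically contiguous read, even if it crosses the end.
        return bytes(self._store[(start + k) % self.size] for k in range(n))
```

With size 8 and a write index of 6, writing 4 bytes crosses the end of
the buffer, yet the caller sees a single contiguous region, and the
index update is just advance(6, 4, 8).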

*** The concepts of the single buffer include:

+ good: really easy to guarantee that N items are available for R/W
- bad: have to use memcpy eventually for buffer wrap; but, can mitigate
  this issue by allocating the buffer to be, say, 10x larger than
  needed so that memcpy happens only about 1 time in 10; we don't in
  general want to allocate large buffers all of the time, but if this
  is the way to get GR working on some systems then it's a good option
  to have in place; will require branching somewhere in the process;
+ good: can use any memory, anywhere, which makes it more portable
  across OSs & hardware.
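
A rough Python sketch of the single-buffer idea, assuming a plain
linear buffer that is "factor" times larger than the request and that
moves any unread data back to the front only when a write would run
off the end. The class SingleBufferModel and its members are
hypothetical names for illustration, not the CoProc WG design or GNU
Radio API.

```python
class SingleBufferModel:
    """Toy model (hypothetical) of an oversized single buffer."""

    def __init__(self, n_items, factor=10):
        self.capacity = n_items * factor
        self._store = bytearray(self.capacity)
        self.r = 0            # read index
        self.w = 0            # write index
        self.compactions = 0  # how often the memcpy-style move ran

    def _make_room(self, n):
        # This is the branch mentioned above: only when the write
        # would pass the end do we pay for the copy.
        if self.w + n > self.capacity:
            unread = self.w - self.r
            # Stand-in for memcpy/memmove of the unread items.
            self._store[0:unread] = self._store[self.r:self.w]
            self.r, self.w = 0, unread
            self.compactions += 1

    def write(self, data):
        if len(data) > self.capacity - (self.w - self.r):
            raise BufferError("not enough free space")
        self._make_room(len(data))
        self._store[self.w:self.w + len(data)] = data
        self.w += len(data)

    def read(self, n):
        n = min(n, self.w - self.r)
        out = bytes(self._store[self.r:self.r + n])
        self.r += n
        return out
```

With n_items=4 and factor=3, writing and reading 4 items at a time
triggers the copy on roughly every third write; a larger factor makes
the copy rarer at the cost of more memory.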

*** egress is the transport from the local CPU/memory to the CoProc,
which might be as simple as using the same shared memory & not even
having to memcpy / DMA; or, it might be more complicated, such as using
OpenCL to set up a memory map, moving the data over, then closing the

*** ingress is the transport from the CoProc to local CPU/memory; just
the reverse of egress.

*** The WG decided to limit the use cases for now to just these 2; if
this work does get done, and there's a need for CoProc blocks where the
scheduler is involved, then we'll address those cases at that time.

Anyway, those are the basic ideas. I'd love to hear more discussion,
and to hear from folks interested in having this change in place. - MLD
