From: Dr. Jürgen Sauermann
Subject: Re: [Bug-apl] segfault when using 'CORE_COUNT_WANTED' configure flag
Date: Thu, 17 Oct 2019 16:48:30 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:60.0) Gecko/20100101 Thunderbird/60.6.1
As a matter of fact, the loops in my benchmarks are small, but
the data on which these small loops operate is not. In practice this
means that all instructions run from the instruction cache (with an
instruction cache hit rate of 100%) while at the same time the data cache
hit rate is low.
A parallel APL program that suits the boundary conditions of both caches
would have a small code footprint (a short APL loop to fit the instruction
cache) and at the same time operate on a few APL variables of small size
(to fit the data cache). Although one could probably construct such a
program for the sole purpose of benchmarking, its benefit would be limited
to marketing the interpreter rather than speeding up real-life programs.
I am still waiting for the point in time when memory (not only caches) comes
with the CPU (like numeric co-processors in the 1990s); then it will be time to
reconsider parallel APL.
On 10/17/19 12:57 PM, Blake McBride wrote: