guile-devel

Re: fibers,questions about thread id and mutation of vectors


From: Damien Mattei
Subject: Re: fibers,questions about thread id and mutation of vectors
Date: Tue, 17 Jan 2023 10:42:33 +0100

Hello Maxime,
it runs fastest with your idea: as you said, scm_init_guile() only needs to be called once per thread.


On Fri, Jan 13, 2023 at 1:23 PM Maxime Devos <maximedevos@telenet.be> wrote:
>   for (i=start; i<=stop; i++)  { /* i is private by default */
>
>     scm_init_guile();
>     scm_call_1( func , scm_from_int(i) );

IIUC, you are calling scm_init_guile once per index, whereas calling it

Yes, OpenMP slices a 1-to-N for loop into number_of_cpus segments of about N/number_of_cpus iterations each, every segment being a normal C for loop, and runs one segment per CPU. So if you run the 'top' command on a C OpenMP program you will see a load of number_of_cpus*100%.
For example, with 12 CPUs top will display a load of 1200% for your program. Furthermore, if you hit the 1 key, top shows the load of each CPU (100% each). The same option does not exist in the 'top' of BSD-like systems such as Mac OS.

OpenMP partitions the N iterations and runs each part on exactly one thread, each thread on a different CPU or core. I think it is the only library that can do that; OpenMP is implemented very close to the compiler (and LLVM).
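To make the slicing visible, here is a minimal sketch that records which thread ran each iteration of a statically scheduled loop. The helper names record_owners and print_slices are hypothetical (they are not part of OpenMP or of my code), and I assume schedule(static), which is a common default but not one the specification guarantees.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: stores in owner[] the number of the thread that
   ran each iteration of start..stop (inclusive).
   owner must have room for stop - start + 1 entries. */
static void record_owners(int start, int stop, int *owner)
{
    int i;
    assert(start <= stop);
#pragma omp parallel for schedule(static)
    for (i = start; i <= stop; i++) {   /* i is private by default */
#ifdef _OPENMP
        owner[i - start] = omp_get_thread_num();
#else
        owner[i - start] = 0;           /* built without OpenMP: one thread */
#endif
    }
}

/* Prints one line per contiguous run of iterations owned by the same
   thread and returns the number of such runs. */
static int print_slices(int start, int stop, const int *owner)
{
    int slices = 0, first = start, i;
    for (i = start; i <= stop; i++) {
        /* a run ends at the last index or where the owner changes */
        if (i == stop || owner[i - start + 1] != owner[i - start]) {
            printf("thread %d: iterations %d..%d\n",
                   owner[i - start], first, i);
            first = i + 1;
            slices++;
        }
    }
    return slices;
}
```

Built with gcc -fopenmp, calling record_owners(1, 100, owner) and then print_slices(1, 100, owner) should print one line per participating thread; built without -fopenmp the pragma is ignored and a single slice is printed.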

In general there is a master thread and slave threads, or you can run special code only on the first thread to reach it (the master one, or the first to launch). On Friday I tried the single pragma:
but unfortunately that cannot help, because it runs only on the first thread.

A solution to the problem could be this one:

Executing Code Once Per Thread in an OpenMP Loop


but it is for Visual C++, and it would not be compatible with g++.

So I use a basic C solution: a static array that remembers whether scm_init_guile() has already been called for the thread the code is currently running on.
I also store omp_get_max_threads() in a static variable, as openmp() is called many times in my code and the number of available hardware CPUs never changes.
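The idea can be sketched as follows. This is not the linked code, only a minimal illustration: per_thread_init is a hypothetical stand-in for scm_init_guile() so that the sketch builds without libguile, and openmp_for, add_index, init_calls and total are names invented for the example. It assumes that the thread count does not grow after the first call and that an OpenMP thread number keeps mapping to the same OS thread across parallel regions, which implementations usually do but the specification does not promise.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallbacks so the sketch also builds without OpenMP (single thread). */
static int omp_get_max_threads(void) { return 1; }
static int omp_get_thread_num(void)  { return 0; }
#endif

static int  max_threads = 0;     /* cached omp_get_max_threads() */
static int *initialized = NULL;  /* initialized[t] is 1 once thread t did its setup */
static int  init_calls  = 0;     /* how many times the setup actually ran */
static long total       = 0;     /* accumulator used by the sample body */

/* Hypothetical stand-in for scm_init_guile(). */
static void per_thread_init(void)
{
#pragma omp atomic
    init_calls++;
}

/* Sample loop body: adds the index to a shared total. */
static void add_index(int i)
{
#pragma omp atomic
    total += i;
}

/* Runs body(i) for i in start..stop, doing the per-thread setup at most
   once per OpenMP thread number over the life of the program. */
static void openmp_for(int start, int stop, void (*body)(int))
{
    int i;
    if (initialized == NULL) {   /* first call, still single-threaded */
        max_threads = omp_get_max_threads();
        initialized = calloc((size_t)max_threads, sizeof *initialized);
        assert(initialized != NULL);
    }
#pragma omp parallel for
    for (i = start; i <= stop; i++) {
        int t = omp_get_thread_num();
        if (!initialized[t]) {   /* each thread touches only its own slot */
            per_thread_init();
            initialized[t] = 1;
        }
        body(i);
    }
}
```

Calling openmp_for(1, 1000, add_index) twice runs the setup at most max_threads times in total, however many iterations there are. A thread-local flag (C11 _Thread_local) would avoid relying on the thread-number assumption.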

the code is here:

Unfortunately I find no real speed-up. I understood that the only reason for the speed-up I saw is that the C 'for loop is much faster than the Scheme 'for ones.

To reach this conclusion I compared Scheme, C with OpenMP and C without OpenMP, and in both C versions I got almost exactly the same timings:

Scheme:
... [output cut]
Chrono START number: 165 minterms-vector-length = 10944. chrono STOP : elapsedTime = 36.219 ms.totalComputationTime =485311.94
Chrono START number: 166 minterms-vector-length = 12008. chrono STOP : elapsedTime = 39.82 ms.totalComputationTime =485351.76
Chrono START number: 167 minterms-vector-length = 342. chrono STOP : elapsedTime = 1.215 ms.totalComputationTime =485352.97500000003

Scheme with OpenMP call:
...[output cut]
Chrono START number: 165 minterms-vector-length = 10944. chrono STOP : elapsedTime = 35.039 ms.Open MP totalComputationTime =385444.1410000001
Chrono START number: 166 minterms-vector-length = 12008. chrono STOP : elapsedTime = 37.792 ms.Open MP totalComputationTime =385481.93300000014
Chrono START number: 167 minterms-vector-length = 342. chrono STOP : elapsedTime = 1.163 ms.Open MP totalComputationTime =385483.09600000014



Scheme with C 'for loop call:
...[output cut]
Chrono START number: 165 minterms-vector-length = 10944. chrono STOP : elapsedTime = 33.104 ms.For Funct totalComputationTime =385543.4700000001
Chrono START number: 166 minterms-vector-length = 12008. chrono STOP : elapsedTime = 35.938 ms.For Funct totalComputationTime =385579.4080000001
Chrono START number: 167 minterms-vector-length = 342. chrono STOP : elapsedTime = 1.165 ms.For Funct totalComputationTime =385580.5730000001

In the C versions (parallel OpenMP and sequential for) the result
is almost the same:
totalComputationTime =385580.5730000001 ms
totalComputationTime =385483.09600000014 ms
=385 s

I suppose OpenMP works well at slicing the loop across many processors, but the scm_call_1( func , scm_from_int(i) ); calls
all run on the same thread, the one that hosts the Guile interpreter.
A solution would be to have many Guile interpreters running, but I do not know how to do that from C code with OpenMP.

Damien

Note: I measured time in both C and Scheme with gettimeofday code, to compare the 100% Scheme code with the mixed one:
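For the C side, a wall-clock measurement with gettimeofday() can be sketched like this. It is only an illustration of the technique, not the linked code; elapsed_ms, time_call_ms and busy_work are hypothetical names. Note that gettimeofday() reports wall-clock time, so the value can be disturbed if the system clock is adjusted during the run.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Milliseconds elapsed between two gettimeofday() samples. */
static double elapsed_ms(const struct timeval *t0, const struct timeval *t1)
{
    return (t1->tv_sec - t0->tv_sec) * 1000.0
         + (t1->tv_usec - t0->tv_usec) / 1000.0;
}

/* Sample workload: a loop the compiler may not remove. */
static void busy_work(void)
{
    volatile long sink = 0;
    long k;
    for (k = 0; k < 1000000L; k++)
        sink += k;
}

/* Calls work() once and returns the wall-clock time it took, in ms. */
static double time_call_ms(void (*work)(void))
{
    struct timeval t0, t1;
    int rc0, rc1;
    rc0 = gettimeofday(&t0, NULL);
    work();
    rc1 = gettimeofday(&t1, NULL);
    assert(rc0 == 0 && rc1 == 0);
    return elapsed_ms(&t0, &t1);
}
```

Summing the values returned by time_call_ms over all runs gives a total comparable to the totalComputationTime column above.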

