guile-user

Re: map-par slower than map


From: Damien Mattei
Subject: Re: map-par slower than map
Date: Sat, 22 Oct 2022 18:01:22 +0200

Just for fun: I have found that the parallelized results are actually wrong :-) even for low indices.
The non-parallel result, C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4)),
is different from the parallelized result below:
C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))

I have to check things again. I think futures are easy to use but limited when it comes to solving tricky parallel problems, such as those mutating a global variable like a hash table...
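One plausible culprit is unsynchronized mutation of that shared hash table. A minimal sketch (hypothetical names, not the code from the thread) of serializing writes with a mutex so that several Guile futures do not race on the table:

```scheme
;; Hypothetical sketch: protect a shared hash table with a mutex so that
;; several futures can write to it without racing on the same buckets.
(use-modules (ice-9 threads)   ;; make-mutex, with-mutex
             (ice-9 futures))  ;; future, touch

(define table (make-hash-table))
(define table-mutex (make-mutex))

(define (record-result! key value)
  ;; with-mutex ensures only one future mutates the table at a time
  (with-mutex table-mutex
    (hash-set! table key value)))

;; launch one future per item, then wait for all of them
(define fs
  (map (lambda (i) (future (record-result! i (* i i))))
       (iota 100)))
(for-each touch fs)
(hash-ref table 7) ;; => 49
```

The mutex of course serializes the writes, so it trades some of the speedup for correctness; a lock-free alternative (as used in the thread) is to give each future its own slot in a pre-allocated vector.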

Damien



On Mon, Oct 17, 2022 at 3:17 PM Damien Mattei <damien.mattei@gmail.com>
wrote:

> Hello,
>
> Sorry for my late answer (I wrote the code Friday but could only debug it
> today, being busy and traveling last weekend).
>
> I modified my code to work with 'futures', and the speed is now incredible
> !!! (and it computes the exact result)
> The speedup even seems greater than the number of CPUs (6):
> cat /proc/cpuinfo  | grep processor
> processor : 0
> processor : 1
> processor : 2
> processor : 3
> processor : 4
> processor : 5
>
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1900
>
> (declare minterms-vector unified-minterms-vector-1)
>
> (define (funct-unify-minterms-set-1-unit-future set1 set2)
>
>   {function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))}
>
>   ;; note : sorting is useless
>
>   ;; set multiplication : create a list of pairs of minterms
>   ;; (alternatives: product-set-with-set-imperative, product-set-with-set, associate-set-with-set)
>   {minterms-set <+ (product-set-with-set-imperative-sorted set1 set2)}
>
>   {minterms-vector <- (list->vector minterms-set)}
>
>   {minterms-vector-length <+ (vector-length minterms-vector)}
>
>   {nb-procs <+ (current-processor-count)}
>
>   {segmts <+ (segment 0 {minterms-vector-length - 1} nb-procs)} ;; compute the segments
>
>   {unified-minterms-vector-1 <- (make-vector minterms-vector-length #f)}
>
>   (run-in-parallel segmts proc-unify-minterms-seg) ;; run the parallel code
>
>   {unified-minterms-set-1 <+ (vector->list unified-minterms-vector-1)}
>
>   {unified-minterms-set-2 <+ (filter (λ (x) x) unified-minterms-set-1)} ;; remove #f results
>
>   {unified-minterms-set <+ (remove-duplicates-sorted unified-minterms-set-2)} ;; uniq
>
>   unified-minterms-set)
>
> I kept your 'segment' definitions to index an array of data after
> converting it from a list to a vector (list->vector), which probably
> consumes additional time on big data lists (I had lists of more than
> 1000000 elements at some point).
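A hypothetical sketch of what such a 'segment' helper might look like (an assumed shape, not the actual definition from segment.scm): split the index range [start, end] into n roughly equal contiguous segments, with segment-start/segment-end as accessors.

```scheme
;; Assumed representation: a segment is a (start . end) pair of indices.
(define (segment-start seg) (car seg))
(define (segment-end seg) (cdr seg))

;; Split [start, end] into n contiguous segments; the first (remainder)
;; segments get one extra element so the sizes differ by at most one.
(define (segment start end n)
  (let* ((len (+ 1 (- end start)))
         (size (quotient len n))
         (rest (remainder len n)))
    (let loop ((i 0) (s start) (acc '()))
      (if (= i n)
          (reverse acc)
          (let ((e (+ s size -1 (if (< i rest) 1 0))))
            (loop (+ i 1) (+ e 1) (cons (cons s e) acc)))))))

(segment 0 9 3) ;; => ((0 . 3) (4 . 6) (7 . 9))
```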
>
> I used a simplified version of run-in-parallel (one that does not 'reduce'
> after processing the data):
>
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1794
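The simplified run-in-parallel might look roughly like this (an assumed sketch based on the description above, not the actual linked code): one future per segment, then touch each future to wait for it. Results land in a shared pre-allocated vector, which is why no reduce step is needed.

```scheme
(use-modules (ice-9 futures)) ;; future, touch

;; Assumed sketch of a reduce-free run-in-parallel: spawn one future per
;; segment running proc, then block until every future has finished.
(define (run-in-parallel segments proc)
  (let ((fs (map (lambda (seg) (future (proc seg))) segments)))
    (for-each touch fs)))
```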
>
> The computation on a segment is done by these functions:
>
>
> https://github.com/damien-mattei/library-FunctProg/blob/master/guile/logiki%2B.scm#L1842
>
> {function-unify-minterms-list <+ (λ (L) (apply function-unify-two-minterms-and-tag L))}
>
> ;; proc to be called with futures
> (define (proc-unify-minterms-seg seg)
>   {start <+ (segment-start seg)}
>   {end <+ (segment-end seg)}
>   (for ({i <+ start} {i <= end} {i <- {i + 1}})
>        {mtL <+ {minterms-vector[i]}}
>        {unified-minterms-vector-1[i] <- (function-unify-minterms-list mtL)}))
>
>
> I use this code in another program to symbolically compute a value named
> Cn:
>
> C0 = T
> C1 = T
> C2 = T
> C3 = (B1 ∨ B2)
> C4 = (B2 ∨ (B1 ∧ B3))
> C5 = ((B1 ∧ B3) ∨ (B2 ∧ B3) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4))
> C6 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4) ∨ (B3 ∧ B4) ∨ (B4 ∧ B5))
> C7 = ((B1 ∧ B3 ∧ B5) ∨ (B2 ∧ B3 ∧ B5) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ (B4 ∧ B5) ∨ (B5 ∧ B6))
> C8 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6) ∨ (B3 ∧ B4 ∧ B6) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6) ∨ (B6 ∧ B7))
> C9 = ((B1 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B3 ∧ B5 ∧ B7) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7) ∨ (B7 ∧ B8))
>
> From C1 to C9 the computation used to be fast: less than a minute for the
> whole set, with or without parallelization.
>
> C10 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B5 ∧ B6 ∧ B8) ∨ (B6 ∧ B7 ∧ B9) ∨ (B7 ∧ B8) ∨ (B8 ∧ B9))
>
> C10 takes a few minutes
>
> But C11 used to take one day before parallelization; with 'future' I now
> get the result in less than one hour!!!
>
> C11 = ((B1 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B6 ∧ B7 ∧ B9) ∨ (B8 ∧ B9))
>
> I never got C12 in the past, even after 7 days of computation! And I got
> it today during the lunch break!!!
>
> C12 = ((B1 ∧ B11 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B10 ∧ B11) ∨ (B10 ∧ B2 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B3 ∧ B4 ∧ B6 ∧ B8) ∨ (B10 ∧ B5 ∧ B6 ∧ B8) ∨ (B10 ∧ B7 ∧ B8) ∨ (B10 ∧ B9) ∨ (B11 ∧ B2 ∧ B3 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B4 ∧ B5 ∧ B7 ∧ B9) ∨ (B11 ∧ B6 ∧ B7 ∧ B9) ∨ (B11 ∧ B8 ∧ B9))
>
> (Note that a wise person could find a general formula for Cn based on
> Cn-1, but that is another (mathematical) story....)
>
> I'm computing C13, but I see that it consumes 12 GB out of the 16 GB on my
> system! It stopped on the Linux box with:
> GC Warning: Repeated allocation of very large block (appr. size 510935040):
> May lead to memory leak and poor performance
> Processus arrêté (process killed)
> But it is still running on the Mac laptop... OK, I will look into that,
> but I think the data set is now huge and there is nothing that can be done
> about that...
>
> Note that there is still a hash table used in
> function-unify-two-minterms-and-tag, and I will use another data structure
> and algorithm to avoid it. I feared that concurrent access to the hash
> table could cause the Guile 'future' to crash, fail, or slow down the
> process, but no! The result is incredibly fast. I also use continuations,
> and I have read that they can cause problems with 'future', so I will
> improve that too...
>
> I will see where I can improve the algorithm and data structures to
> optimize further, but this is already really good.
>
> Thanks
>
> Damien
>
>
> On Fri, Oct 14, 2022 at 10:39 AM Zelphir Kaltstahl <
> zelphirkaltstahl@posteo.de> wrote:
>
>> Hello Damien,
>> On 10/14/22 10:21, Damien Mattei wrote:
>>
>> Yes Zelphir, I think there is a problem in the threading part of Guile:
>> whatever the code is, it runs well the first time, and after that it is
>> unstable. Long ago, at the very beginning of the Java language, on my
>> master's project at university, I experienced the same problem with Java
>> "green" threads, under Windows and/or Linux, I can't remember which.
>>
>> I'm trying your code and futures, which perhaps also have the advantage
>> of portability to other Schemes; futures can provide "light" parallelism,
>> and with careful code it could be OK.
>>
>> In your segment.scm code, can segment-count be replaced by the number of
>> available CPUs or nodes, if I understand correctly?
>>
>> Regards,
>> Damien
>>
>> Correct. For each segment one future is used.
>>
>> However, there are scenarios, where one would rather want to interleave
>> the numbers of the whole range, to have a "fairer" workload per future, for
>> example:
>>
>> (1 2 3 4 5 6 7 8 9)
>>
>> -> (1 4 7), (2 5 8), (3 6 9)
>>
>> instead of
>>
>> -> (1 2 3) (4 5 6) (7 8 9)
>>
>> (I seem to remember this to be the case for Mandelbrot set calculations,
>> but might be wrong.)
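The interleaved split described above could be sketched like this (a hypothetical helper, not part of segment.scm): deal the items round-robin into n buckets so each future gets a mix of cheap and expensive work items.

```scheme
;; Hypothetical round-robin splitter: item i goes to bucket (i mod n),
;; so consecutive (and often similarly expensive) items end up in
;; different buckets.
(define (interleave lst n)
  (let ((buckets (make-vector n '())))
    (let loop ((xs lst) (i 0))
      (unless (null? xs)
        (let ((b (modulo i n)))
          (vector-set! buckets b (cons (car xs) (vector-ref buckets b))))
        (loop (cdr xs) (+ i 1))))
    ;; each bucket was built in reverse; restore the original order
    (map reverse (vector->list buckets))))

(interleave '(1 2 3 4 5 6 7 8 9) 3) ;; => ((1 4 7) (2 5 8) (3 6 9))
```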
>>
>> But that is all a matter of how the segments are formed and of what a
>> segment is, not really part of the logic of creating futures. So one
>> could create them in any way that fits one's use case. I have not
>> generalized the segment logic that far, but it is doable.
>>
>> Anyway, if anyone more knowledgeable could chime in on what the
>> performance differences between futures and parallel map are, that would
>> be great! Or point out some flaws that this kind of future usage might
>> have. Or say at what point one should switch to the guile-fibers library.
>>
>> Regards,
>> Zelphir
>>
>> --
>> repositories: https://notabug.org/ZelphirKaltstahl
>>
>>

