gnucap-devel
Re: [Gnucap-devel] [Parralelisms]


From: beranger six
Subject: Re: [Gnucap-devel] [Parralelisms]
Date: Tue, 18 Mar 2014 14:58:06 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0


On 05/03/2014 18:00, Felix wrote:
it might be tricky to find a good partition, and your tree-view might
give the right ideas. to me it does not make sense to schedule
everything in advance. for example, you may want to start with just 4
threads, and take the first of T1..T4 that finishes to become T5. you
cannot know which, as it depends on which nodes have changed
(a.is_changed(mm)).

if there is a matrix where no such partition induces a useful
parallelization scheme, i'd be interested to know!

Thank you for the answer. You are right, the dependency tree might not be a good basis for the implementation. I think I had massive parallelism in mind rather than simple parallelism.
I will try to implement a simpler solution in OpenMP.
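
A minimal sketch of what such a simpler OpenMP solution could look like, assuming the work has already been split into independent blocks. With `schedule(dynamic, 1)`, whichever thread finishes first picks up the next block, so nothing has to be scheduled in advance. `Block`, `process_block` and `process_all` are hypothetical names, not gnucap code, and the per-block work is only a placeholder.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one independent matrix block (not a gnucap type).
struct Block {
  std::vector<double> values;
  double result = 0.0;
};

// Placeholder work: sum the block's values. A real solver would
// factorize the block here instead.
void process_block(Block& b) {
  double sum = 0.0;
  for (double v : b.values) sum += v;
  b.result = sum;
}

// Each iteration touches only its own block, so no locking is needed.
// Compiled without OpenMP support, the pragma is ignored and the loop
// simply runs serially.
void process_all(std::vector<Block>& blocks) {
  const long n = static_cast<long>(blocks.size());
  #pragma omp parallel for schedule(dynamic, 1)
  for (long i = 0; i < n; ++i) {
    process_block(blocks[static_cast<std::size_t>(i)]);
  }
}
```

The thread count can be set from outside with the standard `OMP_NUM_THREADS` environment variable, for example `OMP_NUM_THREADS=4` to start with four threads.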

On 08/03/2014 18:00, address@hidden wrote:
I think you are making it harder than it needs to be and
overlooking why/where it could be of benefit.

That sample matrix is not in the form you actually get.

The form you get is bordered-block-diagonal, possibly
hierarchical.  By hierarchical I mean you will find a BBD form
inside a block.
OK, I will change my example to that form and test with it.
Blocks can be done in parallel with other blocks, then the

In the netlist, blocks can be identified by subcircuit/module
instantiations.  Borders can be identified by the next higher
level and subcircuit/module calls.

The netlist node list is an adjacency matrix.

This stuff is available.  It will be faster and easier to use it
rather than to try to extract that from the matrix.  By faster,
I mean both faster to run and faster to develop.
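
A rough sketch of how blocks and borders might be read off the netlist rather than the matrix. The data model is an assumption made for illustration only: each element records which subcircuit instance it belongs to (`kTopLevel` for the enclosing level) and which nodes it touches. `Element`, `kTopLevel` and `classify_nodes` are hypothetical names, not gnucap's API.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical marker for "belongs to the next higher level".
const int kTopLevel = -1;

// Hypothetical netlist element: owning subcircuit instance and touched nodes.
struct Element {
  int instance;            // subcircuit instance id, or kTopLevel
  std::vector<int> nodes;  // node ids this element connects to
};

// Maps each node to the instance (block) that owns it, or to kTopLevel
// when the node is a border node. A node is internal to a block only if
// every element touching it belongs to the same subcircuit instance.
std::map<int, int> classify_nodes(const std::vector<Element>& elements) {
  std::map<int, std::set<int> > owners;
  for (const Element& e : elements) {
    for (int n : e.nodes) owners[n].insert(e.instance);
  }
  std::map<int, int> result;
  for (const auto& kv : owners) {
    const std::set<int>& insts = kv.second;
    const bool internal = insts.size() == 1 && *insts.begin() != kTopLevel;
    result[kv.first] = internal ? *insts.begin() : kTopLevel;
  }
  return result;
}
```

With two instances joined by one top-level element, the nodes shared with that element come out as border nodes and the rest stay inside their own block.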

Any block smaller than about 100 nodes is not worth decomposing
further.  Just assume it is connected.
I will check whether that is necessary. Furthermore, I think we don't need more threads than the number that can run simultaneously (num_procs * hyper-threading factor), so a decomposition into 16 threads will most probably be enough for current processor architectures.
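
As an illustration of that limit, the sketch below caps the thread count at the number of hardware threads reported by the standard library, and never uses more threads than there are blocks. `choose_thread_count` is a hypothetical helper; `std::thread::hardware_concurrency()` is standard C++ and already counts hyper-threads where the platform reports them.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper: number of worker threads to use for num_blocks
// independent blocks. hw_threads defaults to what the platform reports.
std::size_t choose_thread_count(
    std::size_t num_blocks,
    std::size_t hw_threads = std::thread::hardware_concurrency()) {
  // hardware_concurrency() may return 0 when the value is unknown.
  if (hw_threads == 0) hw_threads = 1;
  // Never more threads than blocks, and always at least one.
  return std::max<std::size_t>(1, std::min(num_blocks, hw_threads));
}
```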
If you have a block that is big enough to split, the way to
split is to look for ways to move nodes to a border.  This is
the opposite of what traditional ordering methods do.
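
One way to evaluate a candidate split, sketched under the assumption that the block is given as an adjacency list: mark some nodes as moved to the border and count how many independent sub-blocks remain. `count_sub_blocks` is a hypothetical helper, and this only checks a given choice of border nodes; it does not search for a good one.

```cpp
#include <cstddef>
#include <vector>

// adj[i] lists the neighbours of node i; is_border[i] is true when node i
// has been moved to the border. Returns the number of connected groups of
// non-border nodes, i.e. sub-blocks that could be processed independently.
std::size_t count_sub_blocks(
    const std::vector<std::vector<std::size_t> >& adj,
    const std::vector<bool>& is_border) {
  const std::size_t n = adj.size();
  std::vector<bool> seen(n, false);
  std::size_t components = 0;
  for (std::size_t start = 0; start < n; ++start) {
    if (is_border[start] || seen[start]) continue;
    ++components;
    // Depth-first walk that never crosses a border node.
    std::vector<std::size_t> stack(1, start);
    seen[start] = true;
    while (!stack.empty()) {
      const std::size_t u = stack.back();
      stack.pop_back();
      for (std::size_t v : adj[u]) {
        if (!is_border[v] && !seen[v]) {
          seen[v] = true;
          stack.push_back(v);
        }
      }
    }
  }
  return components;
}
```

For a chain of five nodes, moving the middle node to the border leaves two sub-blocks of two nodes each.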

The block requirement for parallel is the same as for
incremental update.

Global ordering takes at best O(n log(n)) time.  This is too
slow.  The way to make it faster is local ordering and reuse.

Gnucap's ordering is non-optimal, but needs to stay early in the
process.  It should be done before expansion.  That is one
reason I never did anything with the post-expand ordering hook.

Sorry for the late answer; I have administrative work to do that I always postpone. I will get back to you to let you know how I progress.

Thanks for your reply.

--
Beranger Six,
CCamy système
06 33 16 10 17



