[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Gnucap-devel] orderings
Re: [Gnucap-devel] orderings
Sat, 10 Nov 2018 07:32:46 +0100
On Fri, Nov 09, 2018 at 06:33:49PM -0500, al davis wrote:
> > when the incidence graph is not constructed (forward, reverse), there is
> > some overhead in the virtual iwant calls. i can't measure it. (but
> > that's it).
> The run time overhead of a virtual call is that of a real function
> call, which means if it was going to be a real function call anyway the
> virtual overhead is insignificant. It becomes significant when it
> would have been inlined otherwise
yes, currently, the iwant calls (to the matrix) are inlined. now, in
plug_order they are not. my observation is, that the overhead is not
significant. this will be more irrelevant if the ordering could be done
before expand, as you suggest below.
> Also, with small circuits this doesn't matter. It's only when the
> circuits get large that it matters.
it all happens during CARD_LIST::*_iwant, and the relative overhead
there is independent of the circuit size.
> I am not sure what that means. What I see is arbitrary small
> differences that seem to be related to rounding errors. Which is
> really better could go either way. I can't tell without further study.
perhaps no longer necessary.
> It looks to me that the incmode example still doesn't work.
the original test works, mainly the dc convergence. i have added it to
the branch. still, it is a fluke. it needs more thought to actually take
numerical stability into account. ("incmode" is the wrong name for the
> I think you will eventually conclude as I did back in 1985, and again
> back in 1992. ... that the fancy ordering algorithms really don't
> work as well as the simple ones.
> The existing ordering algorithm is not nothing at all. It's a "nearest
> neighbor" algorithm, on read-in. When you need to create a new node,
> just take the next one in line.
This works for handwritten circuits. and i am not questioning how well
it works. in some situations, manual intervention is really hard. the
failing circuit has been created from a random (but simple enough)
> when considering the run time of the ordering algorithm.
it's not too bad. the untweaked global rcmk takes 0.01s on eq7 (589825
nodes). as a result the total simulation time drops from 0m30.252s to
0m26.289s. the drop seems to be purely due to density, but still.
> To improve the algorithm, I think the best approach is to make one more
> pass to sort nodes within a block by degree, and leave it at that. You
> could do multiple passes, but I think even one fixed pass would do.
> Even that should be done before expansion.
Certainly anything done before expansion will be (much) more efficient.
i couldn't figure it out. it will likely introduce some other overhead.
> Suppose we have a degree array for a block something like this:
> 21432.33 ... 5 internal nodes, two connection nodes.
> The two connection nodes cannot be moved, but the internal nodes can
> be. The ideal order would have a degree array: 12234.33.
> I really don't think you can do better than this.
> (Yes I know .. it's a challenge to try).
perhaps not in the 5 node example. i have seen extracted subcircuits
with many more nodes. even if i can't do better than random, rcmk (or
anything like that) might well do better.
i am not sure i fully understand the implications on incmode, especially
when circuits are deeply nested.
> Now .. how does all this relate to incmode? The sort I just proposed
> improves it, and that should be the highest priority.
well, I somehow need to store the matrix stamps locally and lift the
order to pre-expand.
> To further improve the algorithm, note how nonlinear the various
> connections are, and take the most linear ones first.
my idea was to prioritize large elements on the diagonal. looks like a
similar objective. lets discuss this another time.