Re: [gnugo-devel] Floating point arithmetics


From: Gunnar Farnebäck
Subject: Re: [gnugo-devel] Floating point arithmetics
Date: Fri, 17 Sep 2004 01:45:56 +0200
User-agent: EMH/1.14.1 SEMI/1.14.3 (Ushinoya) FLIM/1.14.2 (Yagi-Nishiguchi) APEL/10.3 Emacs/21.3 (sparc-sun-solaris2.9) MULE/5.0 (SAKAKI)

Nando wrote:
> I rapidly concluded that there must be an underlying problem with the
> connection code. And I strongly suspected floating point arithmetics.

Yes, this problem is known from the SPEC work. Maybe we should have
warned you. :-)

> After some debugging, I could spot a location where things could (and
> actually do) go wrong, in the ENQUEUE() macro. The first comparison
> involves values which haven't been normalized, with the consequence that
> the delta, vulnerable1 and vulnerable2 fields might (or might not) get
> overwritten, leading to possible variations in the further processing
> of the queue.
> 
> As a possible solution, I rejected the idea of spreading lots of
> gg_normalize_float() calls throughout the code. It seemed much simpler
> and more efficient to transform the floating point arithmetic into a fixed
> point one (well, sort of). So I wrote a simple patch, just replacing
> float declarations by int ones, and scaling all the constants by 10000
> (smallest constant found in the ENQUEUE_STONE() macro).

Yes, going fixed point is the only sane long-term solution.
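
A minimal sketch of the failure mode discussed above, not taken from
the GNU Go source. It assumes the mismatch comes from comparing a value
that has been rounded to float against the same sum still held at higher
precision, and it uses double as a stand-in for a wider register. The
function names are hypothetical, and the integer version assumes the
scale of 10000 units per 1.0 from the patch description.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when a sum rounded to float equals
 * the same sum kept at higher precision, 0 otherwise. */
static int float_sum_matches(float distance, float delta)
{
  float stored = distance + delta;                   /* rounded to float */
  double wide = (double) distance + (double) delta;  /* not yet rounded */
  return (double) stored == wide;
}

/* The same comparison on scaled integers: nothing is ever rounded,
 * so the stored and the intermediate value always agree (as long as
 * the sum does not overflow an int). */
static int fixed_sum_matches(int distance, int delta)
{
  int stored = distance + delta;
  long wide = (long) distance + (long) delta;
  return (long) stored == wide;
}

/* Hypothetical self-check tying the two together. */
void compare_representations(void)
{
  assert(!float_sum_matches(1.3f, 0.0001f));  /* float: values differ */
  assert(fixed_sum_matches(13000, 1));        /* fixed point: identical */
}
```

Whether a given platform actually keeps the wider value around depends
on the compiler and FPU settings, which is presumably why the results
differed between systems.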

> Testing the patch resulted in :
> 
> * Positive
> 
> - Node counts are almost identical on both Linux and Win32 (there are
>   still a couple deltas in trevora, nngs and nngs3, which means there
>   are problems elsewhere)

There might be similar issues in the influence code.

> - Regression breakage is identical (the century2002:150 problem on Win32
>   has disappeared)

That's good.

> - Regression breakage is apparently positive compared to CVS, with 1
>   FAIL and 3 PASSes (not analyzed yet, but at first glance, the PASSes
>   all look good)
> 
>   break_in:100    FAIL 1 D9 [0]
>   nngs:1280       PASS D13 [D13]
>   connect:70      PASS 0 [0]
>   global:1        PASS B3 [B3]

I wouldn't really worry about analyzing these, except possibly for
getting ideas about future readconnect tuning.

> * Negative
> 
> - Performance impact is heavy : +2% or so in reading nodes,
>                                 +5.7% connection nodes,
>                                 timing around +5% (imprecise)
> 
>   My guess is that with CVS and the above mentioned problem in
>   ENQUEUE(), there are quite a number of cases where vulnerabilities
>   are overwritten, globally resulting in fewer checks and readings.

I say we take this hit. Any difference in how the code works is
clearly accidental.

> - A possible issue for us developers : tuning the constants will be
>   less natural than with floating points.

Yes, this is why I wrote the code with floating point numbers in the
first place.

If possible, I would still like to retain the illusion of floating
point numbers. Something like

#define FIXED_POINT_BASIS 10000
#define FP(x) ((int) (0.5 + FIXED_POINT_BASIS * (x)))
#define FIXED_TO_FLOAT(x) ((x) / (float) FIXED_POINT_BASIS)

The latter macro would only be used in debug outputs, e.g.

  gprintf("%o  %f, primary distance\n", FIXED_TO_FLOAT(distance));

I'm open to a better name for the FP macro, but it has to be short and
unobtrusive so it can be used transparently, like

  if (board[bpos] == EMPTY
      && board[apos] == EMPTY && board[gpos] == EMPTY
      && conn->distances[bpos] > distance + FP(1.3)) {
    ENQUEUE(conn, pos, bpos, distance + FP(1.3), FP(1.0), apos, gpos);
  }

Since the FP macro should only be used for literal constants,
compilers should be able to compute the corresponding integers at
compile time. The remaining question: is there any way those sneaky
floating point numbers could still bite us with this construction?
Arend, do you see any potential problem?
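
As a rough way to probe that question, here is a small self-contained
sketch built on the three macros proposed above. The check function and
its name are hypothetical, and 0.0001 is only assumed to be the smallest
constant in use because the patch scales by 10000.

```c
#include <assert.h>

#define FIXED_POINT_BASIS 10000
#define FP(x) ((int) (0.5 + FIXED_POINT_BASIS * (x)))
#define FIXED_TO_FLOAT(x) ((x) / (float) FIXED_POINT_BASIS)

/* Hypothetical check: returns 1 if the literal constants map to the
 * expected integers and repeated addition stays exact. */
int fp_constants_are_exact(void)
{
  int distance = 0;
  int i;

  /* Ten steps of 1.3 each; integer addition accumulates no rounding
   * error, so the total is exactly 10 * 13000. */
  for (i = 0; i < 10; i++)
    distance += FP(1.3);

  return FP(1.3) == 13000
    && FP(1.0) == 10000
    && FP(0.0001) == 1
    && distance == 130000;
}

/* Hypothetical self-check, including one case the macro mishandles. */
void fp_self_check(void)
{
  assert(fp_constants_are_exact());
  /* The cast truncates toward zero, so adding 0.5 rounds correctly
   * only for non-negative constants. */
  assert(FP(-1.3) == -12999);
}
```

Two things follow from the macro as written: a negative literal would
come out one unit off, and any constant finer than 1/10000 is silently
rounded. Neither matters as long as FP is only applied to non-negative
constants that are multiples of 0.0001.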

> Questions :
> 
> 1. Are we interested in this patch, even at the mentioned performance
>    cost ?

Yes.

> 2. If I submit a patch, should I make the change reversible ? In other
>    words, provide typedefs and #define's so as to be able to switch ?

No.

>    To be honest, I don't see any good reason we'd possibly want to go
>    back to floating points, but maybe someone on the list has better
>    ideas on the topic.

Either we are happy with the solution or we are not. Maintaining two
ways to do it in parallel is not warranted in this case.

/Gunnar