
Re: [Bug-gnubg] Putting Build info into sgf file


From: Michael Petch
Subject: Re: [Bug-gnubg] Putting Build info into sgf file
Date: Tue, 18 Aug 2009 13:36:59 -0600
User-agent: Microsoft-Entourage/12.20.0.090605



On 18/08/09 12:39 PM, "Dice_R_Random" <address@hidden> wrote:

> 
> When GNU writes the .sgf file, it outputs the version number:
> 
> AP[GNU Backgammon:0.90-mingw]
> 
> Could you please add the build number to this?
> 

A build number is fine, but by itself it is hard to rely on when some of
your users may roll their own personal copies. Power users like me build
from CVS snapshots and may turn some features on or off. There aren't many
"official builds" for platforms other than Windows.

As well, the tarball for each official build should be archived (or tagged
in CVS) so that it can be rebuilt later.

You may ask yourself: Mike, if you provide the source for each official
build, users who wish can roll their own from that and rebuild on any
platform they choose. You'd be correct, but here is where things get dicey.
Given any specific set of source code, when people build it on their own
systems the results may vary. The compiler itself may optimize for the
machine being built on, for instance (very common), or the gcc compiler may
be a different version. Some users may turn optimizations off when they
build and some may not (they can easily be overridden).

Next you may ask why optimizations, for instance, should make any
difference. Well, compiler optimizations can sometimes be buggy themselves,
or create unwanted side effects (especially with threading and memory
access). A case in point is the bug Keene found with the bot making a
clearly bad checker play to save gammon. Some people who analyzed it saw the
right answer and some saw the wrong answer. Ultimately it was discovered
that an optimization the compiler made on some platforms (including Windows)
did not generate the proper outcome (in this case it appears to me to be a
compiler optimization bug, not a GnuBG bug). Sometimes, of course, there
might be a long-standing bug in GnuBG that only surfaces depending on the
optimizations used.

To make things even worse, some builds of software can actually run
differently on different equipment. There is a feature that can be turned on
or off called "SSE/CPU autodetect". It tries to determine whether your
system has SSE support at runtime (not compile time) and alters the code
being run on the fly. If a bug crept into the SSE-enabled code and didn't
appear in the non-SSE part, then the results of the same program running on
two different systems could theoretically be different! A second classic
example is threading. Users can alter the behavior by setting the number of
threads. There may (or may not) be threading bugs in the code, not yet
identified, that in some circumstances yield differing results.
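
As a rough illustration of what runtime dispatch means (this is not GnuBG's
code; the function names and the cpu_has_sse flag are made up for the
example), here is a sketch in Python. The "four lanes" function only mimics
the way a vectorised path keeps several partial sums, and the last function
cross-checks the two paths on the same input:

```python
import math


def dot_scalar(a, b):
    """Plain path: one accumulator, products added in order."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_four_lanes(a, b):
    """Stand-in for a vectorised path: four partial sums combined at the end."""
    lanes = [0.0, 0.0, 0.0, 0.0]
    for i, (x, y) in enumerate(zip(a, b)):
        lanes[i % 4] += x * y
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3])


def pick_dot(cpu_has_sse):
    """Runtime dispatch. cpu_has_sse is a hypothetical detection result."""
    return dot_four_lanes if cpu_has_sse else dot_scalar


def paths_agree(a, b, rel_tol=1e-9):
    """Run both code paths on the same input and compare the results."""
    return math.isclose(dot_scalar(a, b), dot_four_lanes(a, b), rel_tol=rel_tol)
```

The same program picks a different function depending on the machine, so a
bug that lives in only one of the two paths shows up on only some systems.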

This last part could probably be helped by putting software builds through a
set of standard test cases. For instance, I could probably (for each
official release) create a set of test cases (based on some real matches, or
on positions that are known to have given bad or differing results in the
past). I would build a baseline release with no optimizations and no SSE
support and then compare its results with those of the same source code
built with the normal optimizations and SSE settings.

If someone thinks such test cases and product verification might be a good
thing, I would be willing to render my help in that area.
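
The baseline comparison described above could look something like the
following Python sketch. It assumes each build's results have already been
collected into a mapping from a position id to a number (an equity, say).
The function name, the ids and the tolerance are all placeholders, not
anything GnuBG provides:

```python
def compare_results(baseline, candidate, tolerance=1e-6):
    """List positions where the candidate build disagrees with the baseline.

    Both arguments map a position id to a numeric result. A position that is
    missing from the candidate is reported with None as its value.
    """
    mismatches = []
    for pos_id in sorted(baseline):
        expected = baseline[pos_id]
        if pos_id not in candidate:
            mismatches.append((pos_id, expected, None))
        elif abs(expected - candidate[pos_id]) > tolerance:
            mismatches.append((pos_id, expected, candidate[pos_id]))
    return mismatches
```

An empty list means the optimized build reproduced the baseline on every
test position. Anything else is a list of positions worth a closer look.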

> I am involved with several rollout projects using GNU and several rollouts
> have come into question.  This project is using many people, many computers,
> and many versions of GNU.

For any product (not just GnuBG), if you have many people on many different
systems with many different releases, it becomes very difficult to combine
all the data and expect the same, reproducible results. GnuBG is constantly
changing, and sometimes the bugs that get fixed alter evaluations and
rollouts! In your case you are doing rollouts. Here is a fix from January
for a bug in variance reduction. People using versions of GnuBG from before
and after that fix may get differing results (I use this as an example
only):

Sat Jan 17 21:57:31 CET 2009 Christian Anthon <address@hidden>

    * rollout.c: Fix a small bug in BasicRollout. The first move (0-ply)
    in the variance reduction was cubeless, regardless of the cube
    setting. Only the cubeful 0-ply rollouts should be affected with
    infinite trials, but the variance reduction will change a bit for all
    rollouts.

Another example:

Wed Apr 16 22:05:24 CEST 2008 Christian Anthon <address@hidden>

    * eval.c: in FindBestCubeDecision. Fix rare case where no double,
    beaver was returned as doble, beaver.

When there are significant changes, the ChangeLog file is updated by the
developers. A copy is installed into the directory you install GnuBG into
(on Windows) and is also available in each CVS snapshot and in CVS itself.

I guess my main point would be that allowing your users to use a hodgepodge
of releases is probably not a good idea. As bugs are fixed that affect the
neural net, the evals, or the rollouts, they may change the results in your
projects (like rollouts). I think the same can be said for any product that
may be used this way. I would require those who take part in a project to
use a specific, verified build.

> It would be nice that if it were determined that
> build number xyz was questionable that we could determine which rollouts
> used that build and reroll them.
> 
> I would also like to see the number of threads used and I believe that those
> are the only variables that I am concerned with.
> 

I think build options, compile options, compiler flags, and build
environment data would all be useful information if you were trying to
understand what went into a particular release. Alternatively, put a build
number in the product name (as you suggest) and keep an online repository of
each build number, the source code tarball (or a CVS tag), and the
Make/Configure options used to produce the official release. If the users
roll their own, they should supply you with all the build data so it can be
reviewed.
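
To make the idea concrete, here is a Python sketch of how build data might
be folded into the AP property and then used to pick out rollouts to redo.
The "key=value" layout after the version is purely an assumption for
illustration (GnuBG does not write this), as are the function names. The
escaping follows the SGF rule that "\", "]" and, in composed values, ":"
must be backslash-escaped:

```python
def sgf_escape(text):
    """Escape characters that are special inside a composed SGF value."""
    return text.replace("\\", "\\\\").replace("]", "\\]").replace(":", "\\:")


def format_ap(name, version, build_info):
    """Build an AP property; build_info is a hypothetical dict of extras."""
    extras = " ".join(f"{key}={value}" for key, value in sorted(build_info.items()))
    full_version = f"{version} {extras}" if extras else version
    return f"AP[{sgf_escape(name)}:{sgf_escape(full_version)}]"


def rollouts_to_reroll(build_by_file, bad_build):
    """Return, sorted, the files whose recorded build matches bad_build."""
    return sorted(name for name, build in build_by_file.items() if build == bad_build)
```

With an empty build_info, format_ap("GNU Backgammon", "0.90-mingw", {})
gives back the property as it is written today. Once each rollout file
records its build, finding the ones produced by a questionable build is a
simple lookup.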

If it were me, I'd start by getting people to use the same release, and
stick to it unless your project leader (it may be you) informs the users
helping on the project that a new release is acceptable.

Just my 10 cents' worth.
Michael.





