[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[lmi] rounding unit test under Linux/x86-64 (was: trivial patch to fix f

From: Vadim Zeitlin
Subject: [lmi] rounding unit test under Linux/x86-64 (was: trivial patch to fix fenv test compilation under Linux)
Date: Thu, 25 Mar 2010 22:21:19 +0100

On Wed, 24 Mar 2010 11:39:39 +0000 Greg Chicares <address@hidden> wrote:

GC> > all tests compile under Linux again now. Unfortunately they still
GC> > don't pass, even this one fails with
GC> > 
GC> >   **** test failed:   '-3' == '-2'
GC> >   [file fenv_lmi_test.cpp, line 211]
GC> > 
GC> >   **** test failed:   '1' == '2'
GC> >   [file fenv_lmi_test.cpp, line 213]
GC> [...]
GC> > (note that the line numbers are offset by the above patch compared to the
GC> > svn version) but I ran out of time to investigate it today.
GC> Here's the applicable part of the code:
GC>     fenv_rounding   (fe_downward);
GC>     BOOST_TEST_EQUAL(fe_downward  , fenv_rounding());
GC>     BOOST_TEST_EQUAL(-3, rint(-2.5));
GC>     BOOST_TEST_EQUAL(-2, rint(-1.5));
GC>     BOOST_TEST_EQUAL( 1, rint( 1.5));
GC>     BOOST_TEST_EQUAL( 2, rint( 2.5));
GC> and it's interesting that two of the tests pass but two fail.
GC> The failures seem to be here:
GC>     BOOST_TEST_EQUAL(-3, rint(-2.5)); // '-2' observed
GC>     BOOST_TEST_EQUAL( 1, rint( 1.5)); //  '2' observed
GC> and that pattern seems to suggest a round-to-nearest mode,
GC> so I'd suspect a problem in fenv_rounding() because it doesn't
GC> seem possible that glibc implements rint() incorrectly for
GC> such obvious testcases...unless they're not trying to be
GC> compatible with C99?

 Actually the problem is not in fenv_rounding() but is due to the fact that
the FPU control word is not being used at all. Under Linux/x86-64 platform
math operations are implemented using SSE2 as it's faster and always
available for all 64 bit CPUs, quoting -mfpmath gcc option documentation

            Generate floating point arithmetics for selected unit unit. The
            choices for unit are:

                Use the standard 387 floating point coprocessor present
                majority of chips and emulated otherwise.
                This is the default choice for i386 compiler.
                Use scalar floating point instructions present in the SSE
                instruction set.
                The resulting code should be considerably faster in the
                majority of cases and avoid the numerical instability
                problems of 387 code, but may break some existing code that
                expects temporaries to be 80bit. 
                This is the default choice for the x86-64 compiler. 

But in fact this is mostly what finally allowed me to find the problem (I
looked for a long time in wrong places...) and is probably irrelevant to
what really goes on as rint() implementation in GNU libc is not compiled by
gcc anyhow but written directly in assembler. I'm not sure about where
exactly is the implementation of rint() found in glibc sources (I tried to
find it there but got lost...) but looking at the disassembly of rint() on
my system in gdb I do see SSE instructions being used and not the x87 ones.

 And, quoting from "Intel® 64 and IA-32 Architectures Software Developer’s
Manual, Volume 1" (http://developer.intel.com/Assets/PDF/manual/253665.pdf) Rounding Control (RC) Fields

        In the Intel 64 and IA-32 architectures, the rounding mode is
        controlled by a 2-bit rounding-control (RC) field (Table 4-8 shows
        the encoding of this field). The RC field is implemented in two
        different locations:
                * x87 FPU control register (bits 10 and 11)
                * The MXCSR register (bits 13 and 14)

        Although these two RC fields perform the same function, they
        control rounding for different execution environments within the
        processor. The RC field in the x87 FPU control register controls
        rounding for computations performed with the x87 FPU instructions;
        the RC field in the MXCSR register controls rounding for SIMD
        floating point computations performed with the SSE/SSE2

So SSE instructions are completely unaffected by the changes to the FPU
control word. Setting MXCSR correct does make the test work (I tested this
manually under gdb) and looking at sysdeps/x86_64/fpu/fesetround.c I see
that it changes both places at once, just as I'd expect:

        fesetround (int round)
          unsigned short int cw;
          int mxcsr;

          if ((round & ~0xc00) != 0)
            /* ROUND is no valid rounding mode.  */
            return 1;

          /* First set the x87 FPU.  */
          asm ("fnstcw %0" : "=m" (*&cw));
          cw &= ~0xc00;
          cw |= round;
          asm ("fldcw %0" : : "m" (*&cw));

          /* And now the MSCSR register for SSE, the precision is at different 
             positions in the different units, we need to shift it 3 bits.  */
          asm ("stmxcsr %0" : "=m" (*&mxcsr));
          mxcsr &= ~ 0x6000;
          mxcsr |= round << 3;
          asm ("ldmxcsr %0" : : "m" (*&mxcsr));

          return 0;

 Of course, we could do the same ourselves in LMI-specific code. But I
really think that if we're interested in fixing this at all, we should just
use fesetround() i.e. fix the

        // SOMEDAY !! Revisit suppressed __STDC_IEC_559__ support. See:
        //   http://lists.nongnu.org/archive/html/lmi/2008-06/msg00033.html

comment from fenv_lmi.hpp.

 For me a more interesting question is whether we actually want to fix it.
AFAIK currently LMI is neither used nor compiled under 64 bit platforms at
all so it's probably not a priority. OTOH it's not totally impossible that
one day you may wish to enabled SSE support even in 32 bit builds, i.e. use
-mfpmath=sse2 option, for performance reasons and it would be a nasty
surprise to (re)discover this breakage then. And unfortunately I don't know
of any way to test, during compilation, if SSE instructions are being used
so we can't even generate a #error or something.

 Please let me know if you'd like me to enable LMI_IEC_559 and retest with


reply via email to

[Prev in Thread] Current Thread [Next in Thread]