
Re: [PATCH 26/72] softfloat: Convert float128_silence_nan to parts


From: Richard Henderson
Subject: Re: [PATCH 26/72] softfloat: Convert float128_silence_nan to parts
Date: Thu, 13 May 2021 07:25:54 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

On 5/13/21 3:34 AM, Alex Bennée wrote:

Richard Henderson <richard.henderson@linaro.org> writes:

This is the minimal change that also introduces float128_params,
float128_unpack_raw, and float128_pack_raw without running into
unused symbol Werrors.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
  fpu/softfloat.c                | 96 +++++++++++++++++++++++++++++-----
  fpu/softfloat-specialize.c.inc | 25 +++------
  2 files changed, 89 insertions(+), 32 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 2d6f61ee7a..073b80d502 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -500,14 +500,12 @@ static inline __attribute__((unused)) bool is_qnan(FloatClass c)
  }
/*
- * Structure holding all of the decomposed parts of a float. The
- * exponent is unbiased and the fraction is normalized. All
- * calculations are done with a 64 bit fraction and then rounded as
- * appropriate for the final format.
+ * Structure holding all of the decomposed parts of a float.
+ * The exponent is unbiased and the fraction is normalized.
   *
- * Thanks to the packed FloatClass a decent compiler should be able to
- * fit the whole structure into registers and avoid using the stack
- * for parameter passing.
+ * The fraction words are stored in big-endian word ordering,
+ * so that truncation from a larger format to a smaller format
+ * can be done simply by ignoring subsequent elements.
   */
typedef struct {
@@ -526,6 +524,15 @@ typedef struct {
      };
  } FloatParts64;
+typedef struct {
+    FloatClass cls;
+    bool sign;
+    int32_t exp;
+    uint64_t frac_hi;
+    uint64_t frac_lo;
+} FloatParts128;
+
+/* These apply to the most significant word of each FloatPartsN. */
  #define DECOMPOSED_BINARY_POINT    63
  #define DECOMPOSED_IMPLICIT_BIT    (1ull << DECOMPOSED_BINARY_POINT)
@@ -561,11 +568,11 @@ typedef struct {
      .exp_bias       = ((1 << E) - 1) >> 1,                           \
      .exp_max        = (1 << E) - 1,                                  \
      .frac_size      = F,                                             \
-    .frac_shift     = DECOMPOSED_BINARY_POINT - F,                   \
-    .frac_lsb       = 1ull << (DECOMPOSED_BINARY_POINT - F),         \
-    .frac_lsbm1     = 1ull << ((DECOMPOSED_BINARY_POINT - F) - 1),   \
-    .round_mask     = (1ull << (DECOMPOSED_BINARY_POINT - F)) - 1,   \
-    .roundeven_mask = (2ull << (DECOMPOSED_BINARY_POINT - F)) - 1
+    .frac_shift     = (-F - 1) & 63,                                 \
+    .frac_lsb       = 1ull << ((-F - 1) & 63),                       \
+    .frac_lsbm1     = 1ull << ((-F - 2) & 63),                       \
+    .round_mask     = (1ull << ((-F - 1) & 63)) - 1,                 \
+    .roundeven_mask = (2ull << ((-F - 1) & 63)) - 1


I have to admit I find the switch to (-F - 1) & 63 a little black
magical. Isn't the shift always going to end up a factor of the number
of exponent bits we need to move past and the natural size of the
original float?

Yep. But now we're looking to compute the number relative to .frac_lo, rather than the entire logical fraction.
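
To make the arithmetic concrete, here is a minimal sketch, not code from the patch. The helper names (frac_shift_in_word, frac_shift_logical, check_frac_shift) are hypothetical, and it assumes the binary point sits at the top bit of the most significant 64-bit word, as the DECOMPOSED_BINARY_POINT comment in the patch says. For any format whose fraction fits in one word, (-F - 1) & 63 is the same as 63 - F; for wider formats it is the LSB position reduced modulo 64, that is, the position inside the least significant word.

```c
#include <assert.h>

/*
 * Hypothetical helpers for illustration only; frac_size is F, the
 * number of explicit fraction bits in the format.
 */

/* Expression from the patch: LSB position within a single 64-bit word. */
static inline int frac_shift_in_word(int frac_size)
{
    return (-frac_size - 1) & 63;
}

/*
 * LSB position counted across the whole logical fraction made of
 * `words` 64-bit words, with the binary point at the top bit.
 */
static inline int frac_shift_logical(int frac_size, int words)
{
    return 64 * words - 1 - frac_size;
}

/* Spot checks for float16, float32, float64 and float128 fraction sizes. */
static inline void check_frac_shift(void)
{
    assert(frac_shift_in_word(10) == 63 - 10);
    assert(frac_shift_in_word(23) == 63 - 23);
    assert(frac_shift_in_word(52) == 63 - 52);
    /* float128: F = 112, two words, LSB lands at bit 15 of the low word. */
    assert(frac_shift_in_word(112) == frac_shift_logical(112, 2) % 64);
}
```

For the one-word formats the old and new expressions agree, so only the float128 parameters actually depend on the masking.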


r~


Anyway my personal brain twisting aside it obviously works and
everything else looks fine so:

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>




