[Bug gas/18700] Wrong code generated on i386 for instructions with AVX-512 {sae}

From: m at rolle dot name
Subject: [Bug gas/18700] Wrong code generated on i386 for instructions with AVX-512 {sae}
Date: Tue, 21 Jul 2015 06:36:17 +0000


--- Comment #1 from Michael Rolle <m at rolle dot name> ---
First off, I might be misinterpreting the Intel doc (319433-022
OCTOBER 2014), in which case gas might be doing the right thing after all.

I wrote some EVEX instructions, both scalar and vector, and both with and
without the {sae} option.  Here's the disassembly of the result:

   0:   62 f1 f5 18 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
   7:   62 f1 f5 48 c2 ca 00    vcmpeqpd %zmm2,%zmm1,%k1
   e:   62 f1 f7 18 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1
  15:   62 f1 f7 08 c2 ca 00    vcmpeqsd %xmm2,%xmm1,%k1

  1c:   62 f1 f5 18 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
  23:   62 f1 f5 38 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
  2a:   62 f1 f5 58 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
  31:   62 f1 f5 78 c2 ca 00    vcmpeqpd {sae},%zmm2,%zmm1,%k1
  38:   62 f1 f7 18 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1
  3f:   62 f1 f7 38 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1
  46:   62 f1 f7 58 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1
  4d:   62 f1 f7 78 c2 ca 00    vcmpeqsd {sae},%xmm2,%xmm1,%k1

The first four were assembled from the instructions shown in the disassembly.

The last eight were assembled with .byte directives, just to see how the
disassembler would treat them.  As you can see, objdump basically ignores L'L.
This is appropriate for vcmpsd, but perhaps not for vcmppd.
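
For anyone who wants to reproduce the eight hand-encoded cases, here is a
minimal sketch that prints the .byte lines.  The helper name is hypothetical;
the byte values are taken from the listing above, with only L'L (bits 6:5 of
the 4th EVEX byte) varied while EVEX.b stays set.

```python
def evex_test_bytes(p1):
    """Return one .byte line per L'L value (00b..11b), with EVEX.b set.

    p1 is the 3rd EVEX byte as shown in the listing:
    0xF5 for vcmpeqpd, 0xF7 for vcmpeqsd.
    """
    lines = []
    for ll in range(4):
        # 0x18 sets bit 4 (EVEX.b) and bit 3, as in the listing; L'L goes in bits 6:5.
        p2 = (ll << 5) | 0x18
        encoding = [0x62, 0xF1, p1, p2, 0xC2, 0xCA, 0x00]
        lines.append(".byte " + ",".join(f"0x{b:02x}" for b in encoding))
    return lines


for line in evex_test_bytes(0xF5) + evex_test_bytes(0xF7):
    print(line)
```

The output can be pasted into a .s file and assembled with gas, then inspected
with objdump -d as was done for the listing.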

The error, I believe, is with the L'L bits in the 4th EVEX byte.  For the first
instruction at address 0, the byte is 0x18, which is 0 00 1 1 000 in binary
(fields z, L'L, b, V', aaa from the high bit down).  So L'L = 00b.

My reading of the doc is that L'L = 10b is required, and the instruction will
#UD otherwise.
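
To make the bit layout concrete, here is a small sketch that splits the 4th
EVEX byte into its fields.  The function name and the dictionary keys are my
own; the bit positions are the ones given in the Intel doc (L'L in bits 6:5,
b in bit 4).

```python
def decode_evex_p2(p2):
    """Split the 4th EVEX byte (P2) into its fields."""
    return {
        "z": (p2 >> 7) & 1,       # bit 7
        "LL": (p2 >> 5) & 0b11,   # bits 6:5, the L'L field
        "b": (p2 >> 4) & 1,       # bit 4, set for {sae}
        "V'": (p2 >> 3) & 1,      # bit 3
        "aaa": p2 & 0b111,        # bits 2:0, opmask register
    }


# 0x18 is what gas emitted for "vcmpeqpd {sae},%zmm2,%zmm1,%k1".
print(decode_evex_p2(0x18))
# 0x48 is what gas emitted for the same instruction without {sae}.
print(decode_evex_p2(0x48))
```

With {sae} the decoded L'L is 0 (00b); without it, L'L is 2 (10b) and b is 0.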

The Intel doc says the following about this:

(1) Page 7. 

4.6.3 SAE Support in EVEX
The EVEX encoding system allows arithmetic floating-point instructions without
rounding semantic to be encoded
with the SAE attribute. This capability applies to scalar and all vector
lengths, by setting EVEX.b.

Table 4-7. EVEX Embedded Broadcast/Rounding/SAE and Vector Length on Vector

Position                                P2[4]         P2[6:5]     P2[6:5]
Broadcast/Rounding/SAE Context          EVEX.b        EVEX.L’L

FP Instructions w/o rounding semantic,  SAE control   00b: 128-bit
can cause #XF                                         01b: 256-bit
                                                      10b: 512-bit
                                                      11b: Reserved (#UD)

(2) Page 15.

4.10.2 Exceptions Type E2 of EVEX-Encoded Instructions
(This includes vcmppd)

Invalid Opcode, #UD (in certain modes), ...
            If EVEX.L’L != 10b (VL=512).
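
As a sketch of my reading of that condition (not verified on hardware), the
check below flags a packed {sae} encoding as an expected #UD unless L'L is 10b.
The function name is hypothetical, and applying the rule only when EVEX.b is
set is my interpretation.

```python
def sae_packed_expected_ud(p2):
    """Return True if, under this reading of the Type E2 rule, a packed
    instruction whose 4th EVEX byte is p2 would be expected to raise #UD.

    Assumption: the rule is applied only when EVEX.b (bit 4) is set.
    """
    ll = (p2 >> 5) & 0b11   # L'L, bits 6:5
    b = (p2 >> 4) & 1       # EVEX.b, bit 4
    return b == 1 and ll != 0b10


# The four hand-encoded vcmpeqpd variants plus the non-{sae} form.
for p2 in (0x18, 0x38, 0x58, 0x78, 0x48):
    print(f"0x{p2:02x}: expected #UD = {sae_packed_expected_ud(p2)}")
```

Under this reading, only the 0x58 form of the {sae} encodings would be valid,
and the 0x18 form that gas produces would not.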

(3) Page 60.

The details for the VCMPPD instruction show {sae} ONLY in the 512-bit version,
not in the 128- or 256-bit EVEX-encoded versions.

The reason I am a bit doubtful of the doc is that there are some
inconsistencies, indicating that Intel may have rushed the document out to

First of all, it's strange that only the 512-bit vector instructions allow
{sae}, even though it says that all vector lengths are supported, and that the
L'L encodes the vector length.

Another thing is that in the Exceptions Type E3, which includes vcmpsd, it says
that there is a #UD if EVEX.b = 1.  This is clearly wrong, since section 4.6.3
says that SAE is encoded by setting EVEX.b, for scalar as well as vector
instructions.  It lists the encoding as LIG, meaning L'L is ignored; that's
fine, and consistent with the lack of Exceptions Type E3 conditions for L'L.


I think the only real way to resolve this is to get hold of one of the new
Intel processors that supports AVX-512, and try running these instructions to
see if you get a #UD.  And if vcmppd runs with L'L = 00b or 01b, see if it
actually compares the entire zmm registers or only the xmm/ymm registers, resp.

If you do indeed get a #UD with L'L = 00b, as produced by gas, then this is a
critical bug.
