freetype-devel

Re: intel compiler support interest?


From: Stephen McDowell
Subject: Re: intel compiler support interest?
Date: Sat, 13 Jun 2020 23:18:35 -0400

Hi Alexei,

It's only __builtin_shuffle that's a problem.  I'm a SIMD novice at best hehe.  I played around for a good long while trying to find an equivalent shuffle intrinsic.  For now I'm just working off of the GCC examples for __builtin_shuffle: https://godbolt.org/z/gPiZQL

It's technically successful, but with a big caveat: before I can translate this to the freetype code, I need help understanding how mask={0,1,1,3} in the GCC __builtin_shuffle call gets transformed into the 212 in the emitted `pshufd xmm0, xmm0, 212`.  Look for `#define MAGIC` in the example: does anything stick out as to how that value is created?  Once we know how that is done, I can begin looking into shorts (the v82 type used in the freetype code) rather than the int used in the example code.
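For reference, 212 is consistent with the documented encoding of the `pshufd` immediate: destination lane i takes the source lane stored in bits 2i and 2i+1 of the 8-bit value.  Below is a minimal sketch of that packing, assuming mask lanes are reduced modulo 4 (as GCC documents for __builtin_shuffle on a 4-element vector).  The helper name `shuffle_mask_to_imm8` is made up for illustration and is not part of GCC, FreeType, or the Intel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pack a 4-lane shuffle mask into a pshufd-style
 * immediate.  Lane i of the result is selected by bits [2i+1:2i]. */
static int shuffle_mask_to_imm8(const int mask[4])
{
    int imm = 0;

    assert(mask != NULL);
    for (int lane = 0; lane < 4; lane++) {
        /* Keep only the low two bits: lane indices wrap modulo 4. */
        imm |= (mask[lane] & 3) << (2 * lane);
    }
    return imm;
}
```

With mask={0,1,1,3} this gives 0 + (1 << 2) + (1 << 4) + (3 << 6) = 0 + 4 + 16 + 192 = 212, which is also what `_MM_SHUFFLE(3, 1, 1, 0)` expands to (the macro takes the lanes in reverse order).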

I'm game to push a little further on it, but to be honest, adding conditional trickery for Intel will make this code more confusing.  It would have to convert between v82 and one of the _mXXXi vector types and split the shuffle (you can't call _mm_shuffle* with the v82 type).  The alternative is to leave the Intel path unvectorized: Intel users may not get the fastest possible code, but none of this code was vectorized previously anyway, so it's kind of a wash.  That said, I totally understand the desire to vectorize it if we can :)
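To give a rough idea of what the split shuffle might look like for eight 16-bit lanes, here is a sketch using the SSE2 intrinsics `_mm_shufflelo_epi16` and `_mm_shufflehi_epi16`.  The mask {0,1,1,3,4,5,5,7} is a made-up example (the godbolt mask repeated in both halves), not the one used in freetype, and the function name `shuffle_u16x8` is hypothetical.  This approach only works when the low four lanes select from the low half and the high four from the high half; a mask that crosses halves would need something else.  Plain arrays stand in for the v82 type, and there is a scalar fallback for targets without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Immediate for the per-half mask {0,1,1,3}: 0 | 1<<2 | 1<<4 | 3<<6. */
#define HALF_IMM 212

/* Hypothetical helper: out[i] = in[mask[i]] for the illustrative
 * mask {0,1,1,3, 4,5,5,7}. */
static void shuffle_u16x8(const uint16_t in[8], uint16_t out[8])
{
    assert(in != NULL && out != NULL);
#if defined(__SSE2__)
    /* Unaligned load/store avoids any alignment assumption on the arrays. */
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    v = _mm_shufflelo_epi16(v, HALF_IMM);  /* lanes 0..3, high half copied */
    v = _mm_shufflehi_epi16(v, HALF_IMM);  /* lanes 4..7, low half copied  */
    _mm_storeu_si128((__m128i *)out, v);
#else
    /* Scalar fallback with the same result. */
    static const int mask[8] = {0, 1, 1, 3, 4, 5, 5, 7};
    uint16_t tmp[8];
    for (int i = 0; i < 8; i++) {
        tmp[i] = in[mask[i]];
    }
    memcpy(out, tmp, sizeof tmp);
#endif
}
```

The immediates have to be compile-time constants, which is why the mask is baked into a macro rather than passed as an argument.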

Let me know your thoughts!

-Stephen


On Sat, Jun 13, 2020 at 2:36 PM Alexei Podtelezhnikov <apodtele@gmail.com> wrote:
On Fri, Jun 12, 2020 at 8:07 AM Stephen McDowell <svenevs.dev@gmail.com> wrote:
> I help maintain the spack package manager when I can.  Currently, users with Intel compilers cannot build / install any version after 2.7.1 due to the usage of __builtin_shuffle (for some reason Intel still doesn't support this).

Is there by any chance an equivalent intrinsic?
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#cats=Bit%20Manipulation
What about __builtin_clz that FreeType also uses?
