Re: memchr2 speed, gcc

From: Brian Dessent
Subject: Re: memchr2 speed, gcc
Date: Mon, 03 Mar 2008 19:20:26 -0800

Bruno Haible wrote:

> Btw, how do you need to write code such that gcc uses the SSE3 instructions?

You mean auto-vectorization, as opposed to explicitly using the
pmmintrin.h or __builtin_foo APIs?  I think you need to specify a -march=
that names an architecture that has sse3 (or just -msse3, but that
should be implied by an appropriate -march=) as well as
-ftree-vectorize.  I think that -ftree-vectorize is enabled at -O3 but
I'm not positive.
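
As a rough illustration of the kind of loop the auto-vectorizer is meant to handle, here is a minimal sketch.  The function name add_arrays is made up for this example, and whether gcc actually emits SSE instructions for it depends on the gcc version and flags, so treat it as something to experiment with rather than a guaranteed result.  It assumes C99 (for restrict and the loop-scoped declaration).

```c
#include <stddef.h>

/* Hypothetical helper, not from gnulib: element-wise sum of two arrays.
   The restrict qualifiers promise the compiler that the arrays do not
   overlap, which removes one common obstacle to vectorizing the loop. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    /* Simple counted loop with unit stride and no branches in the body. */
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Compiling with something like "gcc -std=c99 -O3 -msse3 -ftree-vectorize -S" and reading the generated assembly for packed instructions such as addps is one way to check what the compiler did.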

Two other notes: starting with 4.2, the gcc default -mtune= is now
'generic' (instead of the old default of pentiumpro) which is meant to
be a blended tuning that is appropriate for a wide class of today's most
common architectures - Athlon, Opteron, Pentium M, Pentium 4, and Core
2.  Thus with gcc >= 4.2 you would expect to see less difference between
[no -mtune= specified] and [-mtune=athlon specified] than with older
versions given this new default.

Also, gcc >= 4.2 offers -mtune=native and -march=native, which set the
tune and arch respectively to whatever is appropriate for the host
machine, based on cpuid.
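
One way to see whether a given -march= (including -march=native) ended up enabling SSE3 is to test the __SSE3__ macro, which gcc predefines when SSE3 code generation is on.  A small sketch, with built_with_sse3 being a made-up name for this example:

```c
/* Hypothetical helper: reports whether this translation unit was
   compiled with SSE3 enabled.  gcc predefines __SSE3__ when -msse3 is
   in effect, either given directly or implied by -march=. */
int built_with_sse3(void)
{
#ifdef __SSE3__
    return 1;   /* SSE3 instructions may be generated */
#else
    return 0;   /* SSE3 was not enabled for this compilation */
#endif
}
```

Note that this reflects the flags used at compile time, not what the CPU running the program supports.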

