Manual CPU Dispatch (IA-32 only)

Use __declspec(cpu_specific) and __declspec(cpu_dispatch) in your code to generate instructions specific to the Intel processor on which the application is running, and also to execute correctly on other IA-32 processors.

Note

Manual CPU dispatch cannot be used to recognize Intel® Itanium® processors.

The syntax of these extended attributes is as follows:

cpu_specific (cpuid)
cpu_dispatch (cpuid-list)

The values for cpuid and cpuid-list are shown in the tables below:

Processor                                           Values for cpuid
--------------------------------------------------  -----------------------
x86 processors not provided by Intel Corporation    generic
Intel® Pentium® processors                          pentium
Intel Pentium processors with MMX™ Technology       pentium_mmx
Intel Pentium Pro processors                        pentium_pro
Intel Pentium II processors                         pentium_ii
Intel Pentium III processors                        pentium_iii
Intel Pentium III (exclude xmm registers)           pentium_iii_no_xmm_regs
Intel Pentium 4 processors                          pentium_4
Intel Pentium M processors                          pentium_m
Intel processors code-named "Prescott"              future_cpu_10

Values for cpuid-list
---------------------
cpuid
cpuid-list, cpuid

The attributes are not case sensitive. The body of a function declared with __declspec(cpu_dispatch) must be empty, and is referred to as a stub (an empty-bodied function).

Use the following guidelines to implement manual processor dispatch support:

  1. Stub for cpu_dispatch must have a cpuid defined in cpu_specific elsewhere
    If the cpu_dispatch stub for a function f contains the cpuid p, then a cpu_specific definition of f with cpuid p must appear somewhere in the program; otherwise an unresolved external error is reported. A cpu_specific function definition need not appear in the same translation unit as the corresponding cpu_dispatch stub, unless the cpu_specific function is declared static. The inline attribute is disabled for all cpu_specific and cpu_dispatch functions.
  2. Must have a stub for cpu_specific function
    If a function f is defined as __declspec(cpu_specific(p)), then a cpu_dispatch stub for f must also appear within the program, and p must be in the cpuid-list of that stub. Otherwise, that cpu_specific definition can never be called, and no error is reported.
  3. Overrides command line settings
    When a cpu_dispatch stub is compiled, its body is replaced with code that determines the processor on which the program is running, then dispatches to the "best" cpu_specific implementation available, as defined by the cpuid-list. Each cpu_specific function is optimized for its specified Intel processor regardless of command-line option settings.
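
The selection step described in guideline 3 can be pictured in portable C. This is only a conceptual sketch, not the code the compiler generates: the cpu_rank ordering, the impl_entry table, and the select_best and select_for helpers are all hypothetical names introduced for illustration, and the assumption that processors form a simple linear ranking is ours rather than something stated above.

```c
#include <stddef.h>

/* Hypothetical processor ranks, ordered from least to most capable.
   These names are illustrative and are not the compiler's cpuid values. */
typedef enum { CPU_GENERIC = 0, CPU_PENTIUM, CPU_PENTIUM_MMX, CPU_PENTIUM_III } cpu_rank;

typedef int (*impl_fn)(int);

/* One entry per cpu_specific definition: the processor it targets and its code. */
typedef struct {
    cpu_rank target;
    impl_fn  fn;
} impl_entry;

static int impl_pentium(int x)     { return x + 1; }  /* stands in for cpu_specific(pentium) */
static int impl_pentium_mmx(int x) { return x + 2; }  /* stands in for cpu_specific(pentium_mmx) */

/* Plays the role of the cpuid-list in a cpu_dispatch stub. */
static const impl_entry dispatch_list[] = {
    { CPU_PENTIUM,     impl_pentium },
    { CPU_PENTIUM_MMX, impl_pentium_mmx },
};

/* Return the most capable implementation whose target does not exceed the
   detected processor, or NULL if no entry is usable. */
static impl_fn select_best(const impl_entry *list, size_t n, cpu_rank detected)
{
    impl_fn best = NULL;
    int best_rank = -1;
    for (size_t i = 0; i < n; i++) {
        if (list[i].target <= detected && (int)list[i].target > best_rank) {
            best = list[i].fn;
            best_rank = (int)list[i].target;
        }
    }
    return best;
}

/* Convenience wrapper around the example list above. */
static impl_fn select_for(cpu_rank detected)
{
    return select_best(dispatch_list,
                       sizeof dispatch_list / sizeof dispatch_list[0],
                       detected);
}
```

In this model, a processor newer than every entry in the list still receives the most capable listed implementation, while a processor older than every entry receives none. How the real dispatch code treats that last case is not specified here.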

Processor Dispatch Example

Here is an example of how these features can be used:

#include <stddef.h>     /* size_t */
#include <mmintrin.h>   /* __m64, _mm_add_pi16, _mm_empty */

/* Pentium processor function does not use intrinsics to add two arrays. */
__declspec(cpu_specific(pentium))
void array_sum(unsigned short *result, unsigned short const *a,
               unsigned short const *b, size_t length)
{
   for (; length > 0; length--)
      *result++ = *a++ + *b++;
}

/* Implementation for a Pentium processor with MMX technology uses
   an MMX instruction intrinsic to add four 16-bit elements simultaneously. */
__declspec(cpu_specific(pentium_mmx))
void array_sum(unsigned short *result, unsigned short const *a,
               unsigned short const *b, size_t length)
{
   __m64 *mmx_result = (__m64 *)result;
   __m64 const *mmx_a = (__m64 const *)a;
   __m64 const *mmx_b = (__m64 const *)b;

   for (; length > 3; length -= 4)
      *mmx_result++ = _mm_add_pi16(*mmx_a++, *mmx_b++);

   /* The following code, which takes care of excess elements, is not
      needed if the array sizes passed are known to be multiples of four. */
   result = (unsigned short *)mmx_result;
   a = (unsigned short const *)mmx_a;
   b = (unsigned short const *)mmx_b;

   for (; length > 0; length--)
      *result++ = *a++ + *b++;

   _mm_empty();   /* Clear the MMX state before any later floating-point code. */
}

/* The stub must have the same signature as the cpu_specific definitions. */
__declspec(cpu_dispatch(pentium, pentium_mmx))
void array_sum(unsigned short *result, unsigned short const *a,
               unsigned short const *b, size_t length)
{
   /* Empty function body informs the compiler to generate the
      CPU-dispatch function listed in the cpu_dispatch clause. */
}
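
The MMX version relies on a common pattern: process four 16-bit elements per iteration, then finish any leftover elements with a scalar tail loop. The following portable C sketch, which needs neither MMX nor the Intel-specific attributes, checks that this blocked structure produces the same results as the plain loop. The names sum_scalar, sum_blocked, and versions_agree are hypothetical helpers introduced for this illustration, and the inner four-element loop merely stands in for a single _mm_add_pi16 call.

```c
#include <stddef.h>

/* Scalar reference: adds element by element, with 16-bit wraparound. */
static void sum_scalar(unsigned short *result, const unsigned short *a,
                       const unsigned short *b, size_t length)
{
    for (; length > 0; length--)
        *result++ = (unsigned short)(*a++ + *b++);
}

/* Blocked version: four elements per iteration, then a scalar tail loop. */
static void sum_blocked(unsigned short *result, const unsigned short *a,
                        const unsigned short *b, size_t length)
{
    for (; length > 3; length -= 4) {
        for (int k = 0; k < 4; k++)          /* stands in for one _mm_add_pi16 */
            result[k] = (unsigned short)(a[k] + b[k]);
        result += 4;
        a += 4;
        b += 4;
    }
    for (; length > 0; length--)             /* excess elements */
        *result++ = (unsigned short)(*a++ + *b++);
}

/* Returns 1 if both versions agree when summing the first n elements
   (n must not exceed 16), otherwise 0. */
static int versions_agree(size_t n)
{
    unsigned short a[16], b[16], r1[16] = {0}, r2[16] = {0};
    for (size_t i = 0; i < 16; i++) {
        a[i] = (unsigned short)(i * 4001u);
        b[i] = (unsigned short)(65535u - i);
    }
    sum_scalar(r1, a, b, n);
    sum_blocked(r2, a, b, n);
    for (size_t i = 0; i < 16; i++)
        if (r1[i] != r2[i])
            return 0;
    return 1;
}
```

Calling versions_agree with lengths such as 0, 3, 7, and 16 exercises the empty case, a tail-only case, a mixed case, and an exact multiple of four.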