Vectorization Support (IA-32)

The vector directives control the vectorization of the loop that immediately follows them. The compiler does not apply a directive to nested loops, so each nested loop needs its own directive preceding it. You must place the directive directly before the loop control statement.
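The placement rule can be sketched as follows. This is a minimal illustration, not code from the compiler documentation: the function name add_one and the row-by-row matrix layout are hypothetical, and the preprocessor guard around the directive is an assumption added so that other compilers skip it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds 1.0f to every element of a rows x cols
   matrix stored row by row in m. */
void add_one(float *m, int rows, int cols)
{
    int i, j;

    assert(m != NULL && rows >= 0 && cols >= 0);

    /* A directive placed here would apply to the i loop only. */
    for (i = 0; i < rows; i++)
    {
        /* The j loop gets its own directive, placed directly before
           its loop control statement. The guard is an assumption so
           that non-Intel compilers do not see the directive. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma vector always
#endif
        for (j = 0; j < cols; j++)
        {
            m[i * cols + j] += 1.0f;
        }
    }
}
```

The directive changes only how the compiler treats the inner loop; the result of the function is the same with or without it.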

vector always Directive

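The vector always directive instructs the compiler to override its efficiency heuristics when deciding whether to vectorize a loop. With this directive, the compiler vectorizes loops that it would otherwise consider unprofitable, such as loops with non-unit strides or badly unaligned memory accesses.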

Example of vector always Directive

#pragma vector always
for(i=0; i<=N; i++)
{
    a[32*i]=b[99*i];
}
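A self-contained variant of the strided loop above might look like the sketch below. The function strided_copy and its parameters are hypothetical names introduced for illustration, and the preprocessor guard is an assumption added so that compilers without this directive skip it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies n elements from b (stride sb)
   to a (stride sa). Both strides are non-unit in typical use. */
void strided_copy(float *a, int sa, const float *b, int sb, int n)
{
    int i;

    assert(a != NULL && b != NULL && sa > 0 && sb > 0);

    /* Ask the compiler to vectorize even though the strided accesses
       would normally be judged unprofitable. The guard is an
       assumption, not part of the directive itself. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma vector always
#endif
    for (i = 0; i < n; i++)
    {
        a[sa * i] = b[sb * i];
    }
}
```

The directive affects only the profitability decision; the values copied are the same whether or not the loop is vectorized.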

ivdep Directive

The ivdep directive instructs the compiler to ignore assumed vector dependences. To ensure correct code, the compiler normally treats an assumed dependence as a proven dependence, which prevents vectorization. This directive overrides that decision. Use ivdep only when you know that the assumed loop dependences are safe to ignore. The loop in the example below will not vectorize without the ivdep directive, since the value of k is not known (vectorization would be illegal if k<0).

Example of ivdep Directive

#pragma ivdep
for(i=0; i<m; i++)
{
    a[i]=a[i+k]*c;
}
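The condition on k can be made explicit in code. The sketch below is an illustration under stated assumptions: scale_shifted is a hypothetical name, the assertion stands in for whatever guarantee the caller actually has, and the preprocessor guard is added so that other compilers skip the directive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a[i] = a[i+k] * c for i in [0, m).
   The array must hold at least m + k elements. */
void scale_shifted(float *a, int m, int k, float c)
{
    int i;

    /* With k >= 0 every iteration reads an element that no earlier
       iteration has written, so ignoring the assumed dependence is
       safe. With k < 0 it would not be. */
    assert(a != NULL && k >= 0);

#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma ivdep
#endif
    for (i = 0; i < m; i++)
    {
        a[i] = a[i + k] * c;
    }
}
```

Checking the precondition at the call boundary keeps the promise made by ivdep next to the loop that relies on it.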

vector aligned Directive

The vector aligned directive means the loop should be vectorized, if it is legal to do so, ignoring the normal heuristic decisions about profitability. When the aligned or unaligned qualifier is used, the loop is vectorized using aligned or unaligned operations, respectively. Specify either aligned or unaligned, but not both.

Caution

If you specify aligned as an argument, you must be absolutely sure that the loop can be vectorized using aligned instructions. Otherwise, the compiler will generate incorrect code.

The loop in the example below uses the aligned qualifier to request that the loop be vectorized with aligned instructions, because the arrays are declared in such a way that the compiler could not normally prove that this is safe.

Example of vector aligned Directive

void foo(float *a)
{
    #pragma vector aligned
    for(i=0; i<m; i++)
    {
        a[i]=a[i]*c;
    }
}
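Because a wrong alignment promise leads to incorrect code, it can help to check the promise before the loop runs. The sketch below assumes 16-byte alignment, matching the 16-byte boundary used in the loop peeling example in this section. The names is_aligned16 and scale are hypothetical, and the preprocessor guard is an assumption so that other compilers skip the directive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p lies on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 0x0f) == 0;
}

/* Hypothetical helper: multiplies the first m elements of a by c. */
void scale(float *a, int m, float c)
{
    int i;

    /* The directive below is only valid if this holds. */
    assert(is_aligned16(a));

#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma vector aligned
#endif
    for (i = 0; i < m; i++)
    {
        a[i] = a[i] * c;
    }
}
```

A caller would typically declare the array with an explicit alignment, for example `_Alignas(16) float buf[8];` in C11, and pass its base address rather than a pointer into the middle of the array.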

The compiler includes several alignment strategies for cases where the alignment of data structures is not known at compile time. A simple example is shown below, but several other strategies are supported as well. If, in the loop shown below, the alignment of a is unknown, the compiler generates a prelude loop that iterates until the most frequently occurring array reference hits an aligned address. This makes the alignment properties of a known, and the vector loop is optimized accordingly.

Example of Alignment Strategies

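float *a;

//Original loop: alignment of a unknown
for(i=0; i<100; i++)
{
    a[i]=a[i]+1.0f;
}

//Dynamic loop peeling, as generated by the compiler
//p is the byte offset of a from the previous 16-byte boundary
p=(uintptr_t)a & 0x0f;
if(p!=0)
{
    //Number of 4-byte elements before the next 16-byte boundary
    p=(16-p)/4;
    for(i=0; i<p; i++)
    {
        a[i]=a[i]+1.0f;
    }
}

//Loop with a aligned.
//Will be vectorized accordingly.
for(i=p; i<100; i++)
{
    a[i]=a[i]+1.0f;
}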
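The peeling arithmetic can be written out by hand to see how many scalar iterations the prelude runs. This sketch only mirrors the strategy described above and is not the compiler's actual generated code. It assumes 4-byte floats and a 16-byte boundary, and the names peel_count and add_one_peeled are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of leading elements to process in the
   scalar prelude so that a + result is 16-byte aligned (capped at n). */
int peel_count(const float *a, int n)
{
    int p = (int)((uintptr_t)a & 0x0f);  /* byte offset in 16-byte block */

    if (p != 0)
    {
        p = (16 - p) / 4;                /* 4-byte elements to boundary */
    }
    return p < n ? p : n;
}

/* Hypothetical helper: adds 1.0f to n elements using a peeled loop. */
void add_one_peeled(float *a, int n)
{
    int i, p;

    /* The arithmetic assumes a is at least 4-byte aligned. */
    assert(n >= 0 && ((uintptr_t)a & 0x03) == 0);

    p = peel_count(a, n);

    /* Prelude loop: runs until the next 16-byte boundary. */
    for (i = 0; i < p; i++)
    {
        a[i] = a[i] + 1.0f;
    }

    /* Main loop: a + p is 16-byte aligned whenever p < n. */
    for (i = p; i < n; i++)
    {
        a[i] = a[i] + 1.0f;
    }
}
```

For a pointer that sits 4 bytes past a 16-byte boundary, the prelude runs (16-4)/4 = 3 iterations before the main loop starts on an aligned address.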

novector Directive

The novector directive specifies that the loop should never be vectorized, even if it is legal to do so. In the example below, suppose you know that the trip count (ub - lb) is too low to make vectorization worthwhile. You can use novector to tell the compiler not to vectorize, even if the loop is considered vectorizable.

Example of novector Directive

void foo(int lb, int ub)
{
    #pragma novector
    for(j=lb; j<ub; j++)
    {
        a[j]=a[j]+b[j];
    }
}
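A self-contained version of this loop, with the arrays passed as parameters, could look like the sketch below. The name add_range and the extra parameters are hypothetical, and the preprocessor guard is an assumption so that other compilers skip the directive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a[j] += b[j] for j in [lb, ub).
   Intended for ranges known to be short. */
void add_range(float *a, const float *b, int lb, int ub)
{
    int j;

    assert(a != NULL && b != NULL);

    /* The trip count ub - lb is assumed to be small, so the loop is
       kept scalar even though it is vectorizable. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma novector
#endif
    for (j = lb; j < ub; j++)
    {
        a[j] = a[j] + b[j];
    }
}
```

The directive changes only the generated code, not the result; elements outside [lb, ub) are left untouched either way.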