HLO Example

Before HLO

for (j=1; j<=1000; j++) {
   y(j) = y(j) + a*x(j)
}

After HLO (loop unrolling)

for (j=1; j<=1000; j+=2) {
   y(j) = y(j) + a*x(j)
   y(j+1) = y(j+1) + a*x(j+1)
}

The loop before HLO uses ldf (load floating-point), an Itanium® architecture assembly instruction that loads a single floating-point value into a register.

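The unrolled loop after HLO uses ldfp (load floating-point pair), an Itanium® architecture assembly instruction that loads two floating-point values into two registers simultaneously. Because each unrolled iteration touches adjacent elements, such as x(j) and x(j+1), one ldfp can fetch both.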
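
The same transformation can be written out by hand in C to make the mechanics explicit. This is a minimal sketch and not the compiler's actual output: the function names axpy_rolled and axpy_unrolled2 are hypothetical, the arrays are zero-based as usual in C, and the trailing remainder loop is an added assumption to cover odd trip counts. Whether a compiler then emits paired loads for the unrolled body depends on the target and on data alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Reference loop: one element per iteration (the "before" form). */
void axpy_rolled(size_t n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    for (size_t j = 0; j < n; j++) {
        y[j] = y[j] + a * x[j];
    }
}

/* Hand-unrolled by 2 (the "after" form). Names are illustrative only. */
void axpy_unrolled2(size_t n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    size_t j = 0;

    /* Main loop: two adjacent elements per iteration. */
    for (; j + 1 < n; j += 2) {
        y[j]     = y[j]     + a * x[j];
        y[j + 1] = y[j + 1] + a * x[j + 1];
    }

    /* Remainder loop: handles the last element when n is odd. */
    for (; j < n; j++) {
        y[j] = y[j] + a * x[j];
    }
}
```

Both functions should produce identical results for any n. The unrolled form halves the number of loop-control branches and exposes the adjacent accesses that a paired load can serve.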