Overview: Parallelization with OpenMP*

The Intel® C++ Compiler supports the OpenMP* C++ version 2.0 API specification. OpenMP provides symmetric multiprocessing (SMP) with the following major features:

The Intel C++ Compiler generates multithreaded code based on where you place OpenMP directives in the source program, which makes it easy to add threading to existing software. The Intel compiler supports all of the current industry-standard OpenMP directives, except WORKSHARE, and compiles parallel programs annotated with OpenMP directives. In addition, the Intel C++ Compiler provides Intel-specific extensions to the OpenMP C++ version 2.0 specification, including run-time library routines and environment variables.
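As a minimal sketch of how a run-time library routine and an environment variable interact, the function below queries the thread count through the standard omp_get_max_threads() routine, which reflects the standard OMP_NUM_THREADS environment variable when it is set. The function name available_threads is hypothetical, and the example uses only standard OpenMP facilities; it does not show any of the Intel-specific extensions.

```cpp
#ifdef _OPENMP
#include <omp.h>   // standard OpenMP run-time library header
#endif

// Hypothetical helper (not part of the Intel documentation): returns the
// number of threads a parallel region could use.
int available_threads()
{
#ifdef _OPENMP
    // Standard OpenMP routine; its result reflects OMP_NUM_THREADS if set.
    return omp_get_max_threads();
#else
    // OpenMP is not enabled for this compilation, so the code is serial.
    return 1;
#endif
}
```

The _OPENMP macro is defined by the compiler only when OpenMP support is enabled, so the same source also builds as a serial program.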

Note

As with many advanced features of compilers, you must properly understand the functionality of the OpenMP directives in order to use them effectively and avoid unwanted program behavior.

See the parallelization options summary for all of the OpenMP-related options in the Intel C++ Compiler.

For complete information on the OpenMP standard, visit the OpenMP Web site at http://www.openmp.org. For OpenMP* C++ version 2.0 API specifications, see http://www.openmp.org/specs/.

Parallel Processing with OpenMP

To compile with OpenMP, you first prepare your program by annotating the code with OpenMP directives. The Intel C++ Compiler processes the application and produces a multithreaded version of the code, which is then compiled. The output is an executable program in which the parallelism is implemented by threads that execute parallel regions or constructs.
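The annotation step can be illustrated with a short sketch. The function dot_product below is a hypothetical example, not taken from the Intel documentation; it marks a loop with the standard parallel for directive and a reduction clause so that the iterations can be divided among threads.

```cpp
#include <vector>

// Hypothetical helper: dot product of two vectors of equal length.
double dot_product(const std::vector<double>& a, const std::vector<double>& b)
{
    double sum = 0.0;
    // A signed integer loop index is used, as OpenMP 2.0 requires.
    const int n = static_cast<int>(a.size());

    // The directive asks the compiler to split the loop iterations among
    // threads. reduction(+:sum) gives each thread a private partial sum
    // and combines the partial sums when the loop finishes.
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

If OpenMP support is not enabled at compile time, the directive is ignored and the loop runs serially with the same result. See the parallelization options summary for the option that enables OpenMP.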

Targeting a Processor Run-time Check

While parallelizing a loop, the Intel compiler's loop parallelizer, OpenMP, tries to determine the optimal set of configurations for a given processor. At run time, a check is performed to determine which IA-32 processor OpenMP should optimize a given loop for. For details, see Processor-specific Runtime Checks, IA-32 Systems.

Performance Analysis

To analyze the performance of your program, you can use the Intel® VTune(TM) Performance Analyzer. It provides detailed information about which portions of the code take the most time to execute and where parallel performance problems are located.