Optimization Options

The optimization options let you specify how to optimize your applications for speed, particular processors, code size, and so forth.

For more information about optimization, see "Compiler Optimizations Overview" and related sections in the Intel Fortran User's Guide for Linux Volume II: Optimizing Applications.

See also Floating-Point Options.

Descriptions of Optimization Options

-assume [no]buffered_io

Default: -assume nobuffered_io (buffer is flushed as each record is written)

Specifies whether each record is flushed to disk as soon as it is written or whether records accumulate in the buffer. If you specify -assume buffered_io, records accumulate in the buffer.

For disk devices, -assume buffered_io (or the equivalent OPEN statement BUFFERED='YES' specifier or the FORT_BUFFERED run-time environment variable) requests that the internal buffer be filled, possibly by many record output statements (WRITE), before the Fortran run-time system writes it to disk. If a file is opened for direct access, I/O buffering is ignored.

Using buffered writes usually makes disk I/O more efficient by writing larger blocks of data to the disk less often. However, if you request buffered writes, records not yet written to disk may be lost in the event of a system failure.

Unless you set the FORT_BUFFERED environment variable to true, the default is BUFFERED='NO' and -assume nobuffered_io for all I/O, in which case the Fortran run-time system empties its internal buffer for each WRITE (or similar record output statement).

The OPEN statement BUFFERED specifier applies to a specific logical unit. In contrast, the -assume [no]buffered_io option and the FORT_BUFFERED environment variable apply to all Fortran units.
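As an illustration of the per-unit form, the following minimal sketch requests buffered writes for a single unit through the OPEN statement. The unit number, file name, and record count are arbitrary choices made for this example. BUFFERED= is the Intel Fortran specifier described above and is not part of standard Fortran, so other compilers may reject it.

```fortran
program buffered_write_demo
  implicit none
  integer :: i

  ! BUFFERED='YES' asks the run-time system to fill its internal
  ! buffer before writing to disk. Unit 10 and 'records.dat' are
  ! placeholders chosen for this sketch.
  open (unit=10, file='records.dat', status='replace', &
        form='formatted', buffered='yes')

  do i = 1, 10000
     ! With buffering enabled, these records accumulate in the
     ! buffer instead of being flushed one at a time.
     write (10, '(I8)') i
  end do

  ! Closing the unit writes any records still held in the buffer.
  close (10)
end program buffered_write_demo
```

Omitting the BUFFERED specifier and instead compiling with -assume buffered_io, or setting FORT_BUFFERED to true at run time, would request the same behavior for every unit rather than for unit 10 alone.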

-auto_ilp32 (Itanium-based systems only)

Default: Off

Allows the compiler to use 32-bit pointers whenever possible as long as the application does not exceed a 32-bit address space.

Because this optimization requires interprocedural analysis over the whole program, you must use this option with the -ipo option.

Using this option on programs that exceed 32-bit address space may cause unpredictable results during program execution.

-ax{K|W|N|B|P} (IA-32 systems only)

Default: None.

Directs the compiler to find opportunities to generate separate versions of functions that take advantage of features that are specific to the specified Intel® processor.

If the compiler finds such an opportunity, it first checks whether generating a processor-specific version of a function is likely to result in a performance gain. If this is the case, the compiler generates both a processor-specific version of a function and a generic version of the function. The generic version will run on any IA-32 processor.

At run time, one of the versions is chosen to execute, depending on the Intel processor in use. In this way, the program can benefit from performance gains on more advanced Intel processors, while still working properly on older IA-32 processors.

Possible values and the processors the code is optimized for are:

-complex_limited_range[-]

Default: Off (-complex_limited_range-)

Enables the use of basic algebraic expansions of some arithmetic operations involving data of type COMPLEX. This can result in performance improvements in programs that use a lot of COMPLEX arithmetic. However, values at the extremes of the exponent range might not compute correctly.
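The following sketch suggests why values near the extremes of the exponent range can be a problem. It compares the built-in COMPLEX division with a hand-written textbook expansion of the quotient. The hand-written formula is an assumption made for illustration only; the expansions the compiler actually uses are not documented here and may differ.

```fortran
program complex_range_demo
  implicit none
  complex :: z, w, q_builtin, q_naive
  real    :: a, b, c, d, denom

  z = (1.0e30, 1.0e30)
  w = (1.0e30, 1.0e30)

  ! Division as the compiler performs it under the options in effect.
  q_builtin = z / w

  ! Hand-written textbook expansion (illustrative assumption):
  !   (a + bi)/(c + di) = ((ac + bd) + (bc - ad)i) / (c**2 + d**2)
  a = real(z)
  b = aimag(z)
  c = real(w)
  d = aimag(w)

  ! c*c + d*d is about 2e60, which exceeds the single-precision range
  ! if the intermediate result is held in single precision.
  denom   = c*c + d*d
  q_naive = cmplx((a*c + b*d)/denom, (b*c - a*d)/denom)

  print *, 'built-in division: ', q_builtin
  print *, 'naive expansion:   ', q_naive
end program complex_range_demo
```

The mathematically exact quotient is 1. Whether the naive expansion reproduces it depends on the compiler, the options in effect, and whether the hardware keeps intermediate results in a wider format, so no particular output is claimed here.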

-f[no-]alias

Default: -falias

Specifies that aliasing should be assumed in the program.

See also -f[no-]fnalias.

-f[no-]fnalias

Default: -ffnalias

Specifies that aliasing should be assumed within functions. The -fno-fnalias option specifies that aliasing should not be assumed within functions, but should be assumed across calls.

See also -f[no-]alias.

-fast

Default: Off

Provides a shortcut method to enable several optimizations for run-time performance.

The -fast option sets the following options to improve performance:

To get the best possible performance, you might need to use the option in conjunction with an architecture-specific option such as -xN.

To override one of the options set by -fast, specify that option after the -fast option on the command line.

Note

The set of options enabled by -fast might change from release to release.

-fnsplit[-] (Itanium®-based systems only)

Default: Off

Enables function splitting if -prof_use is also enabled. (This option has no effect if -prof_use is not enabled.)

This option is automatically enabled if you use -prof_use.

To turn off function splitting, use -fnsplit-. (However, function grouping will continue to be enabled.)

See also these topics in Volume II:
Basic PGO Options
Example of Profile-Guided Optimization

-fp (IA-32 systems only)

Default: On

Disables the use of ebp as a general-purpose register.

Most debuggers expect ebp to be used as a stack frame pointer, and cannot produce a stack backtrace unless this is so. This option preserves ebp as the frame pointer by excluding it from use in optimizations, which lets the debugger produce a stack backtrace.

-gp

Default: Off

Alternate syntax: -p

Compiles and links for function profiling with the gprof tool.

-ip

Default: Off

Enables single-file interprocedural optimizations.

Enhances inline function expansion.

See also this topic in Volume II: "Using -ip with -option Specifiers."

-ip_no_inlining

Default: Off

Disables interprocedural inlining that results from the -ip or -ipo interprocedural optimizations, but has no effect on other interprocedural optimizations. Requires -ip or -ipo.

-ip_no_pinlining (IA-32 systems only)

Default: Off

Disables partial inlining. Requires -ip or -ipo.

-ipo

Default: Off

Enables Whole Program Optimization (WPO), which is the same as multifile interprocedural optimization, or multifile IPO. All objects over the entire program are compiled.

See also these topics in Volume II:
Multifile IPO Overview
Creating a Multifile IPO Executable with xilink
Using -Qip with -Qoption Specifiers

-ipo_c

Default: Off

Optimizes across files and produces a multifile object file. Stops prior to the final link stage, leaving an optimized object file.

See also this topic in Volume II: Analyzing the Effects of Multifile IPO.

-ipo_obj

Default: Off

Forces the generation of real object files. Requires -ipo. See also this topic in Volume II: Compilation with Real Object Files.

-ipo_S

Default: Off

Optimizes across files and produces a multifile assembly file. Performs the same optimizations as -ipo, but stops prior to the final link stage, leaving an optimized assembly file. The default listing name is ipo_out.s.

See also this topic in Volume II: Analyzing the Effects of Multifile IPO.

-ivdep_parallel (Itanium®-based systems only)

Default: Off

Specifies that there is no loop-carried memory dependency in the loop where an IVDEP directive is specified. This technique is useful for some sparse matrix applications.

See also this topic in Volume II: Memory Dependency with the IVDEP Directive.
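A loop of the kind found in sparse matrix code might look like the sketch below. The routine and argument names are hypothetical, and the directive spelling !DEC$ IVDEP is the usual Intel Fortran form, which is assumed here rather than taken from this section.

```fortran
subroutine scatter_add(n, idx, a, b)
  implicit none
  integer, intent(in)    :: n
  integer, intent(in)    :: idx(n)
  real,    intent(inout) :: a(*)
  real,    intent(in)    :: b(n)
  integer :: i

  ! The compiler cannot prove that idx holds no repeated values, so
  ! it must assume that a(idx(i)) may depend on an earlier iteration.
  ! The directive asserts that this assumption is unnecessary. The
  ! caller remains responsible for idx actually being free of
  ! duplicates; the compiler does not check it.
!DEC$ IVDEP
  do i = 1, n
     a(idx(i)) = a(idx(i)) + b(i)
  end do
end subroutine scatter_add
```

If idx did contain repeated values, the assertion would be false and the results could be wrong, so the directive and this option are appropriate only when the index pattern is known in advance.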

-nolib_inline

Default: On

Disables inline expansion of intrinsic functions.

-On

Default: -O2 unless you specify -debug, in which case the default is -O0

Specifies the code optimization for application types. Possible values are:

On IA-32 systems, -O1, -O2, and -O are equivalent.

On Itanium-based systems, -O2 and -O are equivalent.

Note

The last -On option specified on the command line takes precedence over any others.

-Obn

Default: -Ob1 (unless -Od is specified, in which case -Ob0 is the default)

Specifies the level of inline function expansion. Inlining procedures can greatly improve the run-time performance for certain programs.

Possible values for n are:

-opt_report

Default: Off

Generates an optimization report to stderr.

See also this topic in Volume II: "Optimizer Report Generation."

-opt_report_file file

Default: Off

Generates an optimization report and specifies the file name for the report. You do not need to specify -opt_report if you use this option.

See also this topic in Volume II: "Optimizer Report Generation."

-opt_report_help

Default: Off

Displays the optimization phases available for reporting.

See also this topic in Volume II: "Optimizer Report Generation."

-opt_report_level {min|med|max}

Default: -opt_report_level min

Specifies the detail level of the optimization report.

See also this topic in Volume II: "Optimizer Report Generation."

-opt_report_phase phase

Default: Off

Specifies the optimization phase to generate the report for. This option can be specified multiple times on the command line to report on multiple phases.

See also this topic in Volume II: "Optimizer Report Generation."

-opt_report_routine [routine]

Default: Off

Generates reports for all routines whose names contain the string routine.

If the optional routine is not specified, reports from all routines are generated.

See also this topic in Volume II: "Optimizer Report Generation."

-par_threshold n

Default: -par_threshold 100

Sets a threshold for the auto-parallelization of loops based on the probability of profitable parallel execution. n can be from 0 through 100.

n=0: loops are always auto-parallelized, regardless of computation work volume.

n=100: loops are auto-parallelized only if profitable parallel execution is almost certain.

See also these topics in Volume II:

Auto-Parallelization Threshold Control and Diagnostics
Auto-Parallelization Overview
Auto-Parallelization: Enabling, Options, Directives, and Environment Variables

-parallel

Default: Off

Enables the auto-parallelizer to generate multithreaded code for loops that can be safely executed in parallel. To use this option, you must also specify -O2 or -O3.

See also these topics in Volume II:
Auto-Parallelization Overview
Auto-Parallelization: Enabling, Options, Directives, and Environment Variables
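To make "loops that can be safely executed in parallel" concrete, the sketch below contrasts a loop whose iterations are independent with one that carries a dependency between iterations. The program, array names, and sizes are invented for illustration. Whether the compiler actually parallelizes a given loop also depends on its profitability analysis (see -par_threshold).

```fortran
program parallel_candidate
  implicit none
  integer, parameter :: n = 1000000
  real, allocatable :: x(:), y(:)
  integer :: i

  allocate (x(n), y(n))
  x = 1.0

  ! Each iteration reads x(i) and writes y(i) only, so no iteration
  ! depends on another. Loops of this shape are candidates for
  ! auto-parallelization.
  do i = 1, n
     y(i) = 2.0*x(i) + 1.0
  end do

  ! This loop carries a dependency from one iteration to the next
  ! (y(i) needs y(i-1)), so its iterations cannot simply be divided
  ! among threads.
  do i = 2, n
     y(i) = y(i-1) + x(i)
  end do

  print *, y(n)
end program parallel_candidate
```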

-prefetch[-] (IA-32 systems only)

Default: -prefetch (on)

Enables prefetch insertion optimization. The goal of prefetching is to reduce cache misses by providing hints to the processor about when data should be loaded into the cache. Note that -O3 must be specified for this option to work.

To disable the prefetch insertion optimization, use -prefetch-.

-prof_dir dir

Default: The directory where the program is compiled.

Specifies the directory in which the profiling output files (.dyn and .dpi) are to be created. The specified directory must already exist.

See also these topics in Volume II:
Advanced PGO Options
Specific Coding Guidelines for IA-32 Architecture

-prof_file file

Default: Source file name with extension .dyn and .dpi

Specifies the file name for the profiling summary file.

See also these topics in Volume II:
Advanced PGO Options
Specific Coding Guidelines for IA-32 Architecture

-prof_gen

Default: Off

Instruments a program for profiling to get the execution count of each basic block.

See also these topics in Volume II:
Basic PGO Options
Example of Profile-Guided Optimization

-prof_use

Default: Off

Enables use of profiling information (including function splitting and function grouping) during optimization. Instructs the compiler to produce a profile-optimized executable and merges available profiling output files into a pgopti.dpi file.

If you use this option, it automatically enables -fnsplit.

Note that there is no way to turn off function grouping if you enable it using this option.

See also these topics in Volume II:
Basic PGO Options
Example of Profile-Guided Optimization

-scalar_rep[-] (IA-32 systems only)

Default: -scalar_rep (on)

Enables scalar replacement performed during loop transformation. Requires -O3.

-tppn

Default value for IA-32 systems: -tpp7

Default value for Itanium®-based systems: -tpp2

Optimizes for a particular Intel® processor. The executable will run on other processors, but is optimized for processors noted below. Possible choices for n are:

-unroll[n]

Default: -unroll (lets the compiler decide)

Specifies the maximum number of times to unroll a loop.
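For readers unfamiliar with the transformation, the sketch below shows by hand what unrolling a loop four times amounts to. It is an illustration of the general technique, not the code the compiler generates, and the names and the unroll factor of 4 are arbitrary.

```fortran
program unroll_demo
  implicit none
  integer, parameter :: n = 10
  integer :: v(n), i, s_rolled, s_unrolled, main_end

  v = (/ (i, i = 1, n) /)

  ! Original loop: one addition per iteration.
  s_rolled = 0
  do i = 1, n
     s_rolled = s_rolled + v(i)
  end do

  ! The same loop unrolled four times by hand: four additions per
  ! trip, followed by a cleanup loop for the leftover iterations.
  s_unrolled = 0
  main_end = n - mod(n, 4)
  do i = 1, main_end, 4
     s_unrolled = s_unrolled + v(i) + v(i+1) + v(i+2) + v(i+3)
  end do
  do i = main_end + 1, n
     s_unrolled = s_unrolled + v(i)
  end do

  ! Both sums are computed over the same elements and are equal.
  print *, s_rolled, s_unrolled
end program unroll_demo
```

Unrolling reduces loop-control overhead per element at the cost of larger code, which is why a limit on the unroll count can be useful.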

Possible values are:

-x{K|W|N|B|P} (IA-32 systems only)

Default: None.

Lets you target your program to run on a specific Intel processor. The resulting code might contain unconditional use of features that are not supported on other processors.

Possible values and the processors the code is optimized for are:

To execute the program on x86 processors not provided by Intel Corporation, do not specify this option.

Caution

If a program compiled with this option is executed on a processor that lacks the specified set of instructions, it can fail with an illegal instruction exception, or display other unexpected behavior. In particular, programs compiled with -xN, -xB, or -xP will emit run-time errors if they are executed on unsupported processors.