Example Auto-Parallelization

First, try auto-parallelization with the -parallel option:
% ifc -parallel -par_threshold0 Integrate.F90

To get messages about what was and was not parallelized, add -par_report3:
% ifc -parallel -par_threshold0 -par_report3 Integrate.F90

The compiler will not auto-parallelize loops that contain function calls; adding -ip lets it inline those calls where it can:
% ifc -parallel -par_threshold0 -par_report3 -ip Integrate.F90
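
The three compile lines above build on each other, so they can be collected into one small script. This is only a sketch: the `run` helper, the `SRC` variable, and the `DRY_RUN` switch are illustrative assumptions and not part of ifc. With `DRY_RUN=1` (the default here) the commands are printed but not executed.

```shell
#!/bin/sh
# Sketch: step through the auto-parallelization compile lines in order.
# SRC and DRY_RUN are hypothetical conveniences for this example.
SRC=${SRC:-Integrate.F90}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only, 0 = also execute them

# Print a command in the "% ..." style used above, then optionally run it.
run() {
  echo "% $*"
  if [ "$DRY_RUN" -eq 0 ]; then "$@"; fi
}

auto_parallel_steps() {
  run ifc -parallel -par_threshold0 "$SRC"                    # plain auto-parallel
  run ifc -parallel -par_threshold0 -par_report3 "$SRC"       # plus report
  run ifc -parallel -par_threshold0 -par_report3 -ip "$SRC"   # plus inlining
}

auto_parallel_steps
```

Set `DRY_RUN=0` on a machine where ifc is installed to actually run the three builds and compare the -par_report3 output with and without -ip.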

Optimization

  • -ip, -ipo - enable interprocedural optimization (-ip within a single file, -ipo across files), such as inlining, constant propagation, and passing arguments in registers

  • -mp - maintain floating-point precision regardless of the optimization level

  • -nolib_inline - turns off intrinsic inlining

  • -prefetch - insert prefetch instructions to reduce cache misses

  • -unroll - unroll loops

  • -O[level] - set optimization level (the default is -O2)

    • 0 - do not perform any optimization

    • 1 -

    • 2 - turns on intrinsic inlining and enables loop unrolling, register allocation, software pipelining, etc.

    • 3 - aggressive optimization for maximum speed, but no performance gain guaranteed.

  • -tpp[5,6,7] - optimize code for a specific processor, e.g. -tpp7 for Pentium 4 and Xeon

Parallelization

  • -parallel - autoparallelize code

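  • -par_threshold[n] - set the threshold (n = 0 to 100) for auto-parallelizing loops; n=0 parallelizes every loop the compiler safely can, while n=100 parallelizes only loops it is sure will benefit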

  • -openmp - recognize OpenMP directives

  • -xW, -axW - enable the vectorizer
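
As a rough sketch of how the -openmp option is used in practice: the program is built with OpenMP directives enabled, and the number of threads is then chosen at run time through the standard OpenMP environment variable OMP_NUM_THREADS. The executable name `integrate`, the thread count of 4, and the `run`/`DRY_RUN` helper are assumptions made for this example only.

```shell
#!/bin/sh
# Sketch: build with OpenMP directives enabled, then run with a chosen thread count.
# EXE, THREADS, and DRY_RUN are hypothetical settings for illustration.
SRC=${SRC:-Integrate.F90}
EXE=${EXE:-integrate}
THREADS=${THREADS:-4}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only, 0 = also execute them

run() {
  echo "% $*"
  if [ "$DRY_RUN" -eq 0 ]; then "$@"; fi
}

openmp_build_and_run() {
  # Compile, asking for a report on how the OpenMP directives were handled.
  run ifc -openmp -openmp_report2 -o "$EXE" "$SRC"
  # OMP_NUM_THREADS is the standard OpenMP variable for the thread count.
  run env OMP_NUM_THREADS="$THREADS" "./$EXE"
}

openmp_build_and_run
```

Timing the run at several values of `THREADS` is a simple way to check whether the parallel regions actually scale.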

Profiling

  • -prof_gen - instrument the code so that running it generates profile data

  • -prof_use - use the generated profile data to guide optimization when the code is recompiled

  • -par_report[0, 1, 2, 3] - provide a report about autoparallelization

  • -openmp_report[0, 1, 2] - provide a report about OpenMP directives

  • -vec_report[0, 1, 2, 3, 4, 5] - provide a report on vectorization
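
The -prof_gen and -prof_use options listed above are normally used together as a three-step cycle: build an instrumented executable, run it on representative input so that it writes profile data, then rebuild with the profile. A minimal sketch follows; the executable name `integrate` and the `run`/`DRY_RUN` helper are assumptions for illustration, and the program is assumed to need no command-line arguments.

```shell
#!/bin/sh
# Sketch: profile-guided build cycle with -prof_gen and -prof_use.
# EXE and DRY_RUN are hypothetical settings for this example.
SRC=${SRC:-Integrate.F90}
EXE=${EXE:-integrate}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only, 0 = also execute them

run() {
  echo "% $*"
  if [ "$DRY_RUN" -eq 0 ]; then "$@"; fi
}

pgo_cycle() {
  run ifc -prof_gen -o "$EXE" "$SRC"   # 1. instrumented build
  run "./$EXE"                         # 2. representative run writes profile data
  run ifc -prof_use -o "$EXE" "$SRC"   # 3. rebuild using the collected profile
}

pgo_cycle
```

The quality of the final build depends on step 2: the profiling run should exercise the same code paths as the real workload.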