Performance Tools

The following are a few of the tools available on Blue Waters.

What to get

First, start an interactive job:
qsub -I -l nodes=1:ppn=32:xe -l walltime=01:00:00,advres=bwintern

Open another terminal window. In that window, copy this code into your scratch directory:
cp -r ~instr006/perf ~/scratch/

The gprof utility

The GNU profiler, gprof, shows you which parts of your program take the most execution time. Because gprof uses data gathered while your program runs, how you run the program affects the information in the profile. If you don't exercise some feature of your program during the run, no profile information is generated for that feature.

To profile your code, the compiler adds a small piece of bookkeeping code to each function so that, when the program runs, each function's caller is recorded. In addition, roughly every 10 ms the instruction that the program counter points to is recorded. This data can then be analyzed. To profile code with the GNU C Compiler (gcc), pass the -pg flag when compiling and linking.
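One practical consequence: a program built with -pg writes its profile data to a file named gmon.out in its working directory when it exits, replacing any gmon.out already there. To compare differently configured runs, the file can be renamed after each run. The following is a minimal sketch, assuming a POSIX shell; the function name keep_profile and the gmon.LABEL.out naming scheme are hypothetical.

```shell
# Hypothetical helper: keep the profile data from the run that just finished.
# Usage: keep_profile LABEL   (run it in the directory that holds gmon.out)
keep_profile() {
    label="$1"
    if [ ! -f gmon.out ]; then
        echo "no gmon.out here; was the program built with -pg and run to completion?" >&2
        return 1
    fi
    # Rename so the next run does not overwrite this profile.
    mv gmon.out "gmon.$label.out"
}
```

gprof accepts the renamed file in place of gmon.out, for example: gprof openmp.o gmon.verbose.out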

The following is how we will examine two versions of the fire model:

  1. cd ~/scratch/perf/verbose-fire
  2. vi fire.c
    Let's look at the code to form some expectations
  3. Type :q! to exit Vim
  4. module swap PrgEnv-cray PrgEnv-gnu
  5. export OMP_NUM_THREADS=4
  6. cc -O0 -pg -g -DDEBUG -c -fPIC fire.c -o openmp_fire.o
  7. cc -O0 -pg -g -DDEBUG -I. -L. openmp_fire.o main.c -o openmp.o
  8. Ensure you're on a compute node to run this
  9. If you're not, go to a compute node and type: cd ~/scratch/perf/verbose-fire
  10. aprun -n 1 -d 4 ./openmp.o -p 1 -n 80
  11. gprof --line --flat-profile openmp.o gmon.out
    this takes several seconds
  12. Press 'q' to leave the profile listing.
  13. gprof openmp.o gmon.out
And now, the quiet version:
  1. cd ~/scratch/perf/quiet-fire
  2. module swap PrgEnv-cray PrgEnv-gnu
  3. export OMP_NUM_THREADS=4
  4. cc -O0 -pg -g -DDEBUG -c -fPIC fire.c -o openmp_fire.o
  5. cc -O0 -pg -g -DDEBUG -I. -L. openmp_fire.o main.c -o openmp.o
  6. Ensure you're on a compute node to run this
  7. If you're not, go to a compute node and type: cd ~/scratch/perf/quiet-fire
  8. aprun -n 1 -d 4 ./openmp.o -p 1 -n 80
  9. gprof --line --flat-profile openmp.o gmon.out
    this takes several seconds
  10. Press 'q' to leave the profile listing.
  11. gprof openmp.o gmon.out
Take 10-15 minutes to modify the quiet version of fire.c, perhaps removing the OpenMP pragmas or uncommenting some of the print statements. Then repeat the above steps. What do you expect will be the outcome?
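Because the exercise repeats the same build, run, and profile steps after every change, the sequence can be wrapped in a small shell function. This is a minimal sketch, assuming a POSIX shell with PrgEnv-gnu already loaded; the function name profile_fire and the output file name flat-profile.txt are hypothetical, while the commands inside are the ones from the numbered steps.

```shell
# Hypothetical helper: rebuild, run, and profile the fire model in one directory.
# Assumes PrgEnv-gnu is already loaded and that you are on a node where aprun works.
# Usage: profile_fire ~/scratch/perf/quiet-fire
profile_fire() {
    (
        # The subshell keeps the caller's working directory unchanged.
        cd "$1" || exit 1
        export OMP_NUM_THREADS=4
        # Same compile and link lines as in the numbered steps.
        cc -O0 -pg -g -DDEBUG -c -fPIC fire.c -o openmp_fire.o || exit 1
        cc -O0 -pg -g -DDEBUG -I. -L. openmp_fire.o main.c -o openmp.o || exit 1
        # The run writes gmon.out into the current directory.
        aprun -n 1 -d 4 ./openmp.o -p 1 -n 80 || exit 1
        # Save the flat profile to a file (the name is an assumption) for later comparison.
        gprof --flat-profile openmp.o gmon.out > flat-profile.txt
    )
}
```

Saving each flat profile under a different name before the next change makes it possible to compare the runs side by side with a tool such as diff.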

CrayPat

CrayPat is a data capture tool for Cray systems. It can be used to prepare your program for performance analysis experiments, to specify the kind of data to be captured during program execution, and to prepare the captured data for text reports or for use with other programs.

These are the steps you need to try this out for yourselves. The GNU programming environment doesn't support these Cray tools; by default, your account places you in the Cray environment, but we changed that a few minutes ago.

  1. module swap PrgEnv-gnu PrgEnv-cray
    since we were just in the GNU Programming Environment
  2. module unload darshan; module load perftools
    unload darshan because it can cause conflicts; perftools is needed
  3. cc -o pi_integration_mpi pi_integration_mpi.c
    compile after loading perftools
  4. pat_build -S pi_integration_mpi
    build the instrumented binary from the executable you just compiled (note the "+pat" suffix)
  5. aprun -n 2 ./pi_integration_mpi+pat
    run the instrumented binary rather than the original executable; this generates a data file with a unique name
  6. pat_report pi_integration_mpi+pat+8027-3249s.xf
    generate a report from that data file; your .xf file will have different numbers in its name
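Since the name of the .xf file changes with every run, it can be convenient to look it up rather than type it. The following is a minimal sketch, assuming a POSIX shell and file names without spaces or newlines; the function name latest_xf is hypothetical.

```shell
# Hypothetical helper: print the most recently modified .xf file in a directory.
# Usage: latest_xf [DIR]   (DIR defaults to the current directory)
latest_xf() {
    dir="${1:-.}"
    # ls -t lists newest first; head keeps only the first name.
    # Prints nothing if the directory holds no .xf files.
    ls -t "$dir"/*.xf 2>/dev/null | head -n 1
}
```

After the aprun step has finished, the report step could then be written as: pat_report "$(latest_xf)"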

Resources

  1. Blue Waters' page explaining several tools on the system
  2. To discover the available hardware on your compute node, use PAPI
  3. GProf
  4. Lecture explaining the use of CrayPat (performance guidelines at 1:17:15)