ARSC HPC Users' Newsletter 303, November 05, 2004

Using Totalview with LoadLeveler on iceberg

The totalview debugger is available on both IBM systems at ARSC. This article focuses on using totalview with LoadLeveler on iceberg, ARSC's p655+/p690+ cluster. The use on iceflyer is slightly different and is described in a news item on that system ("news totalview").

The front-end nodes on iceberg are suitable for debugging serial and small parallel jobs; however, it is sometimes necessary to debug on a larger scale. The "debug" queue is available on iceberg for this purpose, allowing jobs with up to 32 MPI tasks to be run. To use it, you run totalview through an interactive LoadLeveler job.

Here's how:

I) Connect to iceberg through an X Window System interface (e.g., using ssh from an xterm). The totalview GUI uses X, although "totalviewcli" (for "command line interface") is also supported.

II) Compile your application with debugging information. The -g compiler flag will add this information. For example:

  iceberg1 1% mpxlf90_r -g mpi-hello.f -o mpi_hello
  ** mpi_hello   === End of Compilation 1 ===
  1501-510  Compilation successful for file mpi-hello.f.

III) Next, create an appropriate LoadLeveler script, specifying the "debug" class. Here are examples for both MPI and OpenMP applications.

MPI Example: 16 tasks using 2 nodes:

  iceberg1 2% cat mpi.totalview.ll
  # @ job_type         = parallel
  # @ node             = 2
  # @ tasks_per_node   = 8
  # @ network.MPI      = sn_single,shared,us
  # @ class            = debug
  # @ wall_clock_limit = 0:30:00
  # @ queue

  #Note: no executable needs to be specified
  #      in this LoadLeveler script.

OpenMP Example: 8 threads:

  iceberg1 3% cat omp.totalview.ll
  # @ job_type         = parallel
  # @ node             = 1
  # @ class            = debug
  # @ wall_clock_limit = 0:30:00
  # @ queue

  #Note: no executable needs to be specified

These LoadLeveler scripts are a little unusual in that they only request resources, and don't actually run executables. The executable is named on the poe command line in the next step.
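A resource-only script like the ones above can also be generated rather than edited by hand. The following is a minimal sketch, assuming a POSIX shell; the function name make_debug_ll is hypothetical, and the limits it enforces (4 nodes, 8 tasks per node) are an assumption derived from the debug class limits of 4 nodes and 32 processors.

```shell
#!/bin/sh
# Hypothetical helper: print a resource-only LoadLeveler script for the
# "debug" class. The keyword values mirror the MPI example above.
make_debug_ll() {
    nodes=$1
    tasks_per_node=$2
    # Assumed limits: the debug class offers 4 nodes with 8 processors each.
    if [ "$nodes" -lt 1 ] || [ "$nodes" -gt 4 ]; then
        echo "make_debug_ll: node count must be between 1 and 4" >&2
        return 1
    fi
    if [ "$tasks_per_node" -lt 1 ] || [ "$tasks_per_node" -gt 8 ]; then
        echo "make_debug_ll: tasks per node must be between 1 and 8" >&2
        return 1
    fi
    cat <<EOF
# @ job_type         = parallel
# @ node             = $nodes
# @ tasks_per_node   = $tasks_per_node
# @ network.MPI      = sn_single,shared,us
# @ class            = debug
# @ wall_clock_limit = 0:30:00
# @ queue
EOF
}

# Example: print a 2-node, 16-task request (redirect it to a file to keep it).
make_debug_ll 2 8
```

Redirecting the output, for instance to mpi.totalview.ll, gives a file that can be handed to poe.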

IV) Finally, start totalview. Totalview runs poe, which launches the LoadLeveler script and runs the executable. The totalview option "-a" specifies that the subsequent string is passed to the executable (which, in this case, is poe). The name of the LoadLeveler file can be passed to poe using either the flag -llfile or the environment variable MP_LLFILE.

Using the poe flag -llfile:

  iceberg1 4% totalview poe -a ./mpi_hello -llfile /wrkdir/username/mpi.totalview.ll

Using the poe environment variable MP_LLFILE:

  iceberg1%  export MP_LLFILE=/wrkdir/username/mpi.totalview.ll
  iceberg1%  totalview poe -a ./mpi_hello

The usage is essentially identical for OpenMP jobs, for instance:

  iceberg1%  export OMP_NUM_THREADS=8
  iceberg1%  totalview poe -a ./omp_hello -llfile /wrkdir/username/omp.totalview.ll
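
If you start totalview this way often, the command line can be assembled by a small wrapper. This is only a sketch, assuming a POSIX shell: the function name tv_debug and the DRY_RUN variable are hypothetical, and by default the wrapper merely prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper: assemble the "totalview poe -a ..." command line.
# With DRY_RUN=1 (the default here) the command is only printed.
tv_debug() {
    if [ "$#" -lt 2 ]; then
        echo "usage: tv_debug llfile executable [args...]" >&2
        return 1
    fi
    llfile=$1
    exe=$2
    shift 2
    # Everything after -a goes to poe; poe keeps -llfile for itself and
    # hands the remaining arguments to the executable.
    set -- totalview poe -a "$exe" -llfile "$llfile" "$@"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Prints the command that would be run for the MPI example.
tv_debug /wrkdir/username/mpi.totalview.ll ./mpi_hello arg1 arg2
```

Setting DRY_RUN=0 makes the wrapper execute the command instead of printing it.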

A few things to note:

  1. Any environment variable settings made in the LoadLeveler script will be ignored. If your executable uses environment variables, set them either on the command line or through the totalview menu: Process | Startup Parameters | Environment.

  2. Command line arguments to your program can be passed either on the totalview command line or through the totalview menu: Process | Startup Parameters | Arguments. Totalview uses the flag -a to pass any subsequent arguments to the executable rather than to totalview itself. The poe executable will in turn pass any arguments it doesn't understand along to your executable.

    In this example:

        totalview poe -a ./mpi_hello -llfile ./mpi.totalview.ll arg1 arg2 arg3

    totalview will pass the arguments

        ./mpi_hello -llfile ./mpi.totalview.ll arg1 arg2 arg3

    to poe, and poe in turn will pass the arguments

        arg1 arg2 arg3

    to mpi_hello.

  3. Interactive LoadLeveler jobs do not wait in the queue for the requested resources to become available. If there are insufficient resources, your job will immediately fail, and display an error.

  4. The debug nodes on iceberg have specific Unix services enabled which allow totalview to run properly. These are the only iceberg nodes on which these services are available.

  5. You may use "llmap" to keep an eye on the debug nodes. Currently these are b7n1, b7n2, b7n3, and b7n4.

  6. Your LoadLeveler job must use the debug class. It's the only class which runs jobs on the debug nodes. Jobs in this class are limited to the 4 debug nodes (32 processors) and 30 minutes.
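
Because an interactive job fails immediately when something is wrong, it can be worth checking a couple of the points above before starting totalview. The sketch below is one possible pre-flight check, assuming a POSIX shell; the function name tv_preflight is hypothetical, and it checks only that an X display is set and that the LoadLeveler file requests the debug class.

```shell
#!/bin/sh
# Hypothetical pre-flight check for an interactive totalview session.
tv_preflight() {
    llfile=$1
    status=0
    # The totalview GUI needs an X display.
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; connect through X or use totalviewcli"
        status=1
    fi
    # The LoadLeveler file must be readable and request the debug class.
    if [ ! -r "$llfile" ]; then
        echo "cannot read LoadLeveler file: $llfile"
        status=1
    elif ! grep -Eq '^#[[:space:]]*@[[:space:]]*class[[:space:]]*=[[:space:]]*debug[[:space:]]*$' "$llfile"; then
        echo "$llfile does not request class = debug"
        status=1
    fi
    return $status
}

# Report problems, if any, for the MPI example file.
if tv_preflight ./mpi.totalview.ll; then
    echo "ready to start totalview"
fi
```

The check does not query LoadLeveler, so it cannot tell whether the debug nodes are currently free; llmap remains the way to see that.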

X1: Accessing Open Source Core Unix Utilities

A nice feature of the Cray user environment is the "module" command. "Modules" allow multiple versions of utilities, compilers, libraries, and other tools to co-exist, and give each user a trivial method for independently selecting an individual combination. ("Modules," by the way, is open source, available through SourceForge.)

Among the classes of software supported through "modules" are the "core utilities," such as "ls" and "cp." Two versions of the core utilities are available: Cray native and open source. Thus, you can use GNU ls, and get nifty options, like the ability to specify your own block size, e.g.:

    ls --block-size=1000000 *.dat

or you can stick with the native "ls" which, of course, provides all the standard options.

As announced in news/MOTD on klondike, Cray's organization of open source utilities was recently changed. Core utilities are now split into a separate module, rather than being lumped in with the higher-level open source routines. Here are the two modules you need to know about:

  1. The module "coreutils" contains the open source version of about 100 basic Unix commands, such as:

          cat       chmod
          head      tail
          ls        sort

    "coreutils" is not loaded into your environment by default. To load it, execute the command:

        module load coreutils

    To unload it:

        module unload coreutils

    Unless "coreutils" is loaded, you'll get the standard native Cray implementation, which will work just fine for almost everything.

  2. The module "open" contains about 150 open source Unix utilities, such as:

          python    make
          gzip      perl
          expect    zcat

    It is loaded by default (unless you have modified your originally provided .login or .profile file).

To see what you're using, type the command:

module list

Note that both "open" and "coreutils" prepend the paths to their utilities to your PATH environment variable. Thus, for ALL utilities which have both open source and native versions (like "make," "diff," etc...), you'll get the open source version.
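
The effect of prepending to PATH can be seen without the module system at all. The following sketch, assuming a POSIX shell with mktemp available, uses two throwaway directories that each provide a command named "hello"; the directory names and the command are invented purely for illustration.

```shell
#!/bin/sh
# Illustration only: two directories each provide a command named "hello".
demo=$(mktemp -d)
mkdir "$demo/native" "$demo/opensrc"
printf '#!/bin/sh\necho native\n'      > "$demo/native/hello"
printf '#!/bin/sh\necho open source\n' > "$demo/opensrc/hello"
chmod +x "$demo/native/hello" "$demo/opensrc/hello"

# Start with only the "native" directory ahead of the system PATH.
PATH="$demo/native:$PATH"
first=$(hello)

# Prepending a second directory, as loading a module does, makes it win.
PATH="$demo/opensrc:$PATH"
second=$(hello)

echo "before: $first, after: $second"
rm -rf "$demo"
```

The shell searches PATH from left to right, so whichever directory was prepended last supplies the command.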

If you want only one specific utility from the open source modules, and not everything, then don't load "open" or "coreutils." Instead, specify the complete path to the specific utility. For instance, if you wanted the open source versions of make and ls, you'd execute them as "/opt/open/open/bin/make" and "/opt/open/open/coreutils/bin/ls".

(If this is confusing, don't hesitate to contact ARSC Consulting for help.)

Here's an example session using these modules... Note items 13 and 14 in the output from "module list":

  %     module list
  Currently Loaded Modulefiles:
    1) modules     5) CC          9) cal        13) open
    2) PrgEnv      6) mpt        10) totalview  14) coreutils
    3) craylibs    7) cftn       11) X11
    4) libsci      8) craytools  12) pbs

  %     man ls                        ### man page shows "coreutils"
       LS(1)      ls (coreutils) 5.0.90 (July 2003)      LS(1)
            ls - list directory contents

  %     ls -l a.out driver            ### Use "ls" from coreutils
  -rwx------    1 username   mygrp     6564000 Nov  4 16:43 a.out
  -rwx------    1 username   mygrp     6568240 Nov  4 16:54 driver
  %     ls --block-size=1000000 -l a.out driver
  -rwx------    1 username   mygrp           7 Nov  4 16:43 a.out
  -rwx------    1 username   mygrp           7 Nov  4 16:54 driver

  %     module unload coreutils       ### Switch to native core utilities
  %     module list
  Currently Loaded Modulefiles:
    1) modules     5) CC          9) cal        13) open
    2) PrgEnv      6) mpt        10) totalview
    3) craylibs    7) cftn       11) X11
    4) libsci      8) craytools  12) pbs

  %     man ls                        ### man page doesn't show "coreutils"
  ls(1)                                   Last changed: 09-10-2003
       ls -- Lists contents of directory

  ### Native version doesn't understand "--block-size" option
  %     ls --block-size=1000000 -l a.out driver
  UX:ls: ERROR: Illegal option -- k
  UX:ls: ERROR: Illegal option -- z
  UX:ls: ERROR: Illegal option -- e
  UX:ls: ERROR: Illegal option -- =
  UX:ls: ERROR: Illegal option -- 0
  UX:ls: ERROR: Illegal option -- 0
  UX:ls: ERROR: Illegal option -- 0
  UX:ls: ERROR: Illegal option -- 0
  UX:ls: ERROR: Illegal option -- 0
  UX:ls: ERROR: Illegal option -- 0
  UX:ls: TO FIX: Usage: ls -RadC1xmnloghrtucpFbqisfLAHMDPS [files]

  ### Correct usage of native "ls"
  %     ls -l a.out driver
  -rwx------    1 username   mygrp     6564000 Nov  4 16:43 a.out
  -rwx------    1 username   mygrp     6568240 Nov  4 16:54 driver

Fall Training, 16 Nov: IDL

The next ARSC course in our Fall Training series:

  Title:       Using IDL to Visualize Scientific Data
  Instructor:  Sergei Maurits, ARSC
  Date:        November 16
  Location:    WRRB 009

Complete schedule:

Loupe Performance of the Cray X1: Part III

[ Thanks to Lee Higbie for another chapter, the conclusion. ]

Sometimes no action is the best action. -- Kershner's Law

But not when you need an answer, sooner than anyone believes possible.

"Well, Bulwer," Snoopy's boss asked him, using that name he hated, "what have you got for me on those loops?" Snoopy groaned and looked pained, and his boss rubbed it in. "Do you prefer Bulwer or Bullwer?"

"Just call me 'Snoopy,'" he requested darkly. He thought about storming out of the confrontation, but decided to answer. "We know that it's a delicate matter to predict Klondike's performance from the compiler's loopmark listing of vectorization and multistreaming. It's like trying to use traffic lights to predict speed. Without green lights, you know it'll be slow, but green lights don't make cars move fast." Snoopy was inspired by coincidence, the sort of coincidence that detectives seek. "Do you realize that the hotter the weather in Melbourne and Sydney, the slower the traffic in Fairbanks? It's a fact. Check it out. I've got the data if you really doubt me."

Snoopy got one of those stares, not the kindly one that says, "What in the Seymour Hill are you talking about?" but the one that says, "I think it may be time to have you committed." He decided he needed reinforcements and explained that ARSC provides consultants and Vector Specialists. They understand performance analysis tools, and can even spell PAT. They know to look at program characteristics that affect memory usage in general and cache efficiency in particular, and if they get stuck... they have friends on the inside, at Cray.

To his list of Klondike quickening criteria from the last two newsletters, Snoopy added some more:

  • Structure code so that the innermost loops vary subscripts most rapidly.
    • Fortran innermost loops iterate on the first subscript
    • C innermost loops iterate on the last subscript
  • Avoid dimensioning arrays with multiples of 16. Just changing the first dimension of a Fortran array from 128 to 129 can help speed up programs, he found.
  • Sometimes simple changes of compilation options or environment variables will increase speed substantially.

"The real key to a complex case like this," Snoopy said, "is to know the tools in your toolbox, and take advantage of the best geekologists you can afford." Snoopy pulled down his USMC helmet a little and left, confident that the case was closed.

See you at SC04!!

SC04, the major conference for those in high-performance computing, is next week (Nov. 8-12), in Pittsburgh.

Be sure to visit the ARSC booth...

ARSC staff and management will be available to tell you ARSC's story, hear yours, and look for ways we can collaborate.

Don Bahls, the new co-editor of this newsletter, will be there if you'd like to hassle him, or, better yet, offer ideas for Quick-Tips, articles, or anything else.

Quick-Tip Q & A

A:[[ Almost every program I compile these days requires some
  [[ pre-processing.
  [[ The IBM XLF compiler runs the pre-processor if the source file has
  [[ the suffix ".F".  For files with other typical Fortran suffixes, is
  [[ there any way, other than renaming the file, to get XLF to run the
  [[ pre-processor?

# Many thanks to Ed Anderson: 

The way to tell the IBM Fortran compiler to look for suffixes other than
the defaults (.f for Fortran compilation only, .F for the
preprocessor+Fortran) is to use the compiler option -qsuffix.
The options to -qsuffix are

  f=suffix    : New source file suffix
  o=suffix    : New object file suffix
  s=suffix    : New assembler source file suffix
  cpp=suffix  : New preprocessor source file suffix

You can specify more than one of these settings by separating them with
colons.  The manual gives an example of redefining the source file
suffix to .f90 and the preprocessor source file suffix to .F90:

   xlf -qsuffix=f=f90:cpp=F90 -c foo.F90
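
If the suffix settings vary from project to project, the option string can be put together by a small helper. This is a sketch only, assuming a POSIX shell; the function name qsuffix_opt is hypothetical, and it does nothing more than validate the four setting names listed above and join them with colons.

```shell
#!/bin/sh
# Hypothetical helper: join suffix settings such as "f=f90" and "cpp=F90"
# into a single -qsuffix option, separated by colons.
qsuffix_opt() {
    opt=""
    for setting in "$@"; do
        case $setting in
            f=?*|o=?*|s=?*|cpp=?*) ;;
            *) echo "qsuffix_opt: unknown setting: $setting" >&2
               return 1 ;;
        esac
        # Append with a colon only when something is already there.
        opt="${opt:+$opt:}$setting"
    done
    [ -n "$opt" ] && echo "-qsuffix=$opt"
}

# Prints the option used in the example above.
qsuffix_opt f=f90 cpp=F90
```

The result can be placed on the compiler command line with command substitution, for example xlf $(qsuffix_opt f=f90 cpp=F90) -c foo.F90.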

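If you're compiling using make, not on the command line, you will also
have to tell make about the new suffixes and provide rules for them, for
example:

# Compiler name taken from the xlf example above
FORTRAN = xlf
FFLAGS  = -qsuffix=f=f90:cpp=F90

.SUFFIXES: .f90 .F90

libfoo.a: a.o b.o
	ar cr libfoo.a a.o b.o

# Suffix rules: build a .o from a .f90 or a .F90 source file
.f90.o:
	${FORTRAN} ${FFLAGS} -c $<

.F90.o:
	${FORTRAN} ${FFLAGS} -c $<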

Q: I'm looking for an easy, automatic way of scanning through a list of
   files to change all lines like this:
     #include "some_string"
   to this:
     #include <some_string>

[[ Answers, Questions, and Tips Graciously Accepted ]]

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.