ARSC HPC Users' Newsletter 329, November 18, 2005

Introduction to PMPI Part II

Part I of this series introduced the Profiled MPI (PMPI) interface and gave a trivial example. This article shows how to use PMPI to produce a custom debugging version of a standard MPI routine, and motivates it with a real example.

My need for PMPI arose as I attempted to find an MPI_RECV error in a large user code. The IBM MPI libraries were exiting in a way which made it difficult to locate the error using a debugger.

Here's a sample code which reproduces the particular bug, an invalid rank value in an MPI_RECV call:


iceberg2 1% cat bad-send-recv.f90
PROGRAM MPI_HELLO
    use mpi
    INTEGER ierr
    INTEGER task,total_tasks
    INTEGER stat(MPI_STATUS_SIZE)
    REAL IN(10)
    REAL BUF(10)
    INTEGER II
    
    CALL MPI_INIT(ierr)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD,task,ierr)
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD,total_tasks,ierr)
    
    DO II=1, 10
        IN(II)=task
    END DO
    CALL MPI_BARRIER(MPI_COMM_WORLD, ierr)
    
    IF ( mod(task,2) == 0 ) then
        CALL MPI_SEND(IN, 10, MPI_REAL, task+1, 99, MPI_COMM_WORLD, ierr)
    ELSE
        CALL MPI_RECV(buf, 10, MPI_REAL, task+1, 99, MPI_COMM_WORLD, stat, ierr)
        ! This program has an error in the MPI_RECV call above. 
        !  task+1 should be task-1
    END IF
    
    CALL MPI_FINALIZE(ierr)

END PROGRAM

When this code is run on iceberg, we do get an MPI_RECV error in the stderr output from the run:


iceberg2 2% ./bad-send-recv -hostfile ./hosts -procs 4
   3:ERROR: 0032-101 Invalid source rank  (4) in MPI_Recv, task 3
ERROR: 0031-250  task 3: Terminated
ERROR: 0031-250  task 0: Terminated
ERROR: 0031-250  task 2: Terminated
ERROR: 0031-250  task 1: Terminated

In this case, it's easy to see where the error occurred because there is only one MPI_RECV call. In the real code, however, this information isn't sufficient to locate the bug. The error condition doesn't seem to produce a signal, so it can't easily be trapped by a debugger, either.

Using PMPI we can alter the MPI_RECV call to watch for invalid ranks and to produce a signal if such an error occurs.

Below is a debugging implementation of MPI_RECV which checks that the source rank is within the valid range for the communicator. If the rank is invalid, the routine calls ABORT(), which raises SIGABRT and results in a core dump. Otherwise it passes its arguments straight through to PMPI_RECV, the PMPI entry point for the real receive.


iceberg1 3% cat prof_mpi.f90
subroutine MPI_RECV(buf, count, typ, source, tag, comm, status, err)
    use mpi
    IMPLICIT NONE
    INTEGER, INTENT(OUT) :: buf
    INTEGER, INTENT(IN) :: count, typ, source, tag, comm
    INTEGER, INTENT(OUT) :: status(*)
    INTEGER, INTENT(OUT) :: err

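    ! MPI_ANY_SOURCE and MPI_PROC_NULL are legal sources that lie outside
    ! 0..size-1, so only explicit ranks are bounds-checked.
    IF ( source /= MPI_ANY_SOURCE .AND. source /= MPI_PROC_NULL ) THEN
        CALL BOUNDS_CHECK("MPI_RECV ", source, comm)
    END IF
    ! Hand off to the real receive through its PMPI entry point.
    CALL PMPI_RECV(buf, count, typ, source, tag, comm, status, err)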

end subroutine


subroutine BOUNDS_CHECK(func_name, rank, comm)
!   Aborts if rank is outside the valid range (0 to size-1) for the
!   communicator comm
    IMPLICIT NONE
    CHARACTER(LEN=*), INTENT(IN) :: func_name 
    CHARACTER(1024) :: clean_func
    INTEGER, INTENT(IN):: rank
    INTEGER, INTENT(IN):: comm

    INTEGER :: total_tasks
    INTEGER :: ierr

    CALL PMPI_COMM_SIZE(comm, total_tasks, ierr)
    IF ( ( rank > ( total_tasks - 1 ) ) .OR. ( rank < 0 ) ) then
        PRINT *, TRIM(func_name) , " Error: rank out of bounds; rank=", rank, " total tasks=", total_tasks 
        CALL ABORT()
    endif
end subroutine

Below is a compile statement using the IBM compilers.


iceberg1 4% mpxlf90_r -qsuffix=f=f90 prof_mpi.f90 -c
** mpi_recv   === End of Compilation 1 ===
** bounds_check   === End of Compilation 2 ===
1501-510  Compilation successful for file prof_mpi.f90.

We can then compile the original MPI program (without modifying it!!) and link against the debug version of MPI_RECV in prof_mpi.o. The -g and -qfullpath flags are included to ensure that the core file has useful information.


iceberg2 5% mpxlf90_r -qsuffix=f=f90 -qfullpath -g prof_mpi.o bad-send-recv.f90 -o bad-send-recv
** mpi_hello   === End of Compilation 1 ===
1501-510  Compilation successful for file bad-send-recv.f90.

When the debug version of the executable is run with output labeled by task and IBM's lightweight core files enabled, we quickly track the error down to line 22 of bad-send-recv.f90, and we have a record of the input rank and the task which caused the abort.


iceberg2 6% export MP_LABELIO=yes
iceberg2 7% export MP_COREFILE_FORMAT=STDERR
iceberg2 24% ./bad-send-recv -hostfile ./hosts -procs 4
   3: MPI_RECV Error: rank out of bounds; rank= 4  total tasks= 4
   3:+++PARALLEL TOOLS CONSORTIUM LIGHTWEIGHT COREFILE FORMAT version 1.0
   3:+++LCB 1.0 Thu Nov 10 14:18:24 2005 Generated by IBM AIX 5.2
   3:#
   3:+++ID Node 3 Process 725000 Thread 1
   3:***FAULT "SIGABRT - Abort"
   3:+++STACK
   3:# At location 0xd2d6b188 but procedure information unavailable.
   3:bounds_check : 0x000002b0
   3:mpi_recv : 0x00000054
   3:mpi_hello : 22 # in file </gpfsa/wrkdir/bahls/news_pmpi/pmpi/bad-send-recv.f90>
...
...
...
ERROR: 0031-250  task 3: IOT/Abort trap
ERROR: 0031-250  task 2: Terminated
ERROR: 0031-250  task 0: Terminated
ERROR: 0031-250  task 1: Terminated

Gaussian Multi-Threaded on X1

Gaussian on the X1 is an OpenMP application, compiled in SSP-mode. The default number of OpenMP threads is 1, but you can configure your run scripts to use up to 16.

We encourage you to test your input deck on 4, 8, 12, and 16 threads, and if the speedup curve tapers off, do your production runs on fewer rather than more threads.

For more info, see:

http://www.arsc.edu/support/howtos/usingg03.html

Santa Letters, Postmarked "North Pole"

Hard to believe, but it's that time of year again...

Uncles, Aunts, Parents, Teachers, and other friends of kids: here's an offer from the ARSC HPC Newsletter editors.

The town of North Pole, Alaska is a mere 15 miles from here. If you'd like a letter, postmarked "North Pole," delivered to someone, seal your letter in a stamped, addressed envelope, and instead of mailing it from your local post office, enclose it in a larger envelope and send this to us. On about December 12th we'll mail them from North Pole.

Send to:

Tom Baring and Don Bahls
Arctic Region Supercomputing Center
University of Alaska Fairbanks
P.O. Box 756020
Fairbanks AK 99775-6020

Plan extra time for mail to/from Alaska... If you post these to us on or before Dec. 5th, there should be no problem.

Quick-Tip Q & A



A:[[ I use a calculator for evaluating arithmetic expressions.  It accepts
  [[ an arithmetic expression as stdin and outputs the answer on stdout.
  [[ E.g.,
  [[     cl 1024*1024*1024*15
  [[      = 16106127360
  [[
  [[ My only complaint with this "calculator" is that I have a hard time
  [[ reading long-digit numbers.
  [[ 
  [[ What I would like is a filter that accepts such a number on stdin, and
  [[ writes to stdout the number formatted with commas separating the
  [[ thousands places.  E.g.,
  [[     cl 1024*1024*1024*15 | comma_adder
  [[      = 16,106,127,360



#
# Thanks to Jim Long: 
#

My brain-dead shell script without look-ahead capability:

#!/bin/sh
######################################################################
# add commas to an integer on either stdin or its first argument

if [ -z "$1" ]
then
  read num
  num=`echo $num | sed "s/\([0-9].*\)\([0-9][0-9][0-9]$\)/\1,\2/"`
else
  num=`echo $1 | sed "s/\([0-9].*\)\([0-9][0-9][0-9]$\)/\1,\2/"`
fi

filter=[0-9][0-9][0-9],

while true
do
  num_sav=$num
  num=`echo $num | sed "s/\([0-9].*\)\($filter\)/\1,\2/"`
  filter=[0-9][0-9][0-9],$filter
  if [ "$num" = "$num_sav" ]; then break; fi
done

echo "$num"
######################################################################

To keep it cleaner, I left out the regex for a decimal number.

With look-ahead capability things are even cleaner, if a bit more
cryptic:  A little research on the web
(http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/78097)
shows:

  % ruby -e 'puts "456778904".gsub(/(.)(?=.{3}+$)/, %q(\1,))'
      456,778,904

And from the book "Mastering Regular Expressions" by Jeffrey E.F. Friedl
(pg. 291-2):

  1 while s/^(-?\d+)(\d{3})/$1,$2/;

which can be made faster with a single /g substitution:

  s<(\d{1,3})(?=(?:\d\d\d)+(?!\d))><$1,>g;

Hopefully you guys aren't going to ask about how to remove 
/* c comments... */



# 
# From one of the editors: 
#

#!/usr/bin/perl

while(<STDIN>)
    {
    $tok=$_;
    while ( $tok =~ /\b(\d+)(\.?\d*)\b/g )
        {
        $num=$1;
        $dec=$2;
        while ( $num=~ /(.*\d)(\d\d\d)(\b.*)/ )
            {
            $num="$1,$2$3";
            }
        print "$num$dec ";
        }
    print "\n";
    }
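
For comparison, here is a minimal sketch (not one of the submitted
answers) that moves the loop of the shell script above into sed itself,
using sed's label and branch commands (":a" and "ta") to repeat the
substitution until no run of four or more digits remains. The function
name comma_adder is borrowed from the question. As in the shell script,
decimal numbers are left out: digits after a decimal point would be
grouped as well, so this is only suitable for integers.

```shell
# comma_adder: filter that copies stdin to stdout, inserting commas
# between the thousands places of any integers it finds.
comma_adder() {
  # Each pass finds the last run of four or more digits on the line and
  # puts a comma before its final three digits; "ta" branches back to
  # label "a" for as long as a substitution was made.
  sed -e ':a' -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 'ta'
}

# Usage, with the number from the question:
echo 16106127360 | comma_adder    # prints 16,106,127,360
```

Because it is a pure filter, any text around the number (such as the
calculator's leading "= ") passes through unchanged.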


Q: Here's a challenge for vi and vim experts: vi/vim have yank and put.
   I want a new operation, "replace."

   Here's some text:

     a b xxxx yyyyy c d e f wwwwwwwwww zz vvvvvvvvv g h i 

   The goal, for example, would be to replace "wwwwwwwwww zz vvvvvvvvv"
   with "xxxx yyyyy" as follows:

     1- move vi cursor to first x
     2- yank "xxxx yyyyy" to the vi buffer with the vi command: y2w
     3- move vi cursor to first w
     4- overwrite "wwwwwwwwww zz vvvvvvvvv" with contents of the buffer 

   Can you devise a vi map (or use some other clever system) to perform
   step 4 with one command? If this requires a companion map for step 2,
   using named buffers, perhaps, that would be fine.

   All solutions welcome as long as they don't use the mouse.  Ideally
   the mechanism would be quite general so you could yank an arbitrary
   region and use it to replace any other arbitrary region.

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.