ARSC HPC Users' Newsletter 228, September 14, 2001



Mixed-Mode "Hello World"

This may be the world's most complex hello world....

(Well, I'm sure C++ programmers could do better :-)

Inspired by the talk by Dan Duffy (ERDC) at the ARSC IBM workshop last week, and by the installation at ARSC of a "distributed-shared memory" system (the IBM SP), I decided to write the simplest possible example of a mixed-mode program. This is an MPI program which then uses OpenMP within each MPI process.

Here's the program:

      program prog

      implicit none

      include       'mpif.h'
      integer       ierr
      integer my_pe,npes,iamt
      integer omp_get_thread_num

      call mpi_init( ierr )
      call mpi_comm_rank(MPI_COMM_WORLD, my_pe, ierr)
      call mpi_comm_size(MPI_COMM_WORLD, npes, ierr)

      write(6,*) ' program running on ',my_pe,' of ',npes

! create OpenMP threads

      call omp_set_num_threads(5)

!$omp parallel private(iamt)
      iamt = omp_get_thread_num()
      write(6,*) ' MPI ',my_pe,' omp ',iamt
!$omp end parallel

      call mpi_finalize(ierr)

      end


This starts MPI and determines the number of processes MPI was told to use (from "mpirun -np X"). It then calls omp_set_num_threads to set the desired number of OpenMP threads on EACH MPI process. Finally, each thread identifies itself to the world using its combination of MPI process number (my_pe) and OpenMP thread number (iamt). Note that each MPI process has its own set of threads, numbered 0-4.
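
As a possible next step (a sketch, not part of the original program), the pair (my_pe, iamt) can be folded into a single global worker number. The names nthr and wid, and the numbering convention my_pe*nthr + iamt, are our own assumptions for illustration. The sketch also assumes that every MPI process really gets the same number of threads; omp_set_num_threads is only a request, so a real code should check.

```fortran
      program workers

      implicit none

      include       'mpif.h'
      integer       ierr
      integer my_pe,npes,iamt,nthr,wid
      integer omp_get_thread_num, omp_get_num_threads

      call mpi_init( ierr )
      call mpi_comm_rank(MPI_COMM_WORLD, my_pe, ierr)
      call mpi_comm_size(MPI_COMM_WORLD, npes, ierr)

! request 5 OpenMP threads on each MPI process
      call omp_set_num_threads(5)

!$omp parallel private(iamt,nthr,wid)
      iamt = omp_get_thread_num()
! size of the thread team actually running in this process
      nthr = omp_get_num_threads()
! hypothetical convention: one global worker number per thread
      wid = my_pe*nthr + iamt
      write(6,*) ' worker ',wid,' of ',npes*nthr
!$omp end parallel

      call mpi_finalize(ierr)

      end
```

With 3 MPI processes and 5 threads in each, this would number the workers 0-14. It should compile and run the same way as the program above.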

This is compiled and run as follows.

On SGI shared memory systems:

  sgi> f90 -o prog prog.f -lmpi -mp
  sgi> mpirun -np 3 ./prog
    program running on  2  of  3
    program running on  1  of  3
    program running on  0  of  3
    MPI  2  omp  1
    MPI  2  omp  4
    MPI  2  omp  0
    MPI  2  omp  2
    MPI  2  omp  3
    MPI  1  omp  4
    MPI  1  omp  0
    MPI  1  omp  3
    MPI  1  omp  2
    MPI  1  omp  1
    MPI  0  omp  1
    MPI  0  omp  0
    MPI  0  omp  3
    MPI  0  omp  2
    MPI  0  omp  4

On the IBM SP:

  sp> mpxlf_r -qnosave -qsmp -o prog prog.f

We'll spare you the LoadLeveler script, but here's the result:

icehawk$ cat hello.mixed.ll.1137.0.out
   0:  program running on  0  of  3
   1:  program running on  1  of  3
   2:  program running on  2  of  3
   0:  MPI  0  omp  4
   0:  MPI  0  omp  1
   0:  MPI  0  omp  0
   0:  MPI  0  omp  2
   0:  MPI  0  omp  3
   1:  MPI  1  omp  0
   2:  MPI  2  omp  0
   1:  MPI  1  omp  4
   2:  MPI  2  omp  4
   2:  MPI  2  omp  3
   1:  MPI  1  omp  1
   2:  MPI  2  omp  2
   1:  MPI  1  omp  2
   1:  MPI  1  omp  3
   2:  MPI  2  omp  1

Why the complication of both MPI and OpenMP?

Most current and future MPP systems will have SMP nodes, and thus both distributed and shared memory characteristics.

Although straight MPI makes excellent use of these systems, mixed-mode programming can, in some cases, give better performance, greatly simplify the chore of parallel programming, and allow more processors to be applied to a problem. If you keep your ear to the grapevine (and this newsletter) you'll hear a lot about this. Dan Duffy's tutorial at SC2001 (see link, next article) would be a good place to get a thorough introduction.


Web Sites and Announcements


Use "find" to Help Stay Under Quota

If you're pushing your disk quota (on any Unix system), and you don't know why, the "find" command can locate all your large files quickly and easily. You can then delete, dmput, or move them, as necessary.

The following command will show all files over 500,000 bytes (you can, of course, change that threshold). First, "cd" to the directory from which you want to start the search. For instance, if you're worried about your /allsys quota,

  cd /allsys/$HOME

  find . -size +500000c -print

The "find" command appears in several past Quick-Tips. The "Quick-Tip" index will take you right there:



Quick-Tip Q & A

A:[[ Is RESHAPE broken on every system?  Here's what it's supposed to 
  [[    do, from my documentation:
  [[    "The RESHAPE intrinsic function constructs an array of a 
  [[         specified shape from the elements of a given array."
  [[    Here's my test program:
  [[ !------------------------------------------------------------------
  [[       program shape_change
  [[       implicit none
  [[       real, dimension (30,40) :: X
  [[       X = 1
  [[       print*, "Old shape: ", shape (X)
  [[       X = reshape (X, (/ 20,60 /), (/ 0.0 /) )
  [[       print*, "New shape: ", shape (X)
  [[       end
  [[ !------------------------------------------------------------------
  [[     And here's the result from an SP (Crays and SGIs give exactly 
  [[     the same result):
  [[ ICEHAWK1$ xlf90 shape.f
  [[ ** shape_change   === End of Compilation 1 ===
  [[ 1501-510  Compilation successful for file shape.f.
  [[ ICEHAWK1$ ./a.out      
  [[  Old shape:  30 40
  [[  New shape:  30 40

# Thanks to Evelyn Price, James Long, and Olivier Golinelli.
# Evelyn's brief answer is encapsulated in point #4 of the 
# response, below.  This is the complete response from Olivier:

1] As 20*60 = 30*40, the third argument of the reshape is useless.  As it
   is optional, it is simpler to write:

      X = Reshape (X, (/ 20,60 /))

2] The right-hand side of this line is an array of shape (20,60).

3] The left-hand side is an array of shape (30,40), as declared before.

4] The equal sign is an "array assignment": it does not change the shape of
   the l.h.s. array.  On the contrary, both arrays MUST have the same
   shape.  Thus, this program is not correct and the result is unpredictable.

5] As all the shapes are known at compile time, the compiler should
   detect the problem and abort. For example, the Sun compiler says:

     ERROR: The left and right hand sides of this array syntax assignment 
     must be conformable arrays.

6] The major improvement of Fortran 90 is, in principle, to allow better
   detection by the compiler of "small" errors.  From my point of view, it
   is very regrettable that customers accept software products that
   only approximately work.

7] A correct (but stupid, because of the duplication of the data) program is:

     real :: X(30,40), Y(20,60)
     X = 1
     Y = Reshape (X, Shape(Y))
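
If the duplication in point 7 is a concern, one traditional alternative (a sketch only, not something the respondents suggested) is to pass X to a subroutine that declares its dummy argument with the desired shape. Fortran's sequence association rules allow this for explicit-shape dummy arguments, provided the dummy is no larger than the actual array. The names view_change and show are made up for this illustration.

```fortran
      program view_change
      implicit none
      real, dimension (30,40) :: X
      X = 1
      print*, "Shape in caller: ", shape (X)
! the same 1200 elements are handed over; only the declared shape differs
      call show (X, 20, 60)
      end

      subroutine show (Y, m, n)
      implicit none
      integer m, n
! explicit-shape dummy: m*n must not exceed the size of the actual array
      real, dimension (m,n) :: Y
      print*, "Shape in callee: ", shape (Y)
      end
```

Inside show, the elements of X are seen as a 20 x 60 array in the usual column-major order, and the compiler normally passes the whole array without making a copy.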

# Here is James' reply:

A: Reshape works fine:

        program shape_change
        implicit none
        real, dimension (3,4) :: X
        X = 1
        print*, "Old shape: ", shape (X)
        X = reshape (X, (/ 2,6 /), (/ 0.0 /) )
        print*, "New shape: ", shape (reshape (X, (/ 2,6 /), (/ 0.0 /) ))
        end

  sgi> f90 blah.f90
  sgi> ./a.out
   Old shape:  3,  4
   New shape:  2,  6

The assignment of the reshape result to X should be caught by the
compiler, since the two sides are non-conformable. Interestingly, it is
caught (on the SGIs and Crays, but not the IBM) if the padding option is
omitted:

        program shape_change
        implicit none
        real, dimension (3,4) :: X
        X = 1
        print*, "Old shape: ", shape (X)
        X = reshape (X, (/ 2,6 /) )
        print*, "New shape: ", shape (reshape (X, (/ 2,6 /), (/ 0.0 /) ))
        end

  sgi> f90 blah.f90
        X = reshape (X, (/ 2,6 /) )
  f90-253 f90: ERROR SHAPE_CHANGE, File = blah.f90, Line = 7, Column = 9 
    The left and right hand sides of this array syntax assignment must be
  conformable arrays.
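
Both replies mention the optional third argument (the pad) without showing a case where it matters. Here is a minimal sketch, with the program and variable names (pad_demo, A, B) invented for illustration: when the requested shape needs more elements than the source supplies, RESHAPE takes the remainder from the pad array.

```fortran
      program pad_demo
      implicit none
      real, dimension (6)   :: A
      real, dimension (2,4) :: B
      integer i
      A = (/ (real(i), i = 1, 6) /)
! B needs 8 elements but A has only 6, so the last 2 come from the pad
      B = reshape (A, (/ 2,4 /), (/ 0.0 /) )
      print*, "Shape: ", shape (B)
      print*, "Last column: ", B(:,4)
      end
```

The first three columns of B hold the values 1 through 6, so the last column should print as two zeros. Without the pad, this RESHAPE would not be valid, because the source would be too small for the requested shape.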

Q: I use C, C++, and Fortran 90/95. I need to generate file names
   which must be unique, to be used for temporary, or scratch files.
   Any suggestions on how to do this?

[[ Answers, Questions, and Tips Graciously Accepted ]]

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.