ARSC T3D Users' Newsletter 93, June 28, 1996

EPCC/MPI v1.5a Installed in Default Location

Our brief testing period for EPCC/MPI v1.5a has ended. We have installed it, replacing v1.4a. For EPCC's list of fixed bugs, enhancements, and known problems, please refer to last week's newsletter. For a workaround to the send/recv latency problem noted in that newsletter, refer to the next article.

EPCC/MPI v1.5a consists of the following files:

  /mpp/lib/libmpi.a
  /usr/include/mpp/mpi.h
  /usr/include/mpp/mpif.h

The directory "/mpp/lib" is in the default library search path of mppldr. You will have to tell the compiler where to find the header files, however, using:

  -I/usr/include/mpp
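
For example, a complete compile line for a hypothetical program "myprog.f" might look like the following (modeled on the test command shown later in this issue; since mppldr already searches /mpp/lib, no -L option should be needed):

  denali$ TARGET=cray-t3d /mpp/bin/f90 -dp -X4 myprog.f    \
    -I/usr/include/mpp -lmpi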

Workaround: Transferring Short-Word Sized Data in EPCC/MPI v1.5a

As noted last week, using f90 and EPCC/MPI v1.5a, sends and receives of an odd number of elements from a REAL*4 array incur huge latencies. As EPCC technical support described it, an alignment problem caused unassigned data to be copied into the receive buffer. So the transfers were worse than slow -- they contained errors.

For clarification:


  ccc
         REAL   test (N)
         REAL*8 test8(N)
         REAL*4 test4(N)

         CALL MPI_SEND ( test (index), nElements, MPI_REAL , ... ) ! Okay
         CALL MPI_SEND ( test8(index), nElements, MPI_REAL8, ... ) ! Okay

         ! ERROR if "index" or "nElements" can be odd. 
         CALL MPI_SEND ( test4(index), nElements, MPI_REAL4, ... )
  ccc

The same problem apparently exists for other short types, and for short types in C. The first workaround, if you are not depending on short types (for memory efficiency, for instance), is simply to stick with the default REAL and MPI_REAL types.
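
If memory constraints force you to keep REAL*4 storage, another possibility (a sketch of ours, not an EPCC recommendation) is to stage transfers through a default REAL scratch buffer. This doubles the bytes moved, but both ends of the transfer then use the long-word types that work correctly. The name "scratch" is illustrative:

  ccc
         REAL*4 test4(N)
         REAL   scratch(N)

         ! Sender: widen the REAL*4 data to default (long-word) REAL,
         ! then transfer it with the safe MPI_REAL type.
         scratch(1:nElements) = test4(index:index+nElements-1)
         CALL MPI_SEND ( scratch, nElements, MPI_REAL, ... )

         ! Receiver: receive as MPI_REAL, then narrow back to REAL*4.
         CALL MPI_RECV ( scratch, nElements, MPI_REAL, ... )
         test4(index:index+nElements-1) = scratch(1:nElements)
  ccc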

If you want to use REAL*4 and MPI_REAL4, however, here is the workaround, from EPCC:


  > 
  > You can disable the medium-sized message transfer optimisation by
  > setting the value of environmental variable MPI_SM_TRANSFER to zero.
  > 
  >     export MPI_SM_TRANSFER=0  [or]  setenv MPI_SM_TRANSFER 0
  > 
  > This problem should only appear for short-word sized data
  > (MPI_INTEGER{1,2,4} and MPI_REAL4 in Fortran 90; MPI_SHORT and
  > MPI_FLOAT in C) transferred to/from a base address that is not
  > long-word aligned.
  > 
I tested this workaround using "ring.f" given in last week's Newsletter, as follows:

  denali$ TARGET=cray-t3d /mpp/bin/f90 -dp -X4 ring.f        \
    -I/usr/include/mpp -L/mpp/lib -lmpi

  denali$ MPI_SM_TRANSFER=0 a.out -npes 4

Here are some observations:
  1. It works as expected: transfers of odd-sized buffers are comparable to those of nearly equivalent even-sized buffers.
  2. MPI_SM_TRANSFER seems to be used at run-time only. You don't need to recompile your codes with this variable set, but you do need to set it before running.
  3. When you set MPI_SM_TRANSFER to 0, transfer rates may be slowed down regardless of the data type being transferred. (Actually, I only tested REAL*4 and REAL*8 -- but both suffered slow-downs.) Thus, it's probably a bad idea to set MPI_SM_TRANSFER=0 via your .kshrc or .cshrc file, and leave it set.

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.