ARSC T3E Users' Newsletter 124, August 22, 1997

MPI Communicators

Most people accept life's minor annoyances without too much worry:


  + We dial "9" for an outside line.
  + We submit IRS form 8606.
  + We go on and off "Daylight Saving" time year after year--even 
    in Fairbanks, "Land Of The Noon Sunrise."
  + And we type "MPI_COMM_WORLD" over and over in every MPI
    program we ever write.

While the editors of the T3E Newsletter are as blindly obedient to the first three "rules" on this list as anyone, perhaps we can offer a bit of satisfaction on the fourth.

In MPI, a communicator specifies a communication domain: a set of processes which take part in a communication event. MPI_COMM_WORLD, the default MPI communicator, specifies all processes, and it is the only communicator many MPI programs will ever need. Sometimes, however, it is a significant aid to program clarity to define a communicator over a subset of the available processes. You shouldn't invent communicators on principle, as you might go bungee jumping, but do keep them in your toolkit (just in case).

An Example Program:

The program below is an example of how to create and use user-defined communicators. It stems from the needs of an ARSC user who has a simple Monte Carlo computation but works with large datasets that approach the maximum memory available on a single processor.

The approach taken was to create two master processes, one for input and the other for output, so that reads and writes between the filesystem and the worker processes proceed separately and independently. If only a single communicator were used, numerous synchronisation events would be needed between the two masters. Defining a separate communicator for each master plus the workers greatly simplifies the coding.

The code creates two communicators, called WG1 and WG2, starting from all processes as defined in MPI_COMM_WORLD. Each communicator contains all the worker processes plus one of the two masters, minproc (the input master) and moutproc (the output master).

In the subsequent logic, we had to ensure that all processes in each of the two sets followed the same sequence of calls. The code simply does not call MPI_BARRIER on a communicator from any process outside that communicator's set.


cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      program prog_communicator

      implicit none

      include 'mpif.h'

! max num of processes
      integer, parameter:: max_pes=128

! information for local and global conditions.
      integer my_pe, totpes
! general mpi error flag
      integer ierr

! which processes are in charge (the two masters)
      integer minproc,moutproc

! RANKS
      integer RANKS(max_pes)
! groups and new communicators
      integer  world_group, gin, WG1, gout, WG2

! sizes and RANKS
      integer WG1SIZE, WG1RANK, WG2SIZE, WG2RANK

! loop counters over processes
      integer ip,ips


! setup mpi
      call MPI_INIT(ierr)

! find my identity
      call MPI_COMM_RANK(MPI_COMM_WORLD, my_pe, ierr)
! is there a universe out there?
      call MPI_COMM_SIZE(MPI_COMM_WORLD, totpes, ierr)

      write(6,*) ' my_pe is ',my_pe,' of total ',totpes


! set master/controller process for activity below
      minproc=0
      moutproc=totpes-1


      write(6,*) '  masters are ',minproc,moutproc

! set table to build group
      ips=2
      do ip=0, totpes-1
        if(ip.ne.minproc.and.ip.ne.moutproc) then
          RANKS(ips)=ip
          ips=ips+1
        endif
      enddo
! set first processor in new group to be desired master.
      RANKS(1)=minproc

      write(6,*) ' on ',my_pe,' rank ',RANKS(1:totpes-1)
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)

      call MPI_COMM_GROUP(MPI_COMM_WORLD, world_group, ierr)

      call MPI_GROUP_INCL(world_group, totpes-1, RANKS, gin,
     @      ierr)

      call MPI_COMM_CREATE(MPI_COMM_WORLD, gin, WG1,
     @      ierr)

      call MPI_GROUP_FREE(gin, ierr)

! maintain same worker order but change master.
      RANKS(1)=moutproc

      write(6,*) ' on ',my_pe,' rank2 ',RANKS(1:totpes-1)
      call MPI_GROUP_INCL(world_group, totpes-1, RANKS, gout,
     @      ierr)

      call MPI_COMM_CREATE(MPI_COMM_WORLD, gout, WG2,
     @      ierr)

      call MPI_GROUP_FREE(gout, ierr)
      call MPI_GROUP_FREE(world_group, ierr)



!! enquire about new communicators

      WG1SIZE=-1; WG1RANK=-1; WG2SIZE=-1; WG2RANK=-1

      if(my_pe.ne.moutproc) then
        call MPI_COMM_SIZE(WG1, WG1SIZE, ierr)
        call MPI_COMM_RANK(WG1, WG1RANK, ierr)
      endif

      if(my_pe.ne.minproc) then
        call MPI_COMM_SIZE(WG2, WG2SIZE, ierr)
        call MPI_COMM_RANK(WG2, WG2RANK, ierr)
      endif

      write(6,*) ' sizes are ',my_pe,totpes,' new WG1 ',
     @      WG1RANK,WG1SIZE,' new WG2 ',WG2RANK,WG2SIZE

!! call global barrier
      call MPI_BARRIER(MPI_COMM_WORLD,ierr)
      write(6,*) ' on ',my_pe,' of ',totpes,' barrier '

! only sync on WG1
      if(my_pe.ne.moutproc) then
        call MPI_BARRIER(WG1,ierr)
        write(6,*) ' on ',my_pe,' of ',totpes,' barrier WG1 '
      endif

! only sync on WG2
      if(my_pe.ne.minproc) then
        call MPI_BARRIER(WG2,ierr)
        write(6,*) ' on ',my_pe,' of ',totpes,' barrier WG2 '
      endif

      call MPI_FINALIZE(ierr)

      end
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc

First, a list of all processes to be in the communicator is created in RANKS. Note that the processes listed in RANKS are renumbered in the new group from 0 upward, in the order listed; this is why the first entry in RANKS is the master process for each set: in both groups, the master becomes process 0. The workers are entered in RANKS in the same order for both groups, so that worker N of WG1 is also worker N of WG2.

Next, a call to MPI_COMM_GROUP extracts the process group from the MPI_COMM_WORLD communicator and identifies it as WORLD_GROUP. A call to MPI_GROUP_INCL takes the WORLD_GROUP process group, includes the TOTPES-1 processes listed in RANKS, and identifies the result as GIN. (There is a similar routine, MPI_GROUP_EXCL, which instead excludes the processes listed in RANKS, along with a number of other routines for processing and manipulating process groups. Users are directed to the MPI books listed in the ARSC reading list in edition 121 for a complete list.) The communicator WG1 is then created by a call to MPI_COMM_CREATE, and the process group GIN is released by a call to MPI_GROUP_FREE.
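
For comparison, here is a minimal sketch (ours, not part of the program above) using MPI_GROUP_EXCL to build the same group by leaving out the single unwanted process. Note that the remaining processes keep their MPI_COMM_WORLD order, so minproc becomes rank 0 of the new group only because it happens to be process 0 here:

! Sketch only: build GIN by excluding moutproc rather than
! listing everyone else.  XRANKS is our own illustrative name.
      integer XRANKS(1)

      XRANKS(1)=moutproc
      call MPI_GROUP_EXCL(world_group, 1, XRANKS, gin, ierr)
      call MPI_COMM_CREATE(MPI_COMM_WORLD, gin, WG1, ierr)
      call MPI_GROUP_FREE(gin, ierr)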

The same steps are then repeated to create the second communicator. Once work with WORLD_GROUP is complete, it too is freed by a call to MPI_GROUP_FREE.

To prove all of this is correct, the code calls MPI_COMM_SIZE and MPI_COMM_RANK for each communicator. The output from a typical 4-process run is below; the -1 entries are the sentinel values set before the enquiries, showing that the process is not a member of that communicator:


       MPI_COMM_WORLD         WG1             WG2
  sizes are  0,  4  new WG1  0,  3  new WG2 -1, -1
  sizes are  1,  4  new WG1  1,  3  new WG2  1,  3
  sizes are  2,  4  new WG1  2,  3  new WG2  2,  3
  sizes are  3,  4  new WG1 -1, -1  new WG2  0,  3

Note that the list RANKS is created in such a way that any process can be the input master or the output master. There are a number of enquiry functions which allow users to determine the relationship between groups of processes in terms of rank, and also to determine whether two groups or communicators are identical.
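
For instance, here is a sketch (ours, with illustrative variable names) of two such enquiry calls as they might appear on a worker process, i.e., a member of both WG1 and WG2:

      integer result, g1, g2, r1(1), r2(1)

! MPI_COMM_COMPARE reports whether two communicators are
! identical (MPI_IDENT), congruent (MPI_CONGRUENT), similar
! (MPI_SIMILAR), or unequal (MPI_UNEQUAL).
      call MPI_COMM_COMPARE(WG1, WG2, result, ierr)

! MPI_GROUP_TRANSLATE_RANKS maps ranks in one group to the
! corresponding ranks in another; a process absent from the
! second group comes back as MPI_UNDEFINED.  Here we ask what
! WG1's process 0 (the input master) is called in WG2.
      call MPI_COMM_GROUP(WG1, g1, ierr)
      call MPI_COMM_GROUP(WG2, g2, ierr)
      r1(1)=0
      call MPI_GROUP_TRANSLATE_RANKS(g1, 1, r1, g2, r2, ierr)
      call MPI_GROUP_FREE(g1, ierr)
      call MPI_GROUP_FREE(g2, ierr)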

Other Uses for Communicators:

Communicators can also be used to separate different kinds of communication. For example, a library can ensure that it does not interact with existing code by performing all of its communications on its own communicator, so that they are hidden from the remainder of the application.
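
The usual technique is for the library's initialization routine to duplicate whatever communicator the caller hands it. Here is a minimal sketch, with a hypothetical routine name mylib_init:

      subroutine mylib_init(user_comm, lib_comm)
      implicit none
      include 'mpif.h'
! mylib_init is a hypothetical library routine, for
! illustration only.
      integer user_comm, lib_comm, ierr

! MPI_COMM_DUP creates a new communicator with the same group
! as user_comm but a distinct context, so messages sent on
! lib_comm can never be matched by receives on user_comm.
      call MPI_COMM_DUP(user_comm, lib_comm, ierr)

      end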

OCCAM Comparison:

Some readers may remember OCCAM and its use of channels for communication between processes. Communicators are similar to OCCAM channels in that they allow programmers to isolate data exchanges and prevent cross-communication. They differ in that a communicator provides for communication within a group of processes, whereas a channel provided explicit communication between two specific processes.

The Clock is Ticking on Free Evaluation Period for HPF_CRAFT

All T3E sites: Doug Miles of PGI asked us to post this reminder. PGHPF 2.3 (HPF_CRAFT) may be installed for free evaluation through September 30, 1997. For details, contact Doug at miles@pgroup.com or visit the URL: http://www.pgroup.com/T3E/HPF_eval.htm

PE 3.0 Installed on Yukon

The following software was installed on yukon this week and is available for user testing:


  CF90                   3.0.0.1
  CC (C++ and SCC)       3.0.0.0        (No update releases for C)
  CrayLibs               3.0.0.1
  CrayTools              3.0.0.1

If you use PE3.0, be sure to recompile all of your code. Also, create version 3.0 app.rif files for use with the 3.0 version of apprentice; if you use the 2.0 version of apprentice, make sure to use version 2.0 app.rif files with it.

Please contact ARSC user services if you have any problems.

The PE3.0 software is available only if explicitly requested, as the older 2.0 versions are still the default. To access the new 3.0 software, use the following command:


  module swap PrgEnv PrgEnv.30

You should see the following message:


  Switching 'PrgEnv' to 'PrgEnv.30'...ok.

These will remain your default compilers/libraries until your session ends or until you explicitly change them. To return to the default PE, use the command:


  module swap PrgEnv.30  PrgEnv

At any time, you may use the "module list" command to determine which versions you are running. After switching to PE3.0, your environment will look something like this:


  yukon$ module list
  Currently Loaded Modulefiles:
            1) modules              6) CC.3.0.0.0          11) nqe
            2) craylibs.3.0.0.1     7) CCmathlib.2.0.1.0   12) mpt
            3) craytools.3.0.0.1    8) CCtoollib.2.0.1.0   13) PrgEnv.30
            4) cf90.3.0.0.1         9) cam.2.3.0.0        
            5) scc.6.0.0.0         10) cvt                

You may also query the compilers. Typing:


  CC -V

should report a version number of 3.0.0.0 , and


  f90 -V

should report a version number of 3.0.0.1 .

CF90 3.0 features

As announced in Newsletter 121 , Jeff Brooks of the Benchmarking Group at CRI sent us a copy of "The Benchmarker's Guide to Single-Processor Optimization for CRAY T3E Systems." With the installation of PE3.0, the optimizations described in the guide are now available to ARSC users. The guide is still available in postscript via anonymous ftp to: ftp.arsc.edu . It is in the directory: pub/mpp/docs , and is named: bmguide.ps.Z .

What follows is Cray's list of CF90 features. (The corresponding document for CC programmers is: yukon: /opt/ctl/CC/3.0.0.0/CC30.news .)


  From:  /opt/ctl/cf90/3.0.0.1/CF9030.news
-----------------------------------------------------------------------
CF90 Programming Environment 3.0 has been installed on this system.
This environment contains the following products:

 o CF90 3.0 compiler

 o CrayLibs 3.0

 o CrayTools 3.0


CF90 features
=================

This section describes new CF90 features for CRAY T3E and Cray PVP
systems. The features that are supported on CRAY T3E systems only or on
Cray PVP systems only are identified in the section titles.

New f90 command options
+++++++++++++++++++++++++++++

New f90 command options are as follows:

-s cf77types

   Maps data types to only the standard intrinsic FORTRAN 77 types.
   This option replaces the -si option. The -si option will not be
   supported after the CF90 4.0 release.

-Wa"assembler_opt"

   Passes assembler options directly to the assembler.

-e 0

   Enables initialization of all undefined local variables to zero.

-d 0

   Disables initialization of all undefined local variables to zero.

-O pipeline0

   Specifies no pipelining (default) (CRAY T3E systems only).

-O pipeline1

   Specifies safe pipelining (CRAY T3E systems only).

-O pipeline2 (deferred)
-O pipeline3 (deferred)

-a pad

   Offers both fully automatic and semiautomatic methods to add padding
   following user-declared arrays in Fortran 90 static and common block
   data space.


Duplicate declarations permitted
++++++++++++++++++++++++++++++++++++++

Duplicate declarations are now permitted for certain attributes. This
is a nonstandard feature. You can assign a type or attribute to the
same object twice, but the declarations must be the same for both.
Duplicate declarations are permitted for the following attributes:

 ALLOCATABLE
 AUTOMATIC
 DIMENSION
 EXTERNAL
 INTENT
 INTRINSIC
 OPTIONAL
 POINTER
 PRIVATE
 PUBLIC
 SAVE
 TARGET
 TYPE
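
For instance, a minimal sketch (ours) of the sort of code this extension now accepts:

      real x(10)
! A second, identical declaration of an attribute is now
! accepted rather than flagged as an error.
      dimension x(10)
      save x
      save x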

Support for ELEMENTAL and PURE procedures
+++++++++++++++++++++++++++++++++++++++++++++++

ELEMENTAL and PURE are new prefix specifications on FUNCTION and
SUBROUTINE definitions. For more information, see the Fortran Language
Reference Manual, Volume 1, publication SR-3902, Fortran Language
Reference Manual, Volume 2, publication SR-3903, and Fortran Language
Reference Manual, Volume 3, publication SR-3905. These specifications
are part of the Fortran 95 standard.
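
For example, a minimal ELEMENTAL function (our sketch); it is defined in terms of scalars but may be applied elementwise to an array of any shape, e.g. c = celsius(temps):

      elemental real function celsius(f)
      real, intent(in) :: f
! An elemental procedure is implicitly pure; the dummy
! arguments of an elemental function must be INTENT(IN).
      celsius = (f - 32.0) * 5.0 / 9.0
      end function celsius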

New default type INTEGER(KIND=8)
++++++++++++++++++++++++++++++++++++++

INTEGER(KIND=8) is now the default integer kind type, replacing
INTEGER(KIND=6). INTEGER(KIND=6) is no longer supported. A new
command-line option, -O (no)fastint, has been added to disable or
enable fast integer comparison and fast multiply/divide sequences. This
option affects only the default variables and constants. If you
explicitly declare a variable or constant of a specific integer type,
it will remain that type.

COPY_ASSUMED_SHAPE compiler directive
+++++++++++++++++++++++++++++++++++++++++++

The COPY_ASSUMED_SHAPE directive generates a copy of assumed shape
dummy arguments so that optimizations that depend on a contiguous array
can be performed. On entry to the routine, the compiler generates a
copy to a temporary location and on exit, copies the values back to the
assumed shape array. All references to the assumed shape array within
the body of the routine are replaced with references to the temporary
location. This directive can be specified to apply to either all
assumed shape dummy arguments in the current scope or to specific dummy
arguments. For more information, see the CF90 Commands and Directives
Reference Manual, publication SR-3901.
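
Usage might look like the following sketch; the subroutine is our own illustration, and SR-3901 should be consulted for the exact directive forms and placement:

      subroutine scale(a, s)
      real, dimension(:) :: a
      real s
!DIR$ COPY_ASSUMED_SHAPE
! Within this body, references to the assumed-shape dummy "a"
! use a contiguous temporary; values are copied back on exit.
      a = s * a
      end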

AUTOMATIC attribute
+++++++++++++++++++++++++

The AUTOMATIC attribute, a common FORTRAN 77 extension, is now
supported. The presence of an AUTOMATIC attribute specifies that a
variable's storage is stack based (that is, the variable is not defined
after the procedure ends). The AUTOMATIC attribute can be applied to a
variable name or an array declaration with an explicit-shape
specification list or a deferred-shape specification list. It can be
used on a type or declaration statement as one of the attribute
specifications. This feature is an extension to the Fortran 90
standard.
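
A sketch (ours) of the attribute form:

      subroutine work()
      implicit none
! buf is stack based: it occupies no static storage, and its
! contents are undefined once work returns.
      real, automatic :: buf(1000)
      buf = 0.0
      end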

!DIR$ NOPATTERN and !DIR$ PATTERN directives
++++++++++++++++++++++++++++++++++++++++++++++++++

The CF90 compiler supports new compiler directives, !DIR$ NOPATTERN and
!DIR$ PATTERN. !DIR$ NOPATTERN tells the compiler to discontinue
pattern matching until either the end of the current program unit is
reached or until the occurrence of a !DIR$ PATTERN directive. For more
information, see the CF90 Commands and Directives Reference Manual,
publication SR-3901.
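
For example (our sketch), to keep a hand-coded loop from being replaced by a pattern-matched library call:

!DIR$ NOPATTERN
      do i = 1, n
! This loop is compiled as written rather than being replaced
! by a matched library routine.
        c(i) = c(i) + a(i)*b(i)
      enddo
!DIR$ PATTERN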

Improved vector scheduling of loops that contain adds and multiplies
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The vector performance of many loops is enhanced by an improved
chaining method between vector addition and multiplication operations.
This enhancement is also available on Cray Standard C and Cray C++
compilers.

Support for 32-bit HCOSS
++++++++++++++++++++++++++++++

Calls to SIN and COS with identical 32-bit floating point operands are
now merged into a single call to HCOSS. This is already being done for
64-bit types. This optimization is performed by default along with all
other scalar optimizations. This optimization is also available on Cray
Standard C and Cray C++ compilers.

IVDEP directive supported on CRAY T3E systems
++++++++++++++++++++++++++++++++++++++++++++++++++++

The IVDEP directive is now supported on CRAY T3E systems, allowing
loops to be considered for loop splitting, intrinsic vectorization,
CACHE_BYPASS, and other optimizations. This directive is also available
on Cray Standard C and Cray C++ compilers.
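
A typical use (our sketch) is to assert that an indirectly addressed loop carries no dependences:

! The programmer asserts that ix contains no repeated values,
! so the iterations are independent and the loop is safe to
! optimize.
!DIR$ IVDEP
      do i = 1, n
        y(ix(i)) = y(ix(i)) + x(i)
      enddo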

CACHE_BYPASS memory references on CRAY T3E systems
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

For CACHE_BYPASS memory references on CRAY T3E systems, the following
directive provides a semiautomatic method for programmers to run local
memory references for arrays through E-registers:

!DIR$ CACHE_BYPASS array-name[,array-name]*

Loops preceded with this directive can obtain run-time reductions of up
to 50 percent.

This directive is also available on Cray Standard C and Cray C++
compilers.
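
For instance (our sketch), a simple copy between large arrays that would otherwise displace useful data from the cache:

!DIR$ CACHE_BYPASS a, b
      do i = 1, n
! a and b are referenced through E-registers, bypassing the
! data cache entirely.
        a(i) = b(i)
      enddo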

New command-line option for vectorization on CRAY T3E systems
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

The f90 command-line option, -O vectorn, is now supported on CRAY T3E
systems. This option directs the compiler to use vector versions of
some intrinsic functions and operations when they are found in
vectorizable loops. Currently, the vector1 and vector2 options are
deferred on CRAY T3E systems. For more information on this feature, see
Section 3.4.8. This option is also available on Cray Standard C and
Cray C++ compilers.

Update on Cray Anti-dumping Suit

[ This is taken from a Cray press release. For the entire text, see: http://www.cray.com/news/9708/nec.html ]


 >
 >  Cray Research Welcomes Commerce Department Findings In 
 >  Antidumping Suit
 >
 >
 >  Court of International Trade Rules Commerce Department
 >  Investigation Fair
 >
 >
 >  EAGAN, Minn., August 21, 1997 - Cray Research, the supercomputing
 >  subsidiary of Silicon Graphics, Inc. (NYSE:  SGI) welcomed
 >  decisions handed down yesterday in the Court of International Trade
 >  and today at the U.S.  Department of Commerce that affirm Cray's
 >  claims in an antidumping complaint against the Japanese
 >  manufacturers of vector supercomputers.
 >
 >  The Court of International Trade ruled yesterday that the
 >  Department of Commerce investigation of Cray's antidumping
 >  complaint was fair and conducted in compliance with the rules and
 >  procedures of U.S.  antidumping law. In a lawsuit filed in October
 >  1996, NEC Corporation of Japan sought to remove the Department of
 >  Commerce from investigating Cray's antidumping complaint on the
 >  grounds that the Commerce Department investigation was tainted by
 >  prejudgment.
 >
 >

Next Newsletter, September 12

Both editors will be out of town for a bit. Tom is getting married and Guy is taking a trip to England to visit friends and family and to work on his farm-house (which, according to the existing documents, was built in either 1740 or 1770). Wish us luck!

Quick-Tip Q & A


A: {{ How can you merge two versions of a C or Fortran source into a
      single file so that you may compile either version by simply
      defining (or undefining) a preprocessor variable?  }}

      # diff -D 
      # In "C", for instance:

      diff -D DEBUG version1.c version2.c > merged.c

      # To compile version1 given "merged.c", you would issue 
      # this command:

      cc merged.c

      # To compile version2, you would either #define DEBUG in 
      # merged.c or simply issue this command:

      cc -D DEBUG merged.c


      # To invoke the pre-processor from CF90, the source file 
      # must end in ".F".  For instance:

      diff -D DEBUG version1.f version2.f > merged.F

      f90 merged.F                    # For version1 
      f90 -D DEBUG merged.F           # For version2


Q: In an SPMD program running on the T3D/E, what is a good way to
   exit the processes on all PEs when one of them encounters a fatal
   error condition?


[ Answers, questions, and tips graciously accepted. ]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions and Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.