ARSC T3E Users' Newsletter 125, September 12, 1997

Hardware Upgrade Scheduled for ARSC T3E

Yukon will receive the following hardware upgrade later this month:


  Current Configuration            Upgrade Plan
  ----------------------           -------------------------
  96 PEs (88 User PEs)             104 PEs (Total User PEs--TBA)
  128 MB/PE                        256 MB/PE
  300 MHz clock                    450 MHz clock

An important effect of this upgrade will be to make yukon streams safe.

Yukon users: MOTD and news will keep you up-to-date on the downtime plan. The upgrade and testing process is scheduled to begin on Sunday 9/21 and will take approximately 2 days.

File Assignment: Dangerous but Useful

The assign command lets you modify many of the characteristics with which the files attached to Fortran unit numbers are opened. For instance:


  # Name of file associated with unit 41 will be "junk.text".
  assign -a junk.text u:41               

  # Numeric data in unit 41 will be in IEEE format.
  assign -N IEEE u:41               

ASNUNIT and ASSIGN are equivalent Fortran-callable subroutines. For instance:


  CALL ASNUNIT (41, "-a junk.text", IERR)

  CALL ASNUNIT (41, "-N IEEE", IERR)

This is very useful. With one little command (or a simple code change), your old executable can write in IEEE format, read from files with new names, and handle a variety of other situations (see man assign).

This is also very dangerous, as one of ARSC's biggest users discovered recently. He learned that assignments (made either way) are not cleared after first use or at program termination, as he had assumed. They persist and affect later runs of any Fortran program that uses the same unit number. Suddenly, one of his programs (which had always worked) started crashing because of an earlier, forgotten assignment.

Here's a simple demonstration of the problem. tstgood.f opens a file on unit 41 and writes formatted reals to it. tstset.f is an unrelated program which, for reasons of its own, assigns IEEE formatting to unit 41. Once tstset has been run, tstgood crashes:


    yukon$ cat tstgood.f
           program tst_good
           implicit none
    
           real    x
           integer j, IER
    
           open (unit=41, file="test.in", form="FORMATTED", 
         $  status="UNKNOWN")
    
           do j=1,10
             x = real(j)/100.
             write (41, '(f8.4)') x
           enddo
           
           print*, "Unit 41 written successfully."
    
           close (unit=41)
           
           end

    yukon$ ./tstgood
     Unit 41 written successfully.

    yukon$ cat tstset.f
           program tst_bad
           implicit none
           integer IERR
    
           CALL ASNUNIT(41,'-N IEEE',IERR)
           print*,"Assignment on unit 41 made."
           end

    yukon$ ./tstset
     Assignment on unit 41 made.

    yukon$ ./tstgood
    
    lib-1069 ./tstgood: UNRECOVERABLE library error 
      The file cannot be opened for FORMATTED I/O.
    
    Encountered during an OPEN of unit 41
    Fortran unit 41 is not connected
    Error initiated at line 666 in routine '_f_open'.
    SIGNAL: Abort ( from process 47828 )
    
     Beginning of Traceback (PE 0):
      Interrupt at address 0x8000323f0 in routine '_lwp_kill'.
      Called from line 30 (address 0x800031bf0) in routine 'raise'.
      Called from line 125 (address 0x800009784) in routine 'abort'.
      Called from line 113 (address 0x8000a2efc) in routine '_ferr'.
      Called from line 666 (address 0x8000b61f4) in routine '_f_open'.
      Called from line 344 (address 0x8000b7bc4) in routine '__OPN'.
      Called from line 381 (address 0x8000b88f4) in routine '_OPEN'.
      Called from line 7 (address 0x80000135c) in routine 'TST_GOOD'.
      Called from line 449 (address 0x800000b58) in routine '$START$'.
     End of Traceback.
    Abort(coredump)
    yukon$ 

Here's a demonstration of the naming danger. tstgood.f has this open statement:


         open (unit=41, file="test.in", form="FORMATTED", 
       $  status="UNKNOWN")

But assign can override the specified file name:



    yukon$ ls -l test.*
    cmd-1025 ls: test.*: No such file or directory

    yukon$ ./tstgood   
     Unit 41 written successfully.

    yukon$ ls -l test.*
    -rw-------   1 baring   staff      90 Aug 15 11:48 test.in

    yukon$ rm test.*

    yukon$ assign -a test.other-name u:41

    yukon$ ./tstgood
     Unit 41 written successfully.

    yukon$ ls -l test.*                              
    -rw-------   1 baring   staff      90 Aug 15 11:49 test.other-name

To identify and clear undesired, residual assignments, here are some useful calls:


  # List all current assignments.
  assign -V 

  # Clear all current assignments.
  assign -R 

  # Clear all current assignments on unit ###.
  assign -R u:###

  # Fortran: clear all assignments
  CALL ASNRM (IERR)

  # Fortran: clear assignments on unit 41
  CALL ASNUNIT (41, "-R", IERR)


For those who like to see the inner workings, your assignments are stored in the file:


  $TMPDIR/.assign

A call to assign -V names this file for you, as well as listing the current assignments. For instance:


  yukon$ assign -V
  # ASSIGN ENVIRONMENT FILE=/tmp/baring/.assign
  assign -t u:31
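
If you want a wrapper or pre-run check to notice leftover assignments, one option is to scan the assign -V listing for unit numbers. The C sketch below does that for text shaped like the listing above. The function name assigned_units is hypothetical, and the only format it assumes is what the example shows: comment lines beginning with "#" and entries of the form "assign <options> u:<n>". Assignments made by file name rather than unit are ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: collect the unit numbers named in an
 * "assign -V"-style listing.  Stores at most max entries in units[]
 * and returns how many were stored. */
int assigned_units(const char *listing, int *units, int max)
{
    const char *line = listing;
    int count = 0;

    assert(listing != NULL && units != NULL);

    while (*line != '\0' && count < max) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        const char *stop = line + len;

        /* Only lines that start with "assign " are entries; lines
         * such as "# ASSIGN ENVIRONMENT FILE=..." are skipped. */
        if (len > 7 && strncmp(line, "assign ", 7) == 0) {
            const char *p;
            for (p = line + 7; p + 2 < stop; p++) {
                /* Look for a word of the form u:<digits>. */
                if (p[-1] == ' ' && p[0] == 'u' && p[1] == ':' &&
                    isdigit((unsigned char)p[2])) {
                    units[count++] = (int)strtol(p + 2, NULL, 10);
                    break;
                }
            }
        }
        line = eol ? eol + 1 : stop;
    }
    return count;
}
```

A caller would read the listing into a string, pass it to assigned_units, and warn if any returned unit is one the program is about to open.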

MPI: Fortran and C Differences

There are differences between the C and Fortran interfaces to MPI which are not always documented. Taking MPI_BARRIER as an example, in C the code contains:


  ierr = MPI_Barrier(MPI_COMM_WORLD);

Whereas in Fortran it contains:


  call MPI_BARRIER(MPI_COMM_WORLD, ierr)

Note the difference: in C, the error code is returned as the result of the function, while in Fortran, it is an additional argument. Many texts, examples, and manual pages, including those from CRI, quote the C arguments only. It becomes easy to forget the extra argument, IERR, when developing Fortran code.

The potential for error is compounded since Fortran code which omits the IERR argument will work on some systems and fail on others. The CRAY T3D/T3E MPI library requires the argument: if IERR is omitted, the following error message is produced:


  SIGNAL: Operand range error ( [21] memory management fault)

   Beginning of Traceback (PE 1):
    Interrupt at address 0x800137218 in routine 'MPI_BARRIER'.
    Called from line 63 (address 0x800001500) in routine 'PROG1'.
    Called from line 449 (address 0x800000b58) in routine '$START$'.
   End of Traceback.
  Operand range error (core dumped)
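
Why would leaving out IERR crash the program rather than just lose the error code? A plausible explanation, sketched below in plain C with no MPI library involved, is that a Fortran-style binding receives every argument by address and stores its status through the last one. If the caller never supplied that argument, the store goes to whatever address happens to be in that slot. All names here (demo_barrier_c, demo_barrier_f, DEMO_SUCCESS, DEMO_ERR_COMM) are hypothetical stand-ins, not part of any MPI implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch only. */
enum { DEMO_SUCCESS = 0, DEMO_ERR_COMM = 5 };

/* C-style binding: the status is the function's return value. */
int demo_barrier_c(int comm)
{
    return (comm >= 0) ? DEMO_SUCCESS : DEMO_ERR_COMM;
}

/* Fortran-style binding: arguments arrive by address and the status
 * is written through the trailing pointer.  A Fortran caller that
 * omits the final argument leaves ierr undefined, so the store below
 * would touch an arbitrary address. */
void demo_barrier_f(const int *comm, int *ierr)
{
    assert(comm != NULL && ierr != NULL);
    *ierr = demo_barrier_c(*comm);
}
```

Whether the missing argument is harmless or fatal then depends on what the stray address happens to be, which would be consistent with the same source working on some systems and failing on others.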

MPI: Mixed Language Programs

Many programmers mix C and Fortran to get the best of both languages. Typically C is used to create memory storage structures and to interface with graphical/windowing systems while Fortran is used for the numerically intensive computations and for access to Fortran libraries.

When using MPI in a mixed-language program, the call to MPI_INIT must be in the same language as the main program. The MPI standard does not say what a program can do before an MPI_INIT or after an MPI_FINALIZE. Our advice is to make MPI_INIT the first call and MPI_FINALIZE the last call in the main program.

Here are two examples: first, a Fortran main calling a C function, and then a C main calling a Fortran function:


EXAMPLE 1:
----------------------------- Fortran main --------------------
      program f_main
      implicit none
      include 'mpif.h'
      integer mype, totpes, ierr, i, pe

      integer GETPEINFO

      call MPI_INIT(ierr)
      ierr = GETPEINFO (mype, totpes)         ! Call C function

      do i=0,totpes-1                         ! Respond in order
        pe=i
        call MPI_Bcast(pe, 1, MPI_INTEGER, 0, MPI_COMM_WORLD,ierr)
        if (pe.EQ.mype) then 
          write(6,*) 'From Fort: mype is ',mype,' of total ',totpes
        endif
      enddo
  
      call MPI_FINALIZE(ierr)
      end

----------------------------- C function ----------------------
#include <stdio.h>
#include <mpi.h>

int GETPEINFO (int *mype, int *totpes) {
  int err;

  err=MPI_Comm_rank(MPI_COMM_WORLD, mype);
  err=MPI_Comm_size(MPI_COMM_WORLD, totpes);

  printf ("from C: mype: %d totpes: %d\n", *mype, *totpes);

  return err;
}

----------------------------- Compile, link, and run: ---------
yukon$ cc -c c_func.c
yukon$ f90 c_func.o f_main.f
yukon$ mpprun -n4 ./a.out
from C: mype: 2 totpes: 4
from C: mype: 1 totpes: 4
from C: mype: 0 totpes: 4
from C: mype: 3 totpes: 4
 From Fort: mype is  0  of total  4
 From Fort: mype is  1  of total  4
 From Fort: mype is  2  of total  4
 From Fort: mype is  3  of total  4



EXAMPLE 2:
----------------------------- C main --------------------------
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
  int err, mype, totpes, pe; 
  fortran int GETPEINFO();

  err=MPI_Init(&argc, &argv);

  err=GETPEINFO (&mype, &totpes);      /* Call Fortran func */

  for (pe=0; pe<totpes; pe++) {        /* PEs respond in order */
    err=MPI_Barrier (MPI_COMM_WORLD);
    if (pe==mype) 
       printf ("from C: mype: %d totpes: %d\n", mype, totpes);
  }

  err=MPI_Finalize();
}

----------------------------- Fortran function ----------------
      integer function GETPEINFO (mype, totpes)
      implicit none
      include 'mpif.h'
      integer mype, totpes, ierr, pe

      call MPI_COMM_RANK(MPI_COMM_WORLD, mype, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, totpes, ierr)

      write(6,*) 'From Fort: mype is ',mype,' of total ',totpes

      GETPEINFO = ierr

      end

----------------------------- Compile, link, and run: ---------
yukon$ f90 -c f_func.f
yukon$ cc f_func.o c_main.c
yukon$ mpprun -n4 ./a.out
 From Fort: mype is  0  of total  4
 From Fort: mype is  2  of total  4
 From Fort: mype is  3  of total  4
 From Fort: mype is  1  of total  4
from C: mype: 0 totpes: 4
from C: mype: 1 totpes: 4
from C: mype: 2 totpes: 4
from C: mype: 3 totpes: 4

A common problem when mixing C and Fortran is giving the names of functions in the wrong case:

When mixing C and Fortran on CRAY machines, give the shared functions UPPERCASE names.

Fortran is a case-insensitive language, and to effect this, CRI compilers shift Fortran code to all uppercase. The C language and the linker, however, are case-sensitive. Thus, if the C code provides or expects a function named with lowercase letters, the linker will be unable to find the corresponding Fortran name (because it will have been shifted to all uppercase).

Some Fortran compilers (including SGI's) effect the case-insensitivity by shifting everything to lowercase rather than uppercase. For this (and other) reasons, mixed-language programs can take some effort to port between platforms.
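
One way to keep the case difference out of the body of the C code is to route every shared name through a single preprocessor macro. The sketch below is a minimal illustration, not a complete portability layer. FORTRAN_NAME and query_pe are hypothetical names, and the test on _CRAY assumes the Cray C compiler predefines that macro. The #else branch covers only the lowercase convention described above; any further decoration a particular compiler applies would also belong in the macro.

```c
#include <assert.h>

/* Pick the spelling the Fortran compiler is expected to emit. */
#if defined(_CRAY)
#  define FORTRAN_NAME(lower, UPPER) UPPER
#else
#  define FORTRAN_NAME(lower, UPPER) lower
#endif

/* C stand-in so the sketch links without a Fortran compiler.  In a
 * real mixed-language build this would be the Fortran function
 * GETPEINFO, and only a declaration would appear on the C side. */
int FORTRAN_NAME(getpeinfo, GETPEINFO)(int *mype, int *totpes)
{
    assert(mype != 0 && totpes != 0);
    *mype = 0;      /* pretend to be PE 0 ... */
    *totpes = 1;    /* ... of a one-PE job    */
    return 0;
}

/* The rest of the C code calls through the macro and never spells
 * the platform-specific name directly. */
int query_pe(int *mype, int *totpes)
{
    return FORTRAN_NAME(getpeinfo, GETPEINFO)(mype, totpes);
}
```

With this arrangement, porting to a compiler with a different naming rule means changing the macro rather than every call site.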

Here are CRI's guidelines for mixing Fortran and C/C++ (from SR-2074 3.0):


  8.1.2 Calling Fortran functions and subroutines from a C or C++
  function

  This subsection describes the following aspects of calling Fortran
  from C or C++: requirements and guidelines, MPP considerations,
  argument passing, array storage, logical and character data, and
  accessing blank common from C and C++ programs.

  8.1.2.1 Requirements

  Keep the following points in mind when calling Fortran functions from
  C/C++:

     * Fortran uses the call-by-address convention, and C/C++ uses the
       call-by-value convention, which means that only pointers should
       be passed to Fortran subprograms. See Section 8.1.2.2.

     * Fortran arrays are in column-major order, and C/C++ arrays are
       in row-major order. This indicates which dimension is indicated
       by the first value in an array element subscript. See Section
       8.1.2.3.

     * Single-dimension arrays of integers and single-precision
       floating-point numbers are the only aggregates that can be
       passed as parameters without changing the arrays.

     * Fortran character pointers and C/C++ character pointers are
       incompatible. See Section 8.1.2.4.

     * Fortran logical values and C/C++ Boolean values are not fully
       compatible. See Section 8.1.2.4.

     * External C/C++ variables are stored in common blocks of the same
       name, making them readily accessible from Fortran programs if
       the C/C++ variable is uppercase.

     * When declaring Fortran functions or objects in C/C++, the name
       must be specified in all uppercase letters, digits, or
       underscore characters and consist of 31 or fewer characters.

     * In C, Fortran functions can be declared using the fortran
       keyword (see Section 3.4). The fortran keyword is not available
       in C++.  Instead, fortran functions must be declared by
       specifying extern "C".

     * In C++, the main function must be written in C++.

     * On Cray MPP systems, the C/C++ language float type does not
       match the Fortran REAL type. The float type is 32 bits for C/C++
       on Cray MPP systems; the Fortran REAL type is 64 bits. However,
       Fortran includes the REAL*4 type that matches the C language
       float type.


  8.1.3 Calling a C/C++ function from an assembly language or Fortran
    program

    A C/C++ function can be called from Fortran or assembly language.
    When calling from Fortran, keep in mind the information in Section
    8.1.2.

    When calling a C++ function from Fortran or assembly language, the
    C++ function must be declared with extern "C" storage class, the
    main function must be written in C++, and the program must be
    linked with the CC command.  C++ main is responsible for
    initializing the static constructors for C++ functions.
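
Two of the points above, call-by-address and column-major storage, are easy to get wrong from the C side, so here is a small sketch in plain C. The names fortran_offset and column_sum are hypothetical. column_sum is written the way a Fortran routine would look to C (every argument is a pointer), and the array uses double on the assumption, following the last bullet of 8.1.2.1, that this is the C type matching a 64-bit Fortran REAL.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of Fortran element A(i,j), with 1-based subscripts, in an
 * array declared A(nrows, ncols).  In column-major order the first
 * subscript varies fastest. */
size_t fortran_offset(size_t i, size_t j, size_t nrows)
{
    assert(i >= 1 && i <= nrows && j >= 1);
    return (j - 1) * nrows + (i - 1);
}

/* Stand-in for a routine with a Fortran-style interface: every
 * argument is passed by address.  Returns the sum of column *j. */
double column_sum(const double *a, const int *nrows, const int *j)
{
    double sum = 0.0;
    int i;

    for (i = 1; i <= *nrows; i++)
        sum += a[fortran_offset((size_t)i, (size_t)*j, (size_t)*nrows)];
    return sum;
}
```

For a Fortran array A(2,3) holding 1 through 6 in storage order, column 2 is the pair 3, 4. A C array declared double c[2][3] with the same six values in memory would instead treat 1, 2, 3 as its first row, which is why a two-dimensional array cannot be handed across unchanged.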

Quick-Tip Q & A


A: {{ In an SPMD program running on the T3D/E, what is a good way 
      to exit the processes on all PEs when one of them encounters a
      fatal error condition? }}

  # MPI_Abort() works but causes a core dump. A more elegant solution
  # follows:
  #
  #
  # Contributed by Ken Steube of SDSC:
  # --------------------------------------
  # 
  # You have to kill the PID of the process running on PE 0.  The C
  # program included below does the job. A better solution would be to
  # install a signal handler for SIGUSR1, for example, and from it call
  # mpi_finalize and anything else you want to do before exiting.
  # 
  # For Fortran, the PID is available through pxfgetpid, and you access
  # the kill system call through either the KILL subroutine or function.  
  # 
  # 
  # /*-------------------------------------------------------------
  #    Shows how to have one processor kill the entire parallel 
  #         job on the T3E
  # -------------------------------------------------------------*/
  # 
  # #include <stdio.h>
  # #include <sys/types.h>
  # #include <signal.h>
  # #include <unistd.h>    /* getpid(), sleep() */
  # 
  # #include <mpi.h>
  # 
  # int
  # main(argc, argv)
  # int argc;
  # char **argv;
  # {
  #    int i, iam, nPEs, pid0;
  # 
  #    MPI_Init(&argc, &argv);
  #    MPI_Comm_rank(MPI_COMM_WORLD, &iam);
  #    MPI_Comm_size(MPI_COMM_WORLD, &nPEs);
  # 
  #    /* Replicate PE zero's PID on all PEs */
  #    pid0 = getpid();
  #    MPI_Bcast(&pid0, 1, MPI_INT, 0, MPI_COMM_WORLD);
  # 
  #    for(i=0;i<10;i++) {
  #            printf("Hello from PE %i at step %i\n", iam, i);
  #            sleep(1);
  #            if (iam == 1 && i == 2) kill(pid0, SIGINT);
  #    }
  # 
  #    /* MPI_Finalize(); */
  # }
  # 
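
Ken's suggestion of a SIGUSR1 handler can be sketched as follows. This is an illustration under stated assumptions rather than a tested T3E recipe. Instead of calling mpi_finalize from inside the handler, the handler here only sets a flag, and the main loop is expected to poll the flag and do the cleanup itself, since very few library calls are safe inside a signal handler. The names install_shutdown_handler and shutdown_was_requested are hypothetical, and no MPI calls appear so that the sketch stands alone.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the main loop. */
static volatile sig_atomic_t shutdown_requested = 0;

static void on_sigusr1(int signo)
{
    (void)signo;
    shutdown_requested = 1;   /* defer all real work to the main loop */
}

/* Install the SIGUSR1 handler; returns 0 on success, -1 on failure. */
int install_shutdown_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Nonzero once SIGUSR1 has been delivered to this process. */
int shutdown_was_requested(void)
{
    return shutdown_requested != 0;
}
```

In the program above, each PE would call install_shutdown_handler() after MPI_Init, the failing PE would send SIGUSR1 instead of SIGINT, and the loop body would check shutdown_was_requested() and, if set, finalize and exit. Whether the signal reaches every PE depends on the system's job control, which this sketch does not address.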
    
  
Q: Assume you have dozens of source files, many of which reference
   the same global variable.  You decide to change the name of that
   variable.  Is there an easy way to do this?


[ Answers, questions, and tips graciously accepted. ] 

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.