ARSC T3D Users' Newsletter 98, August 2, 1996
Porting T3D PVM Codes to Network PVM
[ Don Morton, visiting ARSC from Cameron University, contributes this article. See his contact information, provided below. ]
Last week, Richard Barrett of LANL discussed numerous issues in porting Network PVM codes to the T3D. I have been interested in this area, and in the reverse scenario - writing T3D codes so that they'll easily port to a network PVM environment. Clearly, the issues mentioned by Richard, such as sending messages to TID's rather than PE's, are important when writing portable code. I have found that the most difficult task in writing portable PVM code is that of dealing with "spawning" vs. "non-spawning" environments. Although Richard touched upon this, I'd like to go a little deeper.
If we are writing code for the T3D, we are restricted to a Single-Program Multiple-Data (SPMD) paradigm - that is, we have the same executable running on each processor. We tend to address other processes by a PE number, or at the very least, the PVM TID of a PE. We don't have an inherent master-slave relationship between processes, although traditionally we view PE0 as a "master" process. If we want to run our T3D SPMD code in a network PVM environment, where spawning is necessary, then it's necessary to simulate the T3D behavior of a non-spawning environment. The following function is designed to do just that. It has been utilized in a T3D environment AND in a cluster of PC's running Linux.
First, the function prototype (in Fortran) looks like
      subroutine startup(numprocs, mype, tidlist, icode)
      integer numprocs
      integer mype
      integer tidlist(0:numprocs-1)
      integer icode
The behavior of the function is as follows:
- We assume that the "master" process (the initial process in a network environment, PE0 on the T3D) somehow knows how many processes are to run. This might be read in from a file, come from a command-line argument, etc. The "master" process regards "numprocs" as an (in) parameter. All other processes regard this as an (out) parameter. Additionally, "mype", "tidlist", and "icode" are all (out) parameters.
- From each process' point of view, startup() is called with uninitialized parameters (with the exception of "numprocs" on the master process), and upon return from the function, every process has a value for each of the parameters. Thus, each process knows how many total processes are in the virtual machine, they each know their logical position in the system, and they each have an identical list of TID's, so they can easily communicate with each other by referencing the appropriate logical PE in "tidlist" - if Process 3 wants to communicate with Process 7, it simply references tidlist(7).
- "icode" is an error flag. If the call to startup() was successful, icode stores the number of total processes, otherwise it stores some negative value (note - this aspect hasn't been fully implemented in the following code).
The code (see below), "startup.F", is preprocessed with gpp on the T3D and cpp in other environments, and is conditionally compiled depending on whether the preprocessor macro _CRAYMPP is defined. _CRAYMPP is defined automatically if you're using the Cray MPP compiler.
The entire function is based on processes joining a global group, then obtaining information about themselves and other processes within the global group. Conditional compilation enables/disables code for spawning and T3D environments. The following differences are addressed through conditional compilation:
- The global group name, ALLGROUPNAME (PVMALL on the T3D, "alltasks" elsewhere).
- Variables declared for use in spawning environments are not used in the T3D environment.
- In T3D environments, we're not able to join the global group (via pvmfjoingroup()), or obtain the instance within the global group (via pvmfgetinst()). Thus, the T3D-specific function, pvmfgetpe(), is used. In spawning environments, we join the global group and get our instance in the group with pvmfjoingroup().
- A good portion of the code in the function is only used in a spawning environment. After the spawn, we force a synchronization by having child processes send a message to the parent. Without this forced synchronization, future group operations (e.g. pvmfbcast(), pvmfbarrier(), pvmfgettid()) may fail, as the parent process may attempt to call these functions before others have even joined the group. Trust me, I've encountered this way too many times!
After the spawning code, there is no difference between the T3D and spawning implementations. The rest of the code is centered on ensuring that every process knows how many total processes there are, then each process obtains the TID's of every other process. One might be tempted to use pvmfgsize() to find the number of total processes, but, as above, this "forced" synchronization is necessary. Without it, it is possible for one process to reach this point before others have joined the global group, producing different values of "numprocs" for different processes! Again, trust me! Been there, done that!
The source code for the function is below, followed by the Makefile, which in turn is followed by "test.f," a simple program which calls the startup() subroutine. Note that in the main program there is no dependence on the architecture. This is all handled in startup(), and as long as basic PVM operations are utilized, it will be portable.
ccccccccccccccccccccccccc startup.F ccccccccccccccccccccccccccccccccc
#ifdef _CRAYMPP
#define ALLGROUPNAME PVMALL
#else
#define ALLGROUPNAME "alltasks"
#endif
      subroutine startup(numprocs,
     &                   mype,
     &                   tidlist,
     &                   icode)
      implicit none
ccc Header files
      include 'fpvm3.h'
ccc-----------------------------------------------------------
ccc Arguments
      integer numprocs              ! number of processes to start up
                                    ! (if spawning).
      integer mype                  ! My process number
                                    ! (0 <= mype <= numprocs-1)
      integer tidlist(0:numprocs-1) ! List of PVM task id's
      integer icode                 ! Return code
                                    ! If successful - number of
                                    ! processes running.
ccc-----------------------------------------------------------
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
c c
c In PVM environments, we assume that the master process (rank 0) c
c passes 'numprocs' in as an argument. On completion, all processes c
c will have 'numprocs' as an (out) argument, storing the number of c
c processes running. c
c c
ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
ccc-----------------------------------------------------------
ccc Local variables
      integer info                  ! Return code for PVM calls
      integer i
      integer mytid                 ! My PVM TID
#ifndef _CRAYMPP
ccc declare variables for spawning environment
      character*255 binaryname      ! stores name of executable,
                                    ! retrieved from arg 0 of
                                    ! command line
      integer numspawned            ! number of tasks spawned
      integer myparent              ! PVM TID of parent
#endif
      integer MSG_NUMPROCS_BCAST
      integer MSG_ACK
      parameter(MSG_NUMPROCS_BCAST=888, MSG_ACK=333)
ccc-----------------------------------------------------------
ccc Enroll in PVM
      call pvmfmytid(mytid)
ccc CRAY MPP does not allow us to join the global group (or use it in
ccc pvmfgetinst()), so just get the PE number
#ifdef _CRAYMPP
      call pvmfgetpe(mytid, mype)
#else
      call pvmfjoingroup(ALLGROUPNAME, mype)
#endif
#ifndef _CRAYMPP
ccc This code executes in a "spawning" environment
      if(numprocs .gt. 1) then
         if(mype .eq. 0) then ! I'm the parent task
ccc get name of executable (myself)
            call getarg(0, binaryname)
ccc spawn children
            call pvmfspawn(binaryname, PVMDEFAULT, '*',
     &                     numprocs-1, tidlist(1), numspawned)
            if(numspawned .ne. numprocs-1) then
               write(0,*) 'pvmfspawn() was unsuccessful -- '
               write(0,*) ' ', numspawned, ' tasks were spawned'
               write(0,*) ' Task error codes:'
               do i=1,numprocs-1
                  write(0,*) ' ', tidlist(i)
               enddo
ccc pvmfhalt() takes a return-code argument
               call pvmfhalt(info)
            endif
ccc Now, wait for each child to acknowledge that they've made it
ccc this far in the code - this will allow us to conclude that
ccc everyone has joined group ALLGROUPNAME.
            do i=1,numprocs-1
               call pvmfrecv(-1, MSG_ACK, info)
            enddo
         else ! I am one of the newly spawned tasks - ack parent
ccc First, find out who spawned me
            call pvmfparent(myparent)
            call pvmfinitsend(PVMDATADEFAULT, info)
ccc Just pack any arbitrary integer
            call pvmfpack(INTEGER4, MSG_ACK, 1, 1, info)
            call pvmfsend(myparent, MSG_ACK, info)
         endif
      endif
ccc End of code for spawning environments
#endif
ccc At this point, it is assumed that all processes are running - Process 0
ccc should now let all the other processes know how many of them exist
      if(mype .eq. 0) then
         call pvmfinitsend(PVMDATADEFAULT, info)
         call pvmfpack(INTEGER4, numprocs, 1, 1, info)
         call pvmfbcast(ALLGROUPNAME, MSG_NUMPROCS_BCAST, info)
      else
         call pvmfrecv(-1, MSG_NUMPROCS_BCAST, info)
         call pvmfunpack(INTEGER4, numprocs, 1, 1, info)
      endif
ccc At this point, everyone should have a value for numprocs
ccc Set up global data structures which will allow us to utilize
ccc TID's.
      do i=0,numprocs-1
         call pvmfgettid(ALLGROUPNAME, i, tidlist(i))
      enddo
ccc Everyone sync here
      call pvmfbarrier(ALLGROUPNAME, numprocs, info)
ccc Finally, give icode a value - if all goes well, should be
ccc same value as numprocs
      icode = numprocs
      return
      end
=======================================================================
################### Makefile ############################
######### CRAY T3D at ARSC ###############
#SYS_INC=/usr/include/mpp
#FPP=/mpp/bin/gpp
#FPPFLAGS=-FP
#FC=f90
#FFLAGS=-dp
#LD=f90
#LDFLAGS=
#DISPOSITION=
#########################################################
######### LINUX with g77 ###########
SYS_INC=/usr/local/pvm3/include
FPP=/lib/cpp
FPPFLAGS=-P -traditional
FC=g77
FFLAGS=-m486 -Wall
LD=g77
LDFLAGS=-L./ -L/usr/local/pvm3/lib/LINUX -lfpvm3 -lgpvm3 -lpvm3
DISPOSITION=cp runit ${HOME}/pvm3/bin/LINUX/
############################################################
######## end of site-peculiar definitions ################
INCLUDES=-I${SYS_INC}

runit : test.o startup.o
	${LD} -o runit test.o startup.o ${LDFLAGS}
	${DISPOSITION}

test.o : test.f
	${FC} ${FFLAGS} -c test.f

startup.o : startup.f
	${FC} ${FFLAGS} ${INCLUDES} -c startup.f
	rm -f startup.f

startup.f : startup.F
	${FPP} ${FPPFLAGS} startup.F > startup.f
=================================================================
      program test
      integer np, mype, tids(0:127), ierr
      np = 4
      call startup(np, mype, tids, ierr)
      print *, 'mype is ', mype, ' of ', np,
     &         ' processes. My tid is ', tids(mype)
      end
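For readers who want a slightly fuller driver than test.f, the following is a rough sketch of a ring exchange built on startup(). It checks the return code as described for "icode" (keeping in mind that the startup() listed here does not yet return negative values), sends each process' PE number to its right-hand neighbor, and then leaves PVM. The program name "ring", the tag MSG_RING, and the bound MAXPROCS are invented for this example. Because it includes fpvm3.h, it would need the include path that the Makefile currently passes only when compiling startup.f.

```fortran
      program ring
      implicit none
      include 'fpvm3.h'
ccc Hypothetical constants chosen for this sketch only
      integer MAXPROCS, MSG_RING
      parameter(MAXPROCS=128, MSG_RING=77)
      integer np, mype, tids(0:MAXPROCS-1), ierr
      integer bufid, info, next, prev, token

      np = 4
      call startup(np, mype, tids, ierr)
      if(ierr .lt. 0) then
         write(0,*) 'startup() failed with code ', ierr
         call pvmfexit(info)
         stop
      endif

ccc Logical neighbors in the ring, using the np returned by startup()
      next = mod(mype+1, np)
      prev = mod(mype-1+np, np)

ccc Send my PE number to the right-hand neighbor
      token = mype
      call pvmfinitsend(PVMDATADEFAULT, bufid)
      call pvmfpack(INTEGER4, token, 1, 1, info)
      call pvmfsend(tids(next), MSG_RING, info)

ccc Receive the left-hand neighbor's PE number
      call pvmfrecv(tids(prev), MSG_RING, bufid)
      call pvmfunpack(INTEGER4, token, 1, 1, info)
      print *, 'PE ', mype, ' received token ', token

      call pvmfexit(info)
      end
```

Every process sends before it receives. This relies on pvmfsend() returning without waiting for the matching receive, so the ring does not deadlock.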
=======================================================================
Don Morton Email: morton@arsc.edu
Visiting Scientist (summer) Voice: (907) 474-5507
Arctic Region Supercomputing Center Fax : (907) 450-8601
University of Alaska
Fairbanks, AK 99775
http://grizzly.cameron.edu/morton.html
=======================================================================
ARSC T3D User Group Met on August 1st
We had a nice turnout yesterday, despite the rain. Users and staff alike are both excited and apprehensive about the T3E. Excited? Of course. This is new technology that promises to solve bigger problems faster. Apprehensive? Of course. This is new technology that promises startup problems, incompatibilities, and a learning curve.
Some Concerns:
- How easily will T3D codes port to the T3E?
- Will there be a T3E version of AVS?
- Will the T3E have a C compiler, or only C++? Users report difficulty compiling public domain and personal C code on various C++ compilers.
- Will existing CRAFT codes port to the T3E? How small a subset is planned? What other implicit programming models will be available?
- What will the file system look like? DMF? CRL?
Some Expectations:
- Improved job scheduling and more immediate access to PEs may be possible, given the T3E's job swapping capability and its lifting of the power-of-two number of PEs per job restriction.
- Turnaround on the Y-MP should improve once separated from the T3D.
- Many users see the T3E as a major upgrade: they are already looking forward to running bigger simulations.
- Everyone looks forward to faster processors, better interconnect bandwidth, and better I/O bandwidth.
Don Morton Migrates South
At any rate, Don plans to start the long drive tomorrow morning. At ARSC, Don's departure is an early sign of approaching winter -- next thing you know, the sandhill cranes will start to fly. We thank Don for his many contributions to ARSC, the T3D User Group, and this Newsletter. We wish him a safe journey, a good year, and look forward to seeing him again next summer.
Current Editors:
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
- Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
- Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
