ARSC T3D Users' Newsletter 107, October 4, 1996
T3D Memory Layout
[ Thanks to Mark Dalton of CRI in Los Alamos for sending this in (I have made a few minor changes). The T3D Newsletter has touched on memory management in several previous issues (see the list at the end), but has not discussed the mppldr directives before. ]
The manpage on 'mppldr' describes four loader directives which can be used to control the T3D memory layout. The manpage description:
defheap    Specifies the minimum heap size and heap increment value
           for all programs.
defsheap   Used in the default directives file to establish a minimum
           heap size and heap increment value for all programs.
defstack   Sets the default program stack size.
defsstk    Used in the default directives file to establish a minimum
           distributed stack size for all programs.
My description:
mppldr:
-i mppldr_opts
where the file "mppldr_opts" contains:
defheap=#    <-- Initial Private Heap
defsheap=#   <-- Initial Shared Heap
defstack=#   <-- Initial Private Stack
defsstk=#    <-- Initial Shared Stack
For example:
cc -X <Num_PEs> -Wl"-i mppldr_opts" file_name.c
Then:
mppsize ./a.out <-- The executable must have a fixed number of PEs.
NOTE: You must fix the number of PEs, or mppsize will not give you useful
information on the actual size of the executable. You can do this
at compile/link time, or with mppfixpe.
The following shows sample output from mppsize (taken from the Cray Research Service Bulletin on mppsize):
*************************************************************************
Startup program is: #!/mpp/bin/mppexec
Number of PEs: 1 (fixed)
OS partition required? no
H/W partition required? no
barriers: 1
eurekas: 0
Shape hint (0,0,0)
Transfer address: 2000000000
-----------------------------------------------------------------------------
Segment Size T S P G A L D Disk Length Offset Zeroed
2000000000 code B1DA0 T N 5 0 N 0 0 570 B1DA0 0 0
A000000000 shared data C50 D Y 6 0 N 0 4C50 CE268 0 0 850
4000000000 private data DE280 D N 6 + N 0 E2280 B2310 1BF58 0 2328
6000000000 private stack C0000 S N 6 - Y 0 C4000 0 0 0 0
8200000000 registers 0 R N 6 0 N 0 0 0 0 0 0
Symbols CE268 23B58 0
Dot o's 0 0 0
Directives 0 0 0
843000 (decimal) bytes will be initialized from disk.
146264 (decimal) bytes are required for the symbol table.
1392 (decimal) bytes are required for the header.
990656 (decimal) bytes total
*************************************************************************
The CRSB on mppsize appears in Newsletter 27, and describes how to interpret this information. Basically, add up the sizes of the code, shared data, private data, and private stack segments. These are the default (initial) sizes for the data and stacks; a program can allocate more as it runs, depending on the code, and some codes will need more of a particular area.
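As a rough worked example of that sum, the sketch below adds the Size values of the four segments in the sample mppsize output above. It assumes the first hexadecimal value after each segment name is the Size column, and a shell whose arithmetic expansion accepts 0x-prefixed constants (ksh and bash do); the variable names are illustrative only.

```shell
# Sizes (hex) copied from the sample mppsize output; assumed to be the Size column.
code=0xB1DA0
shared_data=0xC50
private_data=0xDE280
private_stack=0xC0000

# Add the four segments; shell arithmetic understands 0x-prefixed constants.
total=$(( code + shared_data + private_data + private_stack ))
printf 'sum of segment sizes: %d bytes (0x%X)\n' "$total" "$total"
```

For the sample output this comes to 2428016 bytes (0x250C70), before any heap or stack growth at run time.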
For more information:
- Use 'docview' or 'cdoc' to view the mppldr section of the segldr manual.
- man mppldr, mppsize, mppfixpe, IHPSTAT
- For mppsize: Newsletters 23, 27
- For IHPSTAT: Newsletter 97
- For mppfixpe: Newsletter 101
- For mppldr directives: Newsletter 103
Barriers AND Eurekas Revisited
Two points:
- Barriers on the T3E:
  Frank Chism of CRI tells me that he has run some of his barrier testing codes on the T3E, and they perform as expected. That is, they perform as they do on the T3D.
- A barrier call on a PE invalidates a eureka event set previously on that PE.
The following one-sentence paragraph is from docview, mpp.fortran.62, and appeared (with context) in Newsletter 104:
"Events in eureka mode cannot be posted across barriers."
The manual didn't explain what would happen if one violated this policy, but, as you can imagine, it happened. A reader sent in a test which used barriers and eurekas and seemed to behave inconsistently. It turns out that its barrier call was cancelling an event.
The following program attempts to show this situation. Since it is testing the barrier network, it must avoid using that network for its own synchronization. To keep the PEs in step, it therefore relies on sleep() calls. There may be a better approach, but this seems to work.
Here's what it does:
All PEs clear the eureka event (clear_event() is always required). A barrier call is included in the clear_event() call, so we know we're synched at that point. Then,
- PEs 2 and 3 test for a eureka event
- PE0 sets a eureka event
- PEs 2 and 3 test
- PE1 sets a barrier
- PEs 2 and 3 test
- PE0 sets a barrier
- PEs 2 and 3 test
The output:
=== PE0: before PE0 does set_event
PE2: test_event returns FALSE
PE3: test_event returns FALSE
=== PE0: after PE0 does set_event / before PE1 does set_barrier
PE2: test_event returns TRUE
PE3: test_event returns TRUE
=== PE1: after PE1 does set_barrier
PE3: test_event returns TRUE
PE2: test_event returns TRUE
=== PE0: after PE0 does set_barrier
PE2: test_event returns FALSE
PE3: test_event returns FALSE
The program:
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
void print_test_results (int self);
int main( int argc, char** argv ) {
  int npes,self;
  npes = _num_pes();
  self = _my_pe();
  clear_event(); /* contains barrier */
  if( self == 0 ) {
    printf("=== PE%i: before PE0 does set_event\n",self);
    fflush(stdout);
    sleep (2);
    set_event();
    printf("=== PE%i: after PE0 does set_event / before PE1 does set_barrier\n",self);
    fflush(stdout);
    sleep (2);
    /* PE1 sets its barrier here. */
    sleep (2);
    set_barrier();
    printf("=== PE%i: after PE0 does set_barrier\n",self);
    fflush(stdout);
  }
  else if( self == 1 ) {
    sleep (2);
    /* PE0 does set_event() here */
    sleep (2);
    set_barrier();
    printf("=== PE%i: after PE1 does set_barrier\n",self);
    fflush(stdout);
    sleep (2);
    /* PE0 sets its barrier here */
  }
  else {
    /* Remaining PEs test about 1 second after each step on PE0 and PE1. */
    sleep (1);
    print_test_results (self);
    sleep (2);
    print_test_results (self);
    sleep (2);
    print_test_results (self);
    sleep (2);
    print_test_results (self);
  }
  return 0;
}
void print_test_results (int self) {
/* if( test_barrier() )
     printf("PE%i: test_barrier returns TRUE\n",self);
   else
     printf("PE%i: test_barrier returns FALSE\n",self);
*/
  if( test_event() )
    printf("PE%i: test_event returns TRUE\n",self);
  else
    printf("PE%i: test_event returns FALSE\n",self);
  fflush(stdout);
}
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Quick-Tip Q & A
A: {{ What's an easy way to remove all the 'core' and 'mppcore' files in
any of your directories (but have 'rm' ask before removing)? }}
# This 'find' command starts its search from "." -- the current
# directory. It descends recursively into all subdirectories,
# searching for files named core or mppcore (the "-o" specifies "or",
# and the escaped parentheses are required). When it finds one, it
# executes the command "rm -i {}", substituting the complete path of
# the file for the token, "{}".
find . \( -name core -o -name mppcore \) -exec rm -i {} \;
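Before removing anything, it can help to preview what the expression matches. The following is a minimal sketch, not part of the original tip: it builds a scratch directory (the directory layout and file names are made up for illustration, and mktemp -d is assumed to be available) and runs the same expression with -print in place of the rm action.

```shell
# Build a throwaway tree so the example touches no real files.
scratch=$(mktemp -d)
mkdir -p "$scratch/run1/sub"
touch "$scratch/core" "$scratch/run1/mppcore" \
      "$scratch/run1/sub/core" "$scratch/run1/keep.c"

# Same expression as above, but -print lists the matches instead of removing them.
matches=$(cd "$scratch" && find . \( -name core -o -name mppcore \) -print | sort)
printf '%s\n' "$matches"

# Clean up the scratch tree.
rm -rf "$scratch"
```

Only the three core and mppcore files are listed; keep.c is left out because it matches neither -name test.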
# A reader sent in this:
I use a ksh function in my .profile file:
menage () { find . \( -name core -o -name '*.o' -o -name mppcore -o \
-name '*.l' -o -name '*.T' \) -atime +7 -exec "$@" {} \; ; }
"-fstype nfs" is able to exclude NFS files but does not work on
CRAY. It's a shame, because this function can be dangerous if you
access NFS files you did not want to remove after having forgotten
them.
Usage:
menage ls -l
or
menage rm -i
Q: How do you figure out what version of a library your code has been
loaded with, or what version of a library you are using?
[ Answers, questions, and tips graciously accepted. ]
Current Editors:
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
- Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
- Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
