ARSC T3D Users' Newsletter 6, September 30, 1994
List of Differences Between T3D and Y-MP
As a T3D user consultant, I usually get calls when something is not working. It usually isn't long into the conversation before the user says: "...but it works on the Y-MP." Of course they are different machines, but sometimes the differences seem to the user to have nothing to do with the T3D being an MPP. I'd like to start a list of differences between the two machines for future reference. I'll start the list with two items:
- Data type sizes are not the same (Newsletter #5)
- Uninitialized variables are different (this Newsletter (#6))
Uninitialized Variables are Different
This comes up in many different forms, but it can be illustrated with a simple program:
print *, a
end
The Y-MP prints 0. and the T3D prints NaN. There are simple solutions, and sometimes warnings from the compilers, but it catches a lot of users moving code from the Y-MP to the T3D.
Code Sizes on the T3D
A while ago a user asked: "What is the size of the O.S. on the T3D?" I haven't been able to find an official answer, but I think I have the answer to a related question. The memory on an ARSC T3D PE is 16MB, so knowing the size of the O.S. we could estimate the amount of memory available to users as:
size of largest user program = 16MB - size of O.S.
By experiment, we can find a better estimate of the size of the largest possible user program. The easiest place to start is with something as simple as a single PE and a single global array. This is the "Private" model of PE memory configuration. The test program (base.c) is:
#include <stdio.h>
static int a[ XXX ];
int main( void )
{
  int xxx;
  xxx = XXX;
  printf( "total static memory = %lu bytes on %d ints\n", (unsigned long)( sizeof(int)*XXX ), xxx );
  return 0;
}
and is executed on the T3D with a shell script such as the following:
#csh
foreach SIZE ( 1 1024 2048 4096 1480000 1520000 1560000 1600000 )
echo $SIZE ;\
sed "s/XXX/$SIZE/" base.c > try.c;\
/mpp/bin/cc -X 1 -c try.c ;\
/mpp/bin/mppldr try.o ;\
mppsize a.out ;\
memsize a.out ;\
a.out
end
We can collect a lot of data quickly and tie together the relations between the following:
- The size of the user's array
- The output from mppsize
- The output from memsize
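With XXX = 1, the generated program (try.c) is:
#include <stdio.h>
static int a[ 1 ];
int main( void )
{
  int xxx;
  xxx = 1;
  printf( "total static memory = %lu bytes on %d ints\n", (unsigned long)( sizeof(int)*1 ), xxx );
  return 0;
}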
(I can think of smaller programs, but probably none are very useful programs.)
relevant output from mppsize:
154896 (decimal) bytes will be initialized from disk.
28432 (decimal) bytes are required for the symbol table.
1392 (decimal) bytes are required for the header.
184720 (decimal) bytes total
output from memsize:
Memory size is 1734008 bytes
output from user program:
total static memory = 8 bytes on 1 ints
So we might say that the space available to the user on the T3D is
16MB - smallest possible program = 16000000 - 1734008 = 14,265,992 bytes
This is close, but we have a more accurate estimate below.
Increasing XXX doesn't show a change until XXX = 5, when we have:
relevant output from mppsize:
154928 (decimal) bytes will be initialized from disk.
28432 (decimal) bytes are required for the symbol table.
1392 (decimal) bytes are required for the header.
184752 (decimal) bytes total
So we know:
- The array a[ ] is part of the space initialized from disk
- The space from disk is allocated in 32 byte blocks
If the array is made too large, the program will not run and we get the message:
mppexec: application too large or bad a.out
So now, using the shell script above, we can hunt for how large we can make the array a[] before we get this error message. Currently at ARSC we get:
1541000
Memory size is 14061976 bytes
total static memory = 12328000 bytes on 1541000 ints
1542000
Memory size is 14069976 bytes
mppexec: application too large or bad a.out
So a user could have an array of 12.3MB (and little else) on a 16MB PE at ARSC. If we assume that the total 16MB is just the sum of the O.S. and the user's program, then the O.S. has a size of about 3.5MB. This isn't as useful as just saying that on a single PE, about 12MB of memory is available to the user.
This "experimental" method has the advantage of being up to date as soon as it is run. With a change to the T3D MAX operating system, the size of the operating system will probably increase, and users can immediately find out by how much by comparing experimental runs from before and after the change.
There are lots of other questions that can be answered with estimates like this and we will try these in the next few newsletters.
mppldr will issue a warning when the program being linked is too large for the T3D. Here is an example from my LAPACK efforts:
/mpp/bin/mppldr -f zeros aladhd.o alaerh.o alaesm.o alahd.o alareq.o alasum.o alasvm.o chkxer.o ilaenv.o xlaenv.o xerbla.o snrm2.o cptsvx.o scnrm2.o slaord.o cchkaa.o cchkeq.o cchkgb.o cchkge.o cchkgt.o cchkhe.o cchkhp.o cchklq.o cchkpb.o cchkpo.o cchkpp.o cchkpt.o cchkql.o cchkqp.o cchkqr.o cchkrq.o cchksp.o cchksy.o cchktb.o cchktp.o cchktr.o cchktz.o cdrvgb.o cdrvge.o cdrvgt.o cdrvhe.o cdrvhp.o cdrvls.o cdrvpb.o cdrvpo.o cdrvpp.o cdrvpt.o cdrvsp.o cdrvsy.o cerrge.o cerrgt.o cerrhe.o cerrlq.o cerrls.o cerrpo.o cerrql.o cerrqp.o cerrqr.o cerrrq.o cerrsy.o cerrtr.o cerrtz.o cerrvx.o cgbt01.o cgbt02.o cgbt05.o cgelqs.o cgeqls.o cgeqrs.o cgerqs.o cget01.o cget02.o cget03.o cget04.o cget07.o cgtt01.o cgtt02.o cgtt05.o chet01.o chpt01.o claptm.o clarhs.o clatb4.o clatsp.o clatsy.o clattb.o clattp.o clattr.o clavhe.o clavhp.o clavsp.o clavsy.o clqt01.o clqt02.o clqt03.o cpbt01.o cpbt02.o cpbt05.o cpot01.o cpot02.o cpot03.o cpot05.o cppt01.o cppt02.o cppt03.o cppt05.o cptt01.o cptt02.o cptt05.o cqlt01.o cqlt02.o cqlt03.o cqpt01.o cqrt01.o cqrt02.o cqrt03.o cqrt11.o cqrt12.o cqrt13.o cqrt14.o cqrt15.o cqrt16.o cqrt17.o crqt01.o crqt02.o crqt03.o cspt01.o csbmv.o cspt02.o cspt03.o csyt01.o csyt02.o csyt03.o ctbt02.o ctbt03.o ctbt05.o ctbt06.o ctpt01.o ctpt02.o ctpt03.o ctpt05.o ctpt06.o ctrt01.o ctrt02.o ctrt03.o ctrt05.o ctrt06.o ctzt01.o ctzt02.o sget06.o ../../tmglib.a ../../lapack.a -o ../xlintstc
...
mppldr-331 mppldr: WARNING
DEX expression 9 module 'ALADHD' calculated a relative branch target too distant.
The last symbol referenced was '_fcd_blank'.
...
Then using the "explain" facility we get the following explanation:
explain mppldr331
DEX expression n in module 'name' calculated a relative branch target too distant.
...
Your program is too large. At this time there is no specific solution. Contact your system support staff.
If you get this message and you contact me, as the T3D consultant, I'll ask you if you can make the program smaller, because I don't know any other fix! And I don't believe there is any other fix.
Next week we'll explore further the limits on code and data for the T3D.
Current Editors:
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
- Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
- Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
