ARSC T3D Users' Newsletter 4, September 16, 1994
Data Size Types
The combination of the Cray Y-MP and Cray T3D has to be a textbook case of 'heterogeneous' computing. Even though both machines are based on '64-bit processors', the differences in data types are very 'hetero'. Here are some differences I know of; users are welcome to send in their own:
- C data types
- Fortran data types
- Floating point formats
C Data Types
Rather than search the manuals, it's sometimes easier to make up a test case and just run it. Recently I tried out:
    #include <stdio.h>

    int main(void)
    {
        short tshort;
        int tint;
        long tlong;
        float tfloat;
        double tdouble;

        tshort = tint = tlong = 1;
        tfloat = tdouble = 1.0;
        /* sizeof yields an unsigned type, so cast to int to match %d */
        printf( "sizeof(tshort) is \t%d bytes\n", (int) sizeof( tshort ) );
        printf( "sizeof(tint) is \t%d bytes\n", (int) sizeof( tint ) );
        printf( "sizeof(tlong) is \t%d bytes\n", (int) sizeof( tlong ) );
        printf( "sizeof(tfloat) is \t%d bytes\n", (int) sizeof( tfloat ) );
        printf( "sizeof(tdouble) is \t%d bytes\n", (int) sizeof( tdouble ) );
        return 0;
    }
On the Y-MP we get:
    sizeof(tshort) is 8 bytes
    sizeof(tint) is 8 bytes
    sizeof(tlong) is 8 bytes
    sizeof(tfloat) is 8 bytes
    sizeof(tdouble) is 8 bytes
On the T3D we get:
    sizeof(tshort) is 4 bytes
    sizeof(tint) is 8 bytes
    sizeof(tlong) is 8 bytes
    sizeof(tfloat) is 4 bytes
    sizeof(tdouble) is 8 bytes
The two data types that differ, short and float, have to be handled differently on each machine. One of our users had a C program calling Fortran that worked on the M98 (the Y-MP) but failed on the T3D. The problem was traced to a float on the C side being passed to the Fortran side, where the subroutine naturally expected a 64-bit quantity. Copying the float to a temporary double in C and passing that variable solved the problem.
(Looking at the C environment on the Y-MP can take the wind out of any assumptions a C programmer might have about data type sizes.)
Fortran Data Types
Again, I used a program to interrogate the machine itself. On the T3D I used:
c     real*2 r2                 ! not allowed
      real*4 r4
      real*8 r8
c     real*16 r16               ! not yet implemented
c     double precision d        ! not yet implemented
      real r
      r4 = 2.0
      r8 = 2.0
      r = 2.0
      do 10 i = 1, 1026
         r4 = r4 / 2
         r8 = r8 / 2
         r = r / 2
         print *, i, r4, r8, r
   10 continue
      end
The statements that are commented out were rejected by the compiler; the trailing comments summarize its messages. The output looks like:
    1, 1., 1., 1.
    2, 0.5, 0.5, 0.5
    3, 0.25, 0.25, 0.25
    4, 0.125, 0.125, 0.125
    5, 6.25E-2, 6.25E-2, 6.25E-2
    6, 3.125E-2, 3.125E-2, 3.125E-2
    7, 1.5625E-2, 1.5625E-2, 1.5625E-2
    8, 7.8125E-3, 7.8125E-3, 7.8125E-3
    9, 3.90625E-3, 3.90625E-3, 3.90625E-3
    10, 1.953125E-3, 1.953125E-3, 1.953125E-3
    ...
    1022, 4.45014771701440277E-308, 4.45014771701440277E-308, 4.45014771701440277E-308
    1023, 2.22507385850720138E-308, 2.22507385850720138E-308, 2.22507385850720138E-308
    1024, 0., 0., 0.
It is surprising that real*4 is accepted by the compiler without a warning message, but then implemented the same as real*8 and real. An independent check confirmed that real*4 uses as much storage as real (i.e., 64 bits).
On the M98, the final test program looked like:
c     real*2 r2                 ! not allowed
      real*4 r4
      real*8 r8
      real*16 r16
      double precision d
      real r
      r4 = 2.0
      r8 = 2.0
      r = 2.0
      r16 = 2.d0
      d = 2.d0
      do 10 i = 1, 30000
         r4 = r4 / 2
         r8 = r8 / 2
         r = r / 2
         r16 = r16 / 2
         d = d / 2
         print *, i, r4, r8, r, r16, d
   10 continue
      end
and the sample output like:
    1, 3*1., 2*1.E+0
    2, 3*0.5, 2*5.E-1
    3, 3*0.25, 2*2.5E-1
    4, 3*0.125, 2*1.25E-1
    5, 3*6.25E-2, 2*6.25E-2
    6, 3*3.125E-2, 2*3.125E-2
    7, 3*1.5625E-2, 2*1.5625E-2
    8, 3*7.8125E-3, 2*7.8125E-3
    9, 3*3.90625E-3, 2*3.90625E-3
    10, 3*1.953125E-3, 2*1.953125E-3
    ...
    8189, 3*1.4668830940439E-2465, 2*1.4668830940438777324971299136E-2465
    8190, 3*7.3344154702194E-2466, 2*7.3344154702193886624856495682E-2466
    8191, 3*0., 2*0.
(List-directed output abbreviates repeated values, so 3*1. stands for three consecutive values of 1.) So on the Y-MP both the 64-bit and 128-bit reals are implemented, and real*4 is implemented in 64 bits.
Floating Point Formats
From the behavior of the loops in the above test programs we see that the floating point formats are different. In future newsletters I'll go into the details of these differences.
LAPACK on the T3D
A user of the ARSC T3D asked about the status of LAPACK on the T3D, because not all of the LAPACK routines are in the library /mpp/lib/libsci.a. A guess can be made of which routines are in /mpp/lib/libsci.a by doing:
    ar -t /mpp/lib/libsci.a | sort
Comparing this list to the list of routines in the public domain version of LAPACK (source available by anonymous ftp from netlib2.cs.utk.edu), we get an explicit list of routines that are in the public domain version but missing from /mpp/lib/libsci.a:
    cbdsqr.o cgbtf2.o cgebak.o cgebal.o cgebd2.o cgebrd.o cgees.o cgeesx.o
    cgeev.o cgeevx.o cgegs.o cgegv.o cgehd2.o cgehrd.o cgelq2.o cgels.o
    cgelss.o cgelsx.o cgeql2.o cgeqpf.o cgeqr2.o cgerq2.o cgesvd.o cgetf2.o
    cggbak.o cggbal.o cggglm.o cgghrd.o cgglse.o cggqrf.o cggrqf.o cggsvd.o
    cggsvp.o chbev.o chbevx.o chbtrd.o cheev.o cheevx.o chegs2.o chegst.o
    chegv.o chetd2.o chetf2.o chetrd.o chgeqz.o chpev.o chpevx.o chpgst.o
    chpgv.o chptrd.o chsein.o chseqr.o clabrd.o clacgv.o clacon.o clacpy.o
    clacrt.o cladiv.o claein.o claesy.o claev2.o clags2.o clagtm.o clahef.o
    clahqr.o clahrd.o claic1.o clangb.o clange.o clangt.o clanhb.o clanhe.o
    clanhp.o clanhs.o clanht.o clansb.o clansp.o clansy.o clantb.o clantp.o
    clantr.o clapll.o clapmt.o claqgb.o claqge.o claqsb.o claqsp.o claqsy.o
    clar2v.o clarfx.o clargv.o clartg.o clartv.o clascl.o claset.o clasr.o
    classq.o claswp.o clasyf.o clatbs.o clatps.o clatrd.o clatrs.o clatzm.o
    clauu2.o clauum.o clazro.o cpbtf2.o cpotf2.o cpteqr.o csrscl.o cstein.o
    csteqr.o csytf2.o ctgevc.o ctgsja.o ctrevc.o ctrexc.o ctrsen.o ctrsna.o
    ctrsyl.o ctrti2.o ctzrqf.o cung2l.o cung2r.o cungbr.o cunghr.o cungl2.o
    cungr2.o cungtr.o cunm2l.o cunm2r.o cunmbr.o cunmhr.o cunml2.o cunmr2.o
    cunmtr.o cupgtr.o cupmtr.o icmax1.o lsamen.o sbdsqr.o scsum1.o sgbtf2.o
    sgebak.o sgebal.o sgebd2.o sgebrd.o sgees.o sgeesx.o sgeev.o sgeevx.o
    sgegs.o sgegv.o sgehd2.o sgehrd.o sgelq2.o sgels.o sgelss.o sgelsx.o
    sgeql2.o sgeqpf.o sgeqr2.o sgerq2.o sgesvd.o sgetf2.o sggbak.o sggbal.o
    sggglm.o sgghrd.o sgglse.o sggqrf.o sggrqf.o sggsvd.o sggsvp.o shgeqz.o
    shsein.o shseqr.o slabad.o slabrd.o slacon.o slacpy.o sladiv.o slae2.o
    slaebz.o slaein.o slaev2.o slaexc.o slag2.o slags2.o slagtf.o slagtm.o
    slagts.o slahqr.o slahrd.o slaic1.o slaln2.o slangb.o slange.o slangt.o
    slanhs.o slansb.o slansp.o slanst.o slansy.o slantb.o slantp.o slantr.o
    slanv2.o slapll.o slapmt.o slapy2.o slapy3.o slaqgb.o slaqge.o slaqsb.o
    slaqsp.o slaqsy.o slaqtr.o slar2v.o slarfx.o slargv.o slartg.o slartv.o
    slaruv.o slas2.o slascl.o slaset.o slasr.o slassq.o slasv2.o slaswp.o
    slasy2.o slasyf.o slatbs.o slatps.o slatrd.o slatrs.o slatzm.o slauu2.o
    slauum.o slazro.o sopgtr.o sopmtr.o sorg2l.o sorg2r.o sorgbr.o sorghr.o
    sorgl2.o sorgr2.o sorgtr.o sorm2l.o sorm2r.o sormbr.o sormhr.o sorml2.o
    sormr2.o sormtr.o spbtf2.o spotf2.o spteqr.o srscl.o ssbev.o ssbevx.o
    ssbtrd.o sspev.o sspevx.o sspgst.o sspgv.o ssptrd.o sstebz.o sstein.o
    ssteqr.o ssterf.o sstev.o sstevx.o ssyev.o ssyevx.o ssygs2.o ssygst.o
    ssygv.o ssytd2.o ssytf2.o ssytrd.o stgevc.o stgsja.o strevc.o strexc.o
    strsen.o strsna.o strsyl.o strti2.o stzrqf.o
(On denali, the /lib/libsci.a is not in a format that allows "ar" to extract the deck names.)
To provide a work-around for this situation, I compiled the public domain sources of lapack and have placed a T3D version of the lapack library in:
    /user/local/examples/mpp/lib/lapack.a
I am running the extensive testing and timing programs provided in the public domain distribution and the results are correct. ...
Users can now link their program as follows:
    mppldr ... /mpp/lib/libsci.a /user/local/examples/mpp/lib/lapack.a
This ensures that the available libsci lapack routines are used before the public domain lapack routines. Only the single precision (64 bit) and complex versions are available because the cf77 compiler doesn't support double precision or double complex.
>>> An Error in the Public Domain Source for LAPACK <<<
The /mpp/bin/cf77 compiler discovered an error in the routines SLAGS2 and CLAGS2 that was easy to correct. The T3D compilers by default flag uses of a variable that has not been set. (Neither cf77 nor cft77 on the M98 detected these errors.) I added the obviously missing lines and put the corrected versions in the T3D lapack.a.
These routines are in the list of LAPACK routines missing from /mpp/lib/libsci.a.
Next week we'll finish the calling C from Fortran series and start looking at program sizes.
Current Editors:
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
- Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
- Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
