ARSC T3D Users' Newsletter 22, February 13, 1995
ARSC T3D Upgrades
In the next month we will be upgrading the T3D Programming Environment (libraries, tools and compilers) from P.E. 1.1 to P.E. 1.2.
Users will be notified of the upgrade dates by mailings to the ARSC T3D users' group (i.e., those who receive this newsletter).
Upgrade to the T3D Memory
On February 7th, ARSC upgraded the memory on each PE from 2MWs to 8MWs. If any users have questions about this, please contact Mike Ess.
Upgrade to MAX 1.2
On January 31st, ARSC upgraded to the 1.2 version of MAX, the T3D operating system. If any users notice differences in how their codes run on the T3D, they should notify Mike Ess.
What's in the Libraries
In the directory /mpp/bin there is an MPP utility called nm that prints the object files that are in a library. It performs the same functions as /bin/nm does for Y-MP libraries, and those functions are described in the man page on denali.
In going from P.E. 1.1 to P.E. 1.2, this utility provides a quick way to see what's new in the libraries. First we run:
    /mpp/bin/nm /mpp/lib/lib*.a | grep lib > pe11.objs
Then, after ARSC has upgraded to P.E. 1.2, we run:
    /mpp/bin/nm /mpp/lib/lib*.a | grep lib > pe12.objs
Now a diff between pe11.objs and pe12.objs will show us what's been added in the new release.
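The comparison step could be wrapped in a small shell function. This is only a sketch, assuming the two listings were saved as described above; the name new_objs is hypothetical and not part of the Programming Environment. It keeps just the lines that diff marks as present only in the second file.

```shell
# new_objs OLD NEW: print the lines that diff reports only in NEW.
# (new_objs is an illustrative name, not a P.E. tool.)
new_objs() {
    # diff prefixes lines found only in the second file with "> ";
    # sed prints those lines with the prefix removed and drops the rest.
    diff "$1" "$2" | sed -n 's/^> //p'
}
```

It would be called as "new_objs pe11.objs pe12.objs". Entries that only changed position between the two listings may also be reported, so sorting both files first is worth considering.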
More Speed
Once a T3D application is up and running we always need more speed. One place to look for increased speed is in the libsci routines. Libsci has single PE versions of all of the BLAS 1, 2, and 3 routines and many of the LAPACK routines. However, before we add them to our code expecting a speed improvement, we should time both the code they replace and the routine itself. It may be that calling the routine in libsci is actually slower than the code it replaces. For example, the overhead of the subroutine call to libsci could be as significant as the function performed.
Here is a small program I've used to see if replacing code with a libsci routine will help:
      real a( 2000 ), b( 2000 )
      integer index( 16 )
      data index / 0, 1, 2, 3, 4, 5, 10, 20, 40, 50, 100, 200, 400,
     +             500, 1000, 2000 /
      do 10 i = 1, 2000
        a( i ) = i
        b( i ) = i
   10 continue
      do 100 i = 1, 16
        t1 = second()
        s1 = sdot( index( i ), a, 1, b, 1 )   ! libsci replacement
        t2 = second()
        s2 = 0.0
        do 20 j = 1, index( i )
          s2 = s2 + a( j ) * b( j )           ! code replaced
   20   continue
        t3 = second()
        s3 = snrm2( index( i ), a, 1 )        ! libsci replacement
        t4 = second()
        s4 = 0.0
        do 30 j = 1, index( i )
          s4 = s4 + a( j ) * a( j )           ! code replaced
   30   continue
        s4 = sqrt( s4 )
        t5 = second()
        if( s1 .ne. s2 ) then
          write( 6, 600 ) s1, s2              ! same answers ?
          stop
        endif
        if( s3 .ne. s4 ) then
          write( 6, 601 ) s3, s4              ! same answers ?
          stop
        else
          write( 6, 602 ) i, index( i ), t2-t1,t3-t2,t4-t3,t5-t4
        endif
  100 continue
  600 format( " Error in sdot , found ", f10.2, " should be ", f10.2 )
  601 format( " Error in snrm2, found ", f10.2, " should be ", f10.2 )
  602 format( I3, i8, 4f10.6 )
      end
      real function second( )
      second = dble( irtc( ) ) / 150000000.0
      end
The results for the above program are:
  1       0  0.000009  0.000001  0.000004  0.000007
  2       1  0.000010  0.000002  0.000017  0.000006
  3       2  0.000011  0.000002  0.000017  0.000007
  4       3  0.000011  0.000002  0.000018  0.000007
  5       4  0.000011  0.000002  0.000017  0.000007
  6       5  0.000011  0.000002  0.000017  0.000007
  7      10  0.000012  0.000003  0.000020  0.000008
  8      20  0.000015  0.000004  0.000023  0.000009
  9      40  0.000020  0.000006  0.000027  0.000011
 10      50  0.000020  0.000007  0.000030  0.000011
 11     100  0.000028  0.000020  0.000038  0.000015
 12     200  0.000031  0.000035  0.000055  0.000023
 13     400  0.000043  0.000067  0.000086  0.000038
 14     500  0.000048  0.000083  0.000102  0.000046
 15    1000  0.000079  0.000163  0.000184  0.000084
 16    2000  0.000138  0.000323  0.000426  0.000231
The columns are the loop index, the vector length, and the times in seconds for the sdot call, the Fortran dot product loop, the snrm2 call, and the Fortran norm loop.
In this program we are investigating replacing the code for computing the dot product of two vectors and the Euclidean 2-norm of a vector with calls to the libsci routines sdot and snrm2 (sdot and snrm2 are described in man pages on denali). From the times above it looks like replacing the Fortran code for a dot product with the libsci routine doesn't pay until the vectors are longer than 100 elements or so; in this run the crossover falls between lengths 100 and 200. For snrm2 it doesn't look like using the libsci version will ever pay off. With small test cases like these it's easier to decide which libsci routines will improve the speed of a T3D program.
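One way to read the timings is to turn each pair into a ratio. The sketch below assumes the program's output was saved to a file with the six columns written by format 602; the function name speedups is hypothetical. A ratio above 1 means the libsci call was faster than the Fortran loop for that vector length.

```shell
# speedups FILE: for each row print n, then the ratios
# (dot loop time / sdot time) and (norm loop time / snrm2 time).
# Assumed columns: i, n, sdot time, dot loop time, snrm2 time, norm loop time.
# (speedups is an illustrative name only.)
speedups() {
    # Rows with a zero libsci time are skipped to avoid dividing by zero.
    awk '$3 > 0 && $5 > 0 { printf "%6d %6.2f %6.2f\n", $2, $4 / $3, $6 / $5 }' "$1"
}
```

Applied to the row for length 200 above, this gives ratios of about 1.13 for sdot and 0.42 for snrm2, which matches the reading that sdot has started to pay off at that length while snrm2 has not.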
Reminders
List of Differences Between T3D and Y-MP
The current list of differences between the T3D and the Y-MP is:
- Data type sizes are not the same (Newsletter #5)
- Uninitialized variables are different (Newsletter #6)
- The effect of the -a static compiler switch (Newsletter #7)
- There is no GETENV on the T3D (Newsletter #8)
- Missing routine SMACH on T3D (Newsletter #9)
- Different Arithmetics (Newsletter #9)
- Different clock granularities for gettimeofday (Newsletter #11)
- Restrictions on record length for direct I/O files (Newsletter #19)
- Implied DO loop is not "vectorized" on the T3D (Newsletter #20)
Current Editors:
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
