ARSC T3D Users' Newsletter 105, September 20, 1996
Test of KAI C++ Compiler
[ One of our Research Assistants, Shawn Houston, installed and evaluated the KAI T3D C++ Compiler on ARSC systems. Here's a summary of his report. Send me email, and I'll provide the test programs and makefiles.]
The KAI C++ compiler is actually a set of optimization tools and a front end to the Cray C compiler. The compiler runs under a script, KCC, that controls the compilation process. The script preprocesses, compiles, and links the source files. Next, it decompiles the linked program, reprocesses the result, and relinks. I used the latest release of the compiler, version 2.9, which is not quite a complete implementation of the C++ working paper. I assume that KAI version 3.0 will be a complete implementation.
Uniprocessor Test:
The KAI compiler comes with a test program, "caxpy.C," that performs some complex-valued arithmetic using vectors and times the operations. I compiled the code with both the KAI compiler and the Cray C++ compiler. I ran the resulting executables with identical parameters on one PE.
Results:
For this one vector-oriented code, the KAI executable is faster on all three timed kernels, but it is also considerably larger (about 2.7 times the size of the Cray executable).
KAI executable: 622632 bytes
Cray executable: 228568 bytes
KAI caxpy output times: "caxpy 1000 1000"
Time for caxpy1 = 0.379962 seconds [21.0547 Mflops]
Time for caxpy2 = 0.319968 seconds [25.0025 Mflops]
Time for caxpy3 = 0.440056 seconds [18.1795 Mflops]
Cray CC output times: "caxpy 1000 1000"
Time for caxpy1 = 0.39996 seconds [20.002 Mflops]
Time for caxpy2 = 0.39996 seconds [20.002 Mflops]
Time for caxpy3 = 1.40006 seconds [5.71404 Mflops]
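The reported rates can be cross-checked by hand. The following C++ sketch is not taken from caxpy.C (its source is not reproduced here); it assumes that the two command-line arguments are the vector length and the repetition count, and that each complex element update costs 8 real floating-point operations (6 for the complex multiply, 2 for the complex add). Under those assumptions, 1000 x 1000 updates in 0.379962 seconds works out to the 21.0547 Mflops shown for KAI caxpy1. The names caxpy, mflops, and kFlopsPerElement are illustrative only.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// y <- a*x + y on complex vectors. A hypothetical stand-in for the kind of
// kernel that caxpy.C times; the real test program may be written differently.
void caxpy(cplx a, const std::vector<cplx>& x, std::vector<cplx>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] += a * x[i];
    }
}

// Assumed flop count per element: complex multiply (6) plus complex add (2).
constexpr double kFlopsPerElement = 8.0;

// Mflops for `reps` passes over vectors of length `n` taking `seconds`.
double mflops(std::size_t n, std::size_t reps, double seconds) {
    assert(seconds > 0.0);
    const double flops =
        kFlopsPerElement * static_cast<double>(n) * static_cast<double>(reps);
    return flops / seconds / 1.0e6;
}
```

For example, mflops(1000, 1000, 1.40006) gives roughly 5.714, in line with the Cray caxpy3 figure, which suggests the assumed operation count matches the one used by the test program.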
Multiprocessor Test:
How does the KAI compiler fare in a multiprocessor program? I do not know. I obtained a copy of a small test program from Tom Baring and converted it from C to C++. I compiled it using the Cray CC compiler, and ran the program on 2 and 4 PEs, getting output which agreed with that of the original C code.
Even after several frustrating days, I could not get the KAI compiler to compile this one code. It uses the shmem libraries and some intrinsic functions. In all fairness to KAI, I may have installed the compiler incorrectly, or not understood how to get the KAI compiler to recognize the MPP headers and explicit functions, such as barrier.
KAI C++ Ver 3.0 News Release
Champaign, Illinois - Kuck & Associates, Inc. (KAI(TM)) announces the immediate availability of Version 3.0 of the KAI C++ compiler. This release of KAI C++ includes draft-standard C++ class libraries, support for member templates, and new usability features for large codes. KAI C++ provides the latest C++ features, runs on every major UNIX workstation and the Cray T3D, and delivers significant computational performance enhancements.
KAI C++ Version 3.0 replaces KAI's earlier C++ compiler and now includes the established KAI trademark as a part of the name to ensure that KAI C++ is not confused with any other software presently on the market.
New Features & Improvements
KAI C++ Version 3.0 contains the following new features and improvements:
- Near draft-standard C++ class libraries, not just STL and iostreams
- Support for member templates
- Optimization of template expressions
- Building of libraries that contain template instantiations
Programmer Productivity
The powerful features of KAI C++ make programmers more efficient. The compiler's advanced optimizations allow programmers to take full advantage of object-oriented design and software reuse without worrying about performance. KAI C++ makes objects almost as efficient as hand-coded C. Programmers will spend less time trying to correct performance problems, and instead deliver more code that is intuitive and easy to maintain.
Bruce Leasure, Vice President of Technology and senior product manager for KAI C++, emphasizes that KAI C++ will enable programmers to port code to all of the supported platforms without having to re-write it to conform to a different compiler's implementation of C++ features. This makes KAI C++ essential for anyone who wants fast, cross-platform portability.
Updates & New Floating Licenses
Customers can immediately update any earlier versions to KAI C++ Version 3.0. There is no charge for customers who have an existing support service agreement or for customers who purchased Version 2 after April 1, 1996. Also, KAI C++ is now available with a Floating License on the SPARC Solaris and IBM AIX systems. Please visit the KAI C++ web page for additional information: http://www.kai.com/C_plus_plus/index.html .
Supported Platforms
KAI C++ is the only high-performance C++ compiler that developers can use across all of these development and production systems: Digital Alpha UNIX, HP 9000 UX, IBM RS/6000 AIX, SGI Irix (32 and 64 bit), SPARC-based Solaris 2 and Cray T3D.
About Kuck & Associates, Inc.
Kuck & Associates, Inc. (KAI) is internationally known for its leading-edge optimization software. This software enables developers of C/C++ and FORTRAN programs to exploit the high performance of a broad spectrum of advanced computer architectures. Customers include most of the prominent U.S. computer manufacturers and many compiler companies. These companies either offer our products directly or incorporate our products into their own to bring the latest optimization technology to their end-users.
KAI optimization products are available for personal computers, workstations and supercomputers. Founded in 1979, KAI employs about 35 computer science professionals.
Copyright 1995-1996 by Kuck & Associates, Inc. All rights reserved. KAI and KAI C++ are trademarks of Kuck & Associates, Inc.
Barriers
I thought I'd try timing the barrier function using essentially the same program given last week for eurekas. It won't obtain the actual time for barrier release, as it makes extra calls needed for testing eurekas, but is interesting for comparing eurekas and barriers. A couple of notes about this exercise:
- In addition to the well-known BARRIER subroutine, CRI provides finer control over this means of synchronizing your code, via the functions:

    SET_BARRIER()  - Registers the arrival of a task at a barrier
    WAIT_BARRIER() - Suspends task execution until all tasks arrive at the barrier
    TEST_BARRIER() - Tests a barrier to determine its state (set or cleared)

  The generic BARRIER() call is simply a call to SET_BARRIER() followed immediately by a call to WAIT_BARRIER():

    BARRIER() - Registers the arrival of a task at a barrier and suspends task execution until all other tasks arrive at the barrier

  Using SET, TEST, and WAIT, you might be able to program some PEs to continue doing useful work while waiting for the remaining PEs to reach the barrier:

    call SET_BARRIER()
    do while (.NOT. TEST_BARRIER())
      call do_work ()
    enddo

  There are 'man' pages for these functions.

- My test program, given below, hangs a quarter to half the time. I think the problem is that the barrier calls are too close together, and once in a while a "set" goes undetected at some process's "wait." The traceback (after killing the job) indicates that the job did hang in a barrier call:
  Beginning of Traceback (PE 7):
    Started from address 0x20000c0804 in routine '_sma_deadlock_wait'.
    Called from line 78 (address 0x20000c0a20) in routine 'barrier'.
    Called from line 77 (address 0x20000007e0) in routine 'BARRIER_TIMINGS'.
    Called from line 363 (address 0x2000004628) in routine '$START$'.
  End of Traceback.
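As an aside on the first note above, the set/test/wait split can be illustrated off the T3D with ordinary threads. The C++ sketch below is an analogy only: SplitBarrier and run_demo are hypothetical names, the class is not the CRI library and does not use the T3D barrier hardware, and it returns an explicit "ticket" from set() where the CRI routines keep that state internally. It assumes a thread does not call set() again until its previous ticket has tested true.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical split-phase barrier for n threads. The three methods play the
// roles of SET_BARRIER, TEST_BARRIER and WAIT_BARRIER respectively.
class SplitBarrier {
public:
    explicit SplitBarrier(int n) : n_(n) { assert(n > 0); }

    // Register arrival. The returned ticket names the current barrier episode.
    int set() {
        const int ticket = generation_.load();
        if (arrived_.fetch_add(1) + 1 == n_) {
            arrived_.store(0);         // reset the count for the next episode
            generation_.fetch_add(1);  // then release everyone holding `ticket`
        }
        return ticket;
    }

    // True once every thread has registered for the episode named by ticket.
    bool test(int ticket) const { return generation_.load() != ticket; }

    // Block (politely) until the barrier opens.
    void wait(int ticket) const {
        while (!test(ticket)) std::this_thread::yield();
    }

private:
    const int n_;
    std::atomic<int> arrived_{0};
    std::atomic<int> generation_{0};
};

// Each thread registers arrival, then keeps working until the barrier opens.
// Returns the number of threads that got through.
int run_demo(int n_threads) {
    SplitBarrier barrier(n_threads);
    std::atomic<int> passed{0};
    std::vector<std::thread> threads;
    for (int id = 0; id < n_threads; ++id) {
        threads.emplace_back([&barrier, &passed] {
            const int ticket = barrier.set();
            while (!barrier.test(ticket)) {
                std::this_thread::yield();  // useful work would go here
            }
            passed.fetch_add(1);
        });
    }
    for (auto& t : threads) t.join();
    return passed.load();
}
```

Calling set() followed immediately by wait() gives the behaviour of a plain barrier, mirroring how BARRIER() is described as SET_BARRIER() plus WAIT_BARRIER().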
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      Program barrier_timings
      implicit none
      integer trigger_PE    ! Which PE will now trigger event
      integer mc(128)       ! Array to store system info
      integer MY_PE         ! Intrinsic function to get PE number
      integer mem_event     ! Shared variable for memory-mode event
      real t1               ! Temporary storage of start times
      real t2               ! Temporary storage of end times
      real junk
      real delay_start      ! For simulated work, start of spin
      real irtc             ! Internal function, clock ticks
      real cp               ! Clock period in secs
      logical test_event    ! Internal function
      logical test_barrier  ! Internal function
      intrinsic MY_PE
cdir$ shared mem_event
      call gethmc (mc)
      cp = mc(7) * 1.0e-12  ! convert picosecs to secs.
c
c Time event propagation when using eureka-mode events
c
      if (MY_PE() .EQ. N$PES-1) then
        write (6,1000) "EUREKA-MODE "
        call flush (6)
      endif
      do trigger_PE = 0, N$PES - 1
        call clear_event () ! In Eureka mode, all PEs must clear
        call barrier () ! Make sure all PEs ready to watch for event
        if (MY_PE() .EQ. trigger_PE) then
          ! Kill .1 secs to simulate some work
          delay_start = irtc ()
 5        if (irtc() .LT. delay_start + 0.1 / cp) goto 5
          t1 = irtc ()
          call set_event () ! Trigger event
          call barrier () ! Wait till all PEs detect event
          t2 = irtc ()
          write (6, 1010) MY_PE(), (t2-t1) * cp * 1e6
          call flush (6)
        else
 10       if (.NOT. test_event()) goto 10
          ! Inform triggering PE that 1st barrier release was detected
          call set_barrier ()
        endif
      enddo
c
c Now use barrier
c
      if (MY_PE() .EQ. N$PES-1) then
        write (6,1000) "BARRIER"
        call flush (6)
      endif
      do trigger_PE = 0, N$PES - 1
        call barrier ()
        if (MY_PE() .EQ. trigger_PE) then
          delay_start = irtc ()
 105      if (irtc() .LT. delay_start + 0.1 / cp) goto 105
          t1 = irtc ()
          call set_barrier () ! Trigger release of barrier
          call barrier () ! Wait till all PEs detect release
          t2 = irtc ()
          write (6, 1010) MY_PE(), (t2-t1) * cp * 1e6
          call flush (6)
        else
          call set_barrier () ! All non-trigger PEs pass barrier
          ! Spin until trigger PE does its set_barrier
 110      if (.NOT. test_barrier()) goto 110
          ! Inform triggering PE that 1st barrier release was detected
          call set_barrier ()
        endif
      enddo
 1000 format (a,/,"Event_PE ", " Delay(usecs)")
 1010 format (i4, " ", f6.2)
      end
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
Output from one run on 8 PEs:
EUREKA-MODE
Event_PE  Delay(usecs)
   0  10.04
   1   9.46
   2   9.55
   3  10.17
   4   9.24
   5   9.68
   6  11.28
   7   9.73
BARRIER
Event_PE  Delay(usecs)
   0   6.48
   1   6.37
   2   6.21
   3   6.29
   4   6.30
   5   6.44
   6   6.37
   7   6.53
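To put a single number on each mode, the eight delays from this run can be averaged. The C++ sketch below simply copies the values printed above; mean, kEurekaDelays, and kBarrierDelays are illustrative names. The averages come to roughly 9.9 microseconds for eureka mode and 6.4 microseconds for the barrier. As noted earlier, the program makes extra calls needed for the eureka test, so these figures are useful for comparing the two modes rather than as absolute release times.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of samples.
double mean(const std::vector<double>& samples) {
    assert(!samples.empty());
    const double sum = std::accumulate(samples.begin(), samples.end(), 0.0);
    return sum / static_cast<double>(samples.size());
}

// Delays in microseconds, copied from the single 8-PE run shown above.
const std::vector<double> kEurekaDelays =
    {10.04, 9.46, 9.55, 10.17, 9.24, 9.68, 11.28, 9.73};
const std::vector<double> kBarrierDelays =
    {6.48, 6.37, 6.21, 6.29, 6.30, 6.44, 6.37, 6.53};
```

This is one run on 8 PEs, so the averages say nothing about run-to-run variation.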
Quick-Tip Q & A
A: {{ In ftp, can you "more" a remote file -- before you "get" it? How? }}
# Thanks for reader response.
# To "more" the remote file, 'tst.remote', use either of:
ftp> get tst.remote -
ftp> get tst.remote "|more"
# (Bonus answer!) To "more" the local file, 'tst.local':
ftp> ! more tst.local
Q: If you must look at computers all day (for years), how can you
reduce eye-strain?
[ Answers, questions, and tips graciously accepted. ]
Current Editors:
Ed Kornkven    ARSC HPC Specialist            ph: 907-450-8669
Kate Hedstrom  ARSC Oceanographic Specialist  ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
