ARSC HPC Users' Newsletter 257, November 1, 2002
SX-6 Cross-Compilers Available on rimegate
SX-6 users may now compile and link codes for the SX-6 on the front-end SGI Octane2 host, rimegate. We've also installed NEC's X Window System programming environment, "psuite". From "man psuite":
PSUITE provides an integrated GUI-based environment, which supports the whole program-development cycle consisting of editing, compiling, running, and performance tuning of the Fortran90, C and C++ program.
man pages are available on rimegate, and NEC on-line documents are available at ARSC's web site for the following products, and more:
sxf90 sxc++ sxld psuite
We have just installed the cross-compilers, and will update our regular documentation as soon as possible. Watch news items on rimegate for details.
The Size of IBM XLF Fortran Reals
[ Thanks to Kate Hedstrom of ARSC. ]
On the Cray systems, the default "real" is 8 bytes in size, while the IBM "real" is 4 bytes. When compiling a code on the IBM, one might use a compile-time option that promotes "real" to "real*8".
The IBM compiler happens to have several of these promotion options, so let's investigate what the differences are.
We need a code that will provide some hint about the size of the variables. This is derived from one by Andrew Vaught that was posted to the g95 list:
types.f:
      real :: a
      double precision :: b
      real(kind=4) :: c
      real(kind=8) :: d
      real(kind=kind(0.0)) :: e
      real(kind=kind(0.0d0)) :: f
! g is never declared, so it is implicitly typed as default real
      a = 1.1; b = 1.1; c = 1.1; d = 1.1; e = 1.1; f = 1.1; g = 1.1
      print *, a
      print *, b
      print *, c
      print *, d
      print *, e
      print *, f
      print *, g
      end
Default compilation:
% xlf90 types.f
% ./a.out
 1.100000024
 1.10000002384185791
 1.100000024
 1.10000002384185791
 1.100000024
 1.10000002384185791
 1.100000024
As you can see, variables b, d, and f are double precision while the rest are single precision. All instances of the literal 1.1 are single precision.
xlf90 -qdpc[=e]:
% xlf90 -qdpc types.f
% ./a.out
 1.100000024
 1.10000000000000009
 1.100000024
 1.10000000000000009
 1.10000000000000009
 1.10000000000000009
 1.100000024
This promotes all literals, so both 1.1 and kind(0.0) become real*8. The optional "=e" allows the promotion of 1.1e8 to 1.1d8 as well although our code doesn't test this.
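Where editing the source is an option, a similar effect can be had for an individual literal by giving it a kind suffix, with no compiler flag involved. The following is a minimal sketch and not part of the original test code; the names dp, from_single, and from_double are made up for illustration, and the code is free-form source (for example, a .f90 file).

```fortran
program literal_kinds
  implicit none
  integer, parameter :: dp = kind(0.0d0)   ! kind of double precision
  real(dp) :: from_single, from_double

  from_single = 1.1      ! default-real literal: rounded to single first
  from_double = 1.1_dp   ! literal is double precision from the start

  print '(a,f20.17)', 'b = 1.1    ->', from_single
  print '(a,f20.17)', 'b = 1.1_dp ->', from_double
end program literal_kinds
```

On an IEEE system with a 4-byte default real and no promotion options, the first line should end in 1.10000002384185791 and the second in 1.10000000000000009, matching the two double precision values seen in the runs above.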
xlf90 -qautodbl=<setting>:
% xlf90 -qautodbl=dbl4 types.f
% ./a.out
 1.10000000000000009
 1.10000000000000009
 1.10000000000000009
 1.10000000000000009
 1.10000000000000009
 1.10000000000000009
 1.10000000000000009
% xlf90 -qautodbl=dbl8 types.f
% ./a.out
 1.100000024
 1.1000000238418579101562500000000000
 1.100000024
 1.1000000238418579101562500000000000
 1.100000024
 1.1000000238418579101562500000000000
 1.100000024
% xlf90 -qautodbl=dbl types.f
% ./a.out
 1.10000000000000009
 1.1000000000000000888178419700125232
 1.10000000000000009
 1.1000000000000000888178419700125232
 1.10000000000000009
 1.1000000000000000888178419700125232
 1.10000000000000009
This option promotes all 4-byte reals to real*8 (dbl4), all 8-byte reals to real*16 (dbl8), or both (dbl).
xlf90 -qrealsize=8:
% xlf90 -qrealsize=8 types.f
% ./a.out
 1.10000000000000009
 1.1000000000000000888178419700125232
 1.100000024
 1.10000000000000009
 1.10000000000000009
 1.1000000000000000888178419700125232
 1.10000000000000009
This option is quite interesting, promoting "real" and "double precision", but not "real*4" and "real*8". It also promotes literal constants such as 1.1.
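Reading sizes off the printed digits works, but the kind and precision can also be queried directly. The sketch below is an assumption-laden variant of types.f rather than the code used for the runs above: it is free-form source, the program name type_report is invented, and the kind numbers 4 and 8 are compiler-specific values that happen to work with xlf and many other compilers.

```fortran
program type_report
  implicit none
  real                   :: a
  double precision       :: b
  real(kind=4)           :: c   ! kind numbers 4 and 8 are not portable
  real(kind=8)           :: d
  real(kind=kind(0.0))   :: e
  real(kind=kind(0.0d0)) :: f

  ! KIND and PRECISION are inquiry functions: the variables need no values.
  print '(a,i3,a,i3)', 'real                   kind', kind(a), '  digits', precision(a)
  print '(a,i3,a,i3)', 'double precision       kind', kind(b), '  digits', precision(b)
  print '(a,i3,a,i3)', 'real(kind=4)           kind', kind(c), '  digits', precision(c)
  print '(a,i3,a,i3)', 'real(kind=8)           kind', kind(d), '  digits', precision(d)
  print '(a,i3,a,i3)', 'real(kind=kind(0.0))   kind', kind(e), '  digits', precision(e)
  print '(a,i3,a,i3)', 'real(kind=kind(0.0d0)) kind', kind(f), '  digits', precision(f)
  print '(a,i3,a,i3)', 'literal 1.1            kind', kind(1.1), '  digits', precision(1.1)
end program type_report
```

Compiling this once with each promotion option should show at a glance which declarations and literals that option changes.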
We hope this helps you determine which option is right for you.
SELECTED_REAL_KIND, Caution for Portability
The SELECTED_REAL_KIND function is useful for writing portable codes, when you know the minimum precision and range required of variables. Using it consistently can help eliminate the need for real type promotion, as discussed in the previous article.
That said, care is still required, as a user porting a code to the Cray SV1ex discovered this week. The problem: while IEEE 8-byte reals provide 15 digits of precision, Cray 8-byte reals only provide 13.
The code didn't actually need 15 digits of precision, but had made this a requirement anyway, using "SELECTED_REAL_KIND(15,307)". To ensure 15 digits of precision, the Cray compiler was forced to use a 16-byte representation for the variables in question.
Cray 128-bit (16-byte) math is performed in software, and for performance, should be avoided unless necessary.
The solution was to redefine the kind parameter as follows:
"SELECTED_REAL_KIND(13,307)"
This guarantees at least 13 digits of precision wherever the code runs, which was adequate, and it yields 8-byte reals on both IEEE and Cray PVP platforms.
DETAILS:
Excerpt from "man SELECTED_REAL_KIND":
NAME
SELECTED_REAL_KIND - Returns the real kind type parameter
SYNOPSIS
SELECTED_REAL_KIND ([[P=]p] [,[R=]r])
DESCRIPTION
The SELECTED_REAL_KIND intrinsic function returns the real kind
type parameter of a real data type with decimal precision of at
least p digits and a decimal exponent range of at least r. At
least one argument must be present, and the arguments are as
follows:
p Must be scalar and of type integer.
r Must be scalar and of type integer.
Test Code:
This is based on the user code. It also borrows from the previous article and from the detailed article on Fortran "Range and Precision" in issue #231.
MODULE mod_kinds
  implicit none
  integer, parameter :: kind_13x307 = selected_real_kind(13, 307)
  integer, parameter :: kind_15x307 = selected_real_kind(15, 307)
END MODULE mod_kinds

MODULE mod_scalars
  USE mod_kinds
  implicit none
  real(kind_13x307) :: t1
  real(kind_15x307) :: t2
END MODULE mod_scalars

PROGRAM test
  USE mod_kinds
  USE mod_scalars
  implicit none
  t1 = 1.1_kind_13x307
  t2 = 1.1_kind_15x307
  print*
  print*,"selected_real_kind(13, 307): t1= ", t1
  print*," kind:",kind(t1)
  print*," range:",range(t1)
  print*," precision:",precision(t1)
  print*
  print*,"selected_real_kind(15, 307): t2= ", t2
  print*," kind:",kind(t2)
  print*," range:",range(t2)
  print*," precision:",precision(t2)
END PROGRAM test
Cray SV1ex Output:
selected_real_kind(13, 307): t1= 1.100000000000001
 kind: 8
 range: 2465
 precision: 13

selected_real_kind(15, 307): t2= 1.10000000000000000000000000001E+0
 kind: 16
 range: 2465
 precision: 28
Cray (NEC) SX-6:
selected_real_kind(13, 307): t1= 1.100000000000000
 kind: 8
 range: 307
 precision: 15

selected_real_kind(15, 307): t2= 1.100000000000000
 kind: 8
 range: 307
 precision: 15
IBM p690:
selected_real_kind(13, 307): t1= 1.10000000000000009
 kind: 8
 range: 307
 precision: 15

selected_real_kind(15, 307): t2= 1.10000000000000009
 kind: 8
 range: 307
 precision: 15
Again, the conclusion is that you should determine and specify the precision and range required, and not simply default to 15 and 307.
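Before settling on a requirement, it can help to see what kind each candidate request maps to on the target system. The following is only a sketch with invented names (kind_survey, k6, k13, k15, kbad). The request for 1000 digits is deliberately unsatisfiable, to show that SELECTED_REAL_KIND reports an unavailable precision with a negative value rather than an error.

```fortran
program kind_survey
  implicit none
  integer, parameter :: k6   = selected_real_kind(6, 37)
  integer, parameter :: k13  = selected_real_kind(13, 307)
  integer, parameter :: k15  = selected_real_kind(15, 307)
  integer, parameter :: kbad = selected_real_kind(p=1000)  ! not expected to exist

  print '(a,i3)', 'p= 6, r= 37 -> kind', k6
  print '(a,i3)', 'p=13, r=307 -> kind', k13
  print '(a,i3)', 'p=15, r=307 -> kind', k15
  print '(a,i3)', 'p=1000      -> kind', kbad

  ! The case that caught the user on the SV1ex: two extra digits, wider type.
  if (k13 /= k15) then
    print '(a)', 'Asking for 15 digits selects a wider type than asking for 13.'
  else
    print '(a)', 'Requests for 13 and 15 digits map to the same type here.'
  end if
end program kind_survey
```

Judging from the outputs above, the SV1ex would be expected to take the first branch and the IEEE systems the second.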
PE 3.6 Available for Testing on Crays
Programming Environment 3.6 (PE 3.6) has been installed on both yukon and chilkoot. It is available as PrgEnv.new, and we encourage users to test their codes with it and let us know of changes in performance or if problems are discovered.
Execute this command prior to compiling your code, to test PE3.6:
module switch PrgEnv PrgEnv.new
This release of craylibs fixes the problem in CTRSM noted in issue #243. It also changes the behavior of cpp slightly, as noted in the following article.
What follows are salient features of the new PE, as taken from Cray's release notes. Enjoy!
######################################################################
- 2.1.1. C Interoperability
C interoperability allows C programs to share functions and global variables that have external linkage with Fortran programs; likewise, Fortran programs can share like objects with C programs. More importantly, C interoperability provides a standard portable interoperability mechanism between Fortran and C programs. Refer to Fortran Language Reference Manual, Volume 2 for detailed information about C interoperability.
- 2.2.1. Optimization Enhancements
New optimization features for inlining and cloning take advantage of constant actual arguments. Another optimization feature lets the user select the file containing the code to inline.
The new inlining feature attempts to inline functions at call sites that contain constant actual arguments. This functionality is assigned to -O inline4. Therefore, aggressive inlining is now activated by -O inline5.
- 2.2.3. New Compiler Command
The ftn command is the new default Cray Fortran Compiler command on the Cray SV1 series and Cray T3E systems and has the same functionality as the f90 command. (The f90 command remains available.) Refer to the Cray Fortran Compiler Commands and Directives Reference Manual or the ftn(1) or ftn(1m) man page for more information.
The command name f90 was changed because it no longer accurately reflects the implemented Fortran standards.
- 2.3.1. Support of C++98 Standard
The Cray Standard C++ compiler now supports the C++98 standard (ISO/IEC FDIS 14882:1998). A few features of the C++ standard are not currently supported by the Cray Standard C++ compiler. These are identified in the Cray Standard C/C++ Reference Manual.
- 2.3.2. Support of C99 Standard
The Cray Standard C compiler now supports the C99 standard (ISO/IEC 9899:1999). The compiler also continues to support language features that were extensions prior to the adoption of the C99 standard including the complex type, VLAs, the restrict keyword, and hexadecimal floating point constants. The behavior of these features remains unchanged. Some language features are defined differently in the C99 standard than in the Cray extension. The C99 standard defined behavior is available using the -h c99 command line option, while preserving the previous behavior. Refer to Section 2.4.2 for more information about this option.
- 2.4. Cray Standard C/C++ Compiler Enhancements
Changes to the Cray Standard C/C++ compiler include enhancements for dealing with the standard template library (STL), conforming to the C99 standard, and displaying currently used optimization options.
Refer to Cray Standard C/C++ Reference Manual or the CC(1) man page for more information.
- 2.4.1. New Compiler Options for the Cray C++ Standard Template Library
C++ code that defined templates using nonstandard STL from the Cray C++ STL can now compile successfully when you use the -h conform compiler option with new compiler options -h [no]parse_templates and -h [no]dep_name.
A new compiler option, -h one_instantiation_per_object, allows you to instantiate each template referenced in the source into its own object file rather than merging them together into one file. Another new compiler option, -h instantiation_dir, allows you to select the directory to contain these files.
- 2.5. Data Transfer Enhancements for Cray SV1ex Systems
Fortran and C/C++ programs developed on Cray SV1ex systems can now use the new bte_move intrinsic command for faster data transfers (about 50 times faster).
- 2.6. Vector Optimization (UNICOS Systems Only)
A new infinite vector length optimization feature was added to the Cray Fortran Compiler and Cray Standard C/C++ compilers.
For the Cray Fortran Programming Environment, the vector optimization feature consists of a new INFINITEVL clause for the IVDEP directive and a new -O [no]infinitevl compiler option. The INFINITEVL specifies an infinite safe vector length. That is, no dependency will occur at any vector length.
- 3.3. ftnlx Replaces ftnlint and ftnlist
(Not) Using cpp to Pre-Process Fortran Code
As noted in Newsletter #173, you're ill-advised to use cpp to pre-process Fortran code. Again, we recommend you use Fortran pre-processors on Fortran code. The issue has reappeared because of a behavior change in Cray's cpp between PE3.5 and PE3.6.
The Fortran pre-processor syntax is unfortunately different in the various Fortran compilers. Check the man pages.
As an example of the problem discovered recently, here's a bit of valid Fortran code ("//" is Fortran's string concatenation operator):
file: j.F
---------
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))//',cmnt="'//
1 comm(1:ilen(comm))//'"'
mswopt=tstring(1:ilen(tstring))
endif
The C99 standard allows C code to use C++ style comments, which commence, of course, with "//". Thus, a C99 conformant cpp will likely strip string concatenations if handed Fortran code. Our hunch is that Cray, in implementing C99 (see the article on PE3.6), had to update cpp.
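To make the mechanism concrete, here is a deliberately naive imitation of C99 comment removal, written in free-form Fortran. It is only a sketch: a real preprocessor tokenizes the line and honors C quoting rules, while this code simply cuts the line at the first "//". The program name strip_demo and the variables line and pos are ours.

```fortran
program strip_demo
  implicit none
  character(len=72) :: line
  integer :: pos

  ! First line of the concatenation from j.F (inner " doubled to embed it).
  line = "      tstring=mswopt(1:ilen(mswopt))//',cmnt=""'//"

  ! Naive stand-in for C99 comment removal: cut at the first "//".
  pos = index(line, '//')
  if (pos > 0) line = line(:pos-1)

  print '(a)', trim(line)
end program strip_demo
```

What survives is the assignment with everything from the first concatenation operator onward removed. The result is still a syntactically plausible statement, which is what makes this kind of damage easy to overlook.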
Anyway, here's what happens to the above Fortran file using various commands on the Cray. Note that the behavior has changed in Programming Environment 3.6.
On the Crays: =================================
First, the correct output: using the Fortran pre-processor. The "-eP" option specifies pre-processing, only. Output goes to "<source-file>.i".
--- PE3.5 and PE3.6
chilkoot$ f90 -eP j.F
chilkoot$ cat j.i
# 1 "j.F"
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))//',cmnt="'//
1 comm(1:ilen(comm))//'"'
mswopt=tstring(1:ilen(tstring))
endif
Now, incorrect output using cpp:
--- PE3.5 and PE3.6
chilkoot$ cpp j.F
#line 1 "j.F"
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))
1 comm(1:ilen(comm))
mswopt=tstring(1:ilen(tstring))
endif
Incorrect output using PE3.6 cpp -N:
--- PE3.6
chilkoot$ cpp -N j.F
# 1 "j.F"
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))
1 comm(1:ilen(comm))
mswopt=tstring(1:ilen(tstring))
endif
Correct output using the PE3.5 cpp -N:
--- PE3.5
chilkoot$ cpp -N j.F
# 1 "j.F"
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))//',cmnt="'//
1 comm(1:ilen(comm))//'"'
mswopt=tstring(1:ilen(tstring))
endif
And correct output using PE3.5 or PE3.6 cpp -C -N:
--- PE3.5 and PE3.6
chilkoot$ cpp -N -C j.F
# 1 "j.F"
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))//',cmnt="'//
1 comm(1:ilen(comm))//'"'
mswopt=tstring(1:ilen(tstring))
endif
Excerpt from "man cpp" on the Crays:
-N cc, CC, and cpp commands. Enables the old style (referred to as
K&R) preprocessing. Use this option if you have problems with
preprocessing (especially non-C source code).
-C CC, cc, and cpp commands. Retains all comments in the
preprocessed source code, except those on preprocessor directive
lines. By default, the preprocessor phase strips comments from
the source code. This option is useful with the cpp command or
specified in combination with the -P or -E options on the cc and
CC commands.
On the IBMs: =================================
Again, the correct output: using the Fortran pre-processor. The "-d" option specifies that intermediate files be retained. The name xlf chooses for the pre-processor output is quite interesting.
ibm_sp$ xlf90 -d j.F
[... Fortran error messages ... attempt to compile fails ...]
ibm_sp$ cat Fj.f
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))//',cmnt="'//
1 comm(1:ilen(comm))//'"'
mswopt=tstring(1:ilen(tstring))
endif
Now, incorrect output using cpp:
ibm_sp$ cpp j.F
# 1 "j.F"
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))
1 comm(1:ilen(comm))
mswopt=tstring(1:ilen(tstring))
endif
On the SX-6: =================================
The SX-6 f90 "-EP" option specifies that pre-processing occur, and that the output go to a file named "i.<source-file>".
sx6$ f90 -EP j.F
[... Fortran error messages ... attempt to compile fails ...]
sx6$ cat i.j.F
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))//',cmnt="'//
1 comm(1:ilen(comm))//'"'
mswopt=tstring(1:ilen(tstring))
endif
On the SGIs: =================================
The SGI f90 "-E" option specifies that only pre-processing occur, and that output goes to stdout. It'd be nice if every Fortran compiler had this option, as it gives the nearest replacement for "cpp".
sgi$ f90 -E j.F
# 1 "j.F"
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))//',cmnt="'//
1 comm(1:ilen(comm))//'"'
mswopt=tstring(1:ilen(tstring))
endif
Now, incorrect output from the default, gnu cpp:
sgi$ cpp j.F
# 1 "j.F"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "j.F"
if (comm(1:1) .ne. ' ' ) then
tstring=mswopt(1:ilen(mswopt))
1 comm(1:ilen(comm))
mswopt=tstring(1:ilen(tstring))
endif
Excerpt from the gnu cpp man page:
The C preprocessor is intended to be used only with C, C++, and Objective-C source code. In the past, it has been abused as a general text processor. [...] Wherever possible, you should use a preprocessor geared to the language you are writing in. Modern versions of the GNU assembler have macro facilities. Most high level programming languages have their own conditional compilation and inclusion mechanism. If all else fails, try a true general text processor, such as GNU M4.
We like this advice...
Quick-Tip Q & A
A:[[ What's an example you've experienced or seen of a "Catch 22"? In two
[[ or three sentences only, please.
###
### Thanks to our two respondents...
###
As part of the hiring process at a company I worked for, I had to take
a drug test. The testing kit contained a form that listed each step
in the testing process, and had places to sign after every step was
completed. The last step was to seal the form in a plastic pouch.
###
This is a classic. If there is no keyboard attached, the following
bootup message appears:
No keyboard present
Press F1 to continue
Q: Arrrgggghhhhh!!!! Over quota again!
Is there some easy way to locate the largest files in my account, so
I can figure out what to delete?
[[ Answers, Questions, and Tips Graciously Accepted ]]
Current Editors:
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
