ARSC T3E Users' Newsletter 152, October 2, 1998

Co-Array Fortran Available to ARSC Users

SGI/Cray has released co-array Fortran (formerly known as F--) in its cf90 version 3.1 compiler.

To allow our users to experiment with this refreshingly sensible approach to parallelizing codes, ARSC has installed the complete Programming Environment 3.1 release on yukon (although PE 3.0.2 remains the default). To access PE 3.1, execute the command:

module switch PrgEnv PrgEnv.test

In PE 3.1, Fortran codes that use co-arrays are compiled with the -Z option. For instance:

f90 -Z -o test test.f

In PE 3.1, co-arrays appear in the "man" pages:

  • f90
  • this_image
  • num_images
  • sync_images
  • rem_images
  • log2_images

The complete CF90 Co-array Programming Manual is available to ARSC users via our dynaweb server, at:

http://www.arsc.edu:40/

In the "titles" table of contents, look under "C." Or, you may go straight to the manual, at:

http://www.arsc.edu:40/library/all/004-3908-001

(The login and password to ARSC's dynaweb server are available on all ARSC systems by typing, "news documents".)

Please let us know if you encounter any problems, and watch for more articles on Co-Array Fortran in coming issues.
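
To give a flavor of the syntax, here is a minimal sketch of a co-array
program. It is an invented example, not taken from the manual; the
intrinsic names come from the man pages listed above, but check the
manual for the exact synchronization interfaces.

!     caf_hello.f -- an invented example; compile with:
!         f90 -Z -o caf_hello caf_hello.f
      program caf_hello
      implicit none
      real    :: x[*]           ! one copy of x on every image (PE)
      integer :: me, n

      me = this_image()         ! this image's index, 1..num_images()
      n  = num_images()         ! total number of images

      x  = real(me)
      call sync_images()        ! barrier across images; see man page

      if (me .eq. 1) then
!        image 1 reads x directly from the memory of the last image
         print *, 'image 1 sees x on image', n, ' =', x[n]
      end if
      end

The remote reference x[n] reads the copy of x that lives on image n;
no message-passing calls are needed.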

"grmap" -- Formatting Tool For "grmview" Output

In newsletter 140, we included a home-grown script which produced a friendly ASCII display of grmview output--reminiscent of the T3D's utility, "mppview."

Alan Wallcraft of NRL made a few modifications to the script and sent it back with more ideas and a good name. We've done more work on it, incorporating Alan's changes and ideas, and produced "grmap," version 2, available at:

ftp://www.arsc.edu/pub/mpp/src/grmap.gz

New features:

  • Provides three different display formats
  • Chooses appropriate format for size of host
  • Computes avg and max memory in use per PE by each job
  • Summarizes overall system size and utilization
  • Reports non-operational PEs
  • Lists user login names instead of uids

Examples:

Compressed format for large machines

The fictitious T3E shown in this invented test has 401 APP PEs. Only APP PEs appear in the grmap grid. Six of the APP PEs are non-operational (as designated in the grid by "@'s").

One of the OS PEs is non-operational (as described in text above the job table). If all of the CMD and OS PEs were operational, there would be no such comment.

Two jobs are "Ap. limited" (and marked here as "BLOCKED"). The by-job memory report is in MB; the system summary memory report is in GB.


!!! Non-operational OS PE detected. PE Number:403 !!!

     UserName Size BasePE      Mem avg:max  Command       
     ======== ==== =========== ============ ==============
 A - bonnie   32   0   [0x0  ]      72:72   a.out.32                       
 B - clyde    36   32  [0x20 ]     133:146  mya.out36                         
 C - harry    16   68  [0x44 ]      26:31   youra.out                        
 D - sally    10   84  [0x54 ]      37:38   hera.out                           
 E - bonnie   70   114 [0x72 ]      94:242  junk                          
 F - sally    66   202 [0xca ]     107:242  test
 G - harold   10   284 [0x11c]      37:38   les                           
 H - maude    8    BLOCKED           0:0    sim
 I - bonnie   8    BLOCKED           0:0    mod
--------------------------------------------------------------------------
 0  
 *A.A.A.A.A.A.A.A .A.A.A.A.A.A.A.A .A.A.A.A.A.A.A.A .A.A.A.A.A.A.A.A
 32 
 *B.B.B.B.B.B.B.B .B.B.B.B.B.B.B.B .B.B.B.B.B.B.B.B .B.B.B.B.B.B.B.B 
 64 
 .B.B.B.B*C.C.C.C .C.C.C.C.C.C.C.C .C.C.C.C*D.D.D.D .D.D.D.D.D.D* . 
 96 
 . . . . *@* . .  . . . . . . . .  . . *E.E.E.E.E.E .E.E.E.E.E.E.E.E
 128
 .E.E.E.E.E.E.E.E .E.E.E.E.E.E.E.E .E.E.E.E.E.E.E.E .E.E.E.E.E.E.E.E
 160
 .E.E.E.E.E.E.E.E .E.E.E.E.E.E.E.E .E.E.E.E.E.E.E.E * . . . . . . . 
 192
 . . . . . . . .  *@* *F.F.F.F.F.F .F.F.F.F.F.F.F.F .F.F.F.F.F.F.F.F
 224
 .F.F.F.F.F.F.F.F .F.F.F.F.F.F.F.F .F.F.F.F.F.F.F.F .F.F.F.F.F.F.F.F
 256
 .F.F.F.F.F.F.F.F .F.F.F.F* . . .  . . . . . . . .  . . . . *G.G.G.G
 288
 .G.G.G.G.G.G* *@ * . . *@* . . .  . . . . . . . .  . . . . . . . . 
 320
 . . . . . . . .  . . . . . . . .  . . . . . . . .  . . . . . . . . 
 352
 . . . . . . . .  . . . . . . . .  . . . . . . . .  . . . . . . . . 
 384
 . . . . . . . .  . . . *@* . . *@ * 
--------------------------------------------------------------------------
APP-PEs tot:used:free:down  MEM tot:used:free  JOBS run:blk
        401: 240:161 :6          98:  32:66           7:2   

Standard format

This fictitious T3E has 256 APP PEs, 2 of which are non-operational. (The table of jobs and system summary look the same for all three formats, so I've left them off.)


 --------------------------------------------------------
  0 
 <A..A..A..A..A..A..A..A. .A..A..A..A..A..A..A..A.
 16 
 .A..A..A..A..A..A..A..A. .A..A..A..A..A..A..A..A>
 32 
 <B..B..B..B..B..B..B..B. .B..B..B..B..B..B..B..B.
 48 
 .B..B..B..B..B..B..B..B. .B..B..B..B..B..B..B..B.
 64 
 .B..B..B..B><C..C..C..C. .C..C..C..C..C..C..C..C.
 80 
 .C..C..C..C><D..D..D..D. .D..D..D..D..D..D>< .. .
 96 
 . .. .. .. ><@>< .. .. . . .. .. .. .. .. .. .. .
 112
 . .. ><E..E..E..E..E..E. .E..E..E..E..E..E..E..E.
 128
 .E..E..E..E..E..E..E..E. .E..E..E..E..E..E..E..E.
 144
 .E..E..E..E..E..E..E..E. .E..E..E..E..E..E..E..E.
 160
 .E..E..E..E..E..E..E..E. .E..E..E..E..E..E..E..E.
 176
 .E..E..E..E..E..E..E..E> < .. .. .. .. .. .. .. .
 192
 . .. .. .. .. .. .. .. > <@>< .. .. .. .. .. .. .
 208
 . .. .. .. .. .. .. .. . . .. .. .. .. .. .. .. .
 224
 . .. .. .. .. .. .. .. . . .. .. .. .. .. .. .. .
 240
 . .. .. .. .. .. .. .. . . .. .. .. .. .. .. .. .
--------------------------------------------------------

Expanded format

This is the ARSC T3E: 94 APP PEs, all operational.


--------------------------------------------------
 0  
    A....A....A....A....A....A....A....A..
 8  
  ..A....A....A....A....A....A....A....A..
 16 
  ..A....A....A....A....A....A....A....A..
 24 
  ..A....A....A....A....A....A....A....A  
 32 
    B....B....B....B....B....B....B....B..
 40 
  ..B....B....B....B....B....B....B....B..
 48 
  ..B....B....B....B....B....B....B....B..
 56 
  ..B....B....B....B....B....B....B....B..
 64 
  ..B....B....B....B....B....B....B....B..
 72 
  ..B....B....B....B....B....B....B....B..
 80 
  ..B....B     .... .... .... .... .... ..
 88 
  .. .... .... .... .... .... ..
--------------------------------------------------

Coming Events

There are a number of events in the coming months where you can learn more about ARSC and, in particular, the research of our users.

  1. The 49th Arctic Science Conference and International Arctic Research Center Launch will be held 25th-28th October in Fairbanks. There will be a number of presentations and posters on International Cooperation in Arctic Research, many by ARSC users. There will be daily tours of ARSC, during which the Immersadesk will be used to display results of users' research. More details on the meeting can be found by following the link from "hot topics" on the ARSC web page, http://www.arsc.edu/ .
  2. November 9th-13th sees the 10th Supercomputing conference, and ARSC will be present amongst the research booths, highlighting the work of its users. Staff will also be taking the opportunity to catch up on the latest developments in the ever-changing world of High Performance Computing. A number of ARSC users will have work displayed in the ARSC booth, in the poster exhibits, and at other locations. More information on SC98 can be found at http://www.supercomp.org/

ARSC Vector Courses Scheduled

Introduction to UNICOS and CRAY J90 Programming


Date: Wednesday, October 7, 1998
Time: 1:00pm - 5:00pm
Location: University of Alaska Fairbanks, Butrovich Building, 
  ARSC Training Room
Instructor: Derek Bastille, User Consultant

Course Description: This course is an introduction to programming on
Chilkoot (ARSC's CRAY J932 supercomputer). It will cover a variety of
topics on Chilkoot's programming environment, including:

   * vector processing
   * compiling, running, and debugging programs
   * submitting and monitoring batch jobs
   * scientific libraries
   * data storage & file system organization
   * the current UNICOS programming environment (PE 3.0) and modules
   * sources of documentation
   * where to get further help

Intended Audience: Current or potential ARSC users who would like to find out
how to create and maintain programs on our vector supercomputer.

     http://www.arsc.edu/user/classes/ClassJ90.html

Introduction to Vector Programming, Performance Monitoring, and Optimization


Date: Wednesday, November 4, 1998
Time: 1:00pm - 5:00pm
Location: University of Alaska Fairbanks, Butrovich Building, 
  ARSC Training Room
Instructor: Tom Baring, User Services Consultant

Course Description: Standard C/C++ and Fortran 90 codes are readily compiled
under UNICOS compilers. However, the performance of any such code is
determined by the degree to which the compilers can vectorize it.

This course is designed for users of the ARSC J90 supercomputer who need a
basic explanation of vector programming and an introduction to the tools
available to help users analyze and improve their code's performance. We
will discuss the following topics:

   * Vectorization. What is it?
   * Automatic compiler optimization
   * Performance analysis tools: HPM, Perfview, Loopmark Listing
   * Common coding and compiling mistakes that can ruin performance

Following this course, you will be able to determine your program's
performance in MFLOP/s, for the entire program, for each subroutine, or for
any arbitrary code segment. You will be able to determine in which
subroutines your program spends most of its time and which individual loops
are using the vector architecture well or poorly. You will also have a few
tools in your kit with which to try to extract better performance from your
code.
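
As a small taste of the subject, here is an invented sketch (it is not
course material): the compiler can vectorize the first loop because its
iterations are independent, while the second loop carries a value from
one iteration to the next and will generally run in scalar mode. The
loopmark listing discussed in the course flags exactly this kind of
difference.

!     vecdemo.f -- invented illustration of a vectorizable loop
!     and a recurrence-bound loop
      program vecdemo
      implicit none
      integer, parameter :: n = 1000
      real    :: a(n), b(n), c(n)
      integer :: i

      b = 1.0
      c = 2.0

!     Vectorizable: a(i) depends only on b(i) and c(i), so the
!     iterations are independent.
      do i = 1, n
         a(i) = b(i) + 2.0*c(i)
      end do

!     Recurrence: a(i) depends on a(i-1); such a loop typically
!     cannot be vectorized and runs on the scalar unit.
      do i = 2, n
         a(i) = a(i-1) + b(i)
      end do

      print *, a(n)
      end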

Some students may realize impressive speed-up by simply recompiling their
code with different compiler options. Others may decide to rewrite code
segments to facilitate optimization and throughput.

Intended Audience: This course is intended for existing users of ARSC vector
platforms who want to examine and improve the performance of their code.

Quick-Tip Q & A



A: {{ The Unix "sort" command sorts multiple lines.  Can you sort words
      on a single line?  You might want, for instance, to sort the output
      of the "groups" command. }}
   

   # Thanks to Dale Clark of ARSC for this solution:

   groups | tr -s ' ' '\n' | sort | tr '\n' ' '
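
   # The first tr puts each word on its own line, sort orders the
   # lines, and the final tr joins them back into a single line.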



Q: The "hpm" tool for Cray PVP platforms makes it easy to measure
   my vector code's overall performance.  How can I get my T3E code's
   MFLOP/S rating?  Is there a comparable tool?

[ Answers, questions, and tips graciously accepted. ]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.