ARSC T3D Users' Newsletter 80, March 29, 1996

MPP NQS Queues

This week the per-user job limit on the ARSC NQS queues was changed from 2 to 3, so a user at ARSC can now have three NQS jobs running at a time. I doubt that anyone will run three NQS T3D jobs at once, since each would have to run in a different queue, but users may now have a combination of three jobs running, some in the T3D queues and others in the Y-MP queues.

The weights for computing the priority for jobs in the same NQS queue have also been changed:


  factor                        previous weight   current weight
  ------                        ---------------   --------------
  Y-MP memory requested                 1                1
  Y-MP CPU time requested               1                1
  Elapsed time in the queue             1                3
  Fair Share Scheduler Priority         4                1

Hopefully this will give everyone a fairer share.
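
For the curious, here is a minimal sketch, in Fortran, of how such a weighted sum might be formed from the four factors above using the current weights. This is illustration only, not the actual NQS scheduler code: the factor values are hypothetical normalized quantities, and the real formula (including how each factor is scaled and whether it raises or lowers a job's priority) lives inside NQS.

      program nqsprio
c     Illustration only, not the NQS scheduler itself: combine four
c     hypothetical per-job factors with the current ARSC weights
c     (1, 1, 3, 1) into a single priority value.  The factor values
c     below are made up and normalized to [0,1].
      real wmem, wcpu, wage, wfss
      real fmem, fcpu, fage, ffss, prio
      parameter( wmem = 1.0, wcpu = 1.0, wage = 3.0, wfss = 1.0 )
      fmem = 0.2
      fcpu = 0.3
      fage = 0.8
      ffss = 0.5
      prio = wmem*fmem + wcpu*fcpu + wage*fage + wfss*ffss
      print *, 'weighted priority = ', prio
      end

In this sketch, the change means that time spent waiting in the queue counts three times as much as before, while the Fair Share Scheduler priority counts only a quarter as much.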

Workshop on Optimized Medium Scale Parallel Programming


> In early March you received a description of the Workshop on Optimized
> Medium Scale Parallel Programming being offered at the Pittsburgh
> Supercomputing Center April 29 - May 2. We are pleased to announce that the
> following lecturers have been confirmed for the workshop:
> 
> Speaker                                         Topic
> 
> Rusty Lusk (Argonne National Laboratory)        MPI
> Bob Kuhn (Kuck & Associates, Inc.)              KAP and STEP
> Chris Hill (MIT)                                Application case-study
> Bill Celmaster (Digital Equipment Corporation)  HPF; Intro. to parallelism
> 
> We are limited to 20 attendees for this workshop and plan to close
> registration on April 5th.  If you have not already done so, please
> register through one of the methods described below.  The agenda is
> included at the end of the registration information.
> 
> For more information on the workshop content please contact Bill Celmaster
> at 508/493-2173 or wcelmaster@hpc.pko.dec.com
> 
> REGISTRATION INFORMATION:
> 
> Admission to this training workshop is free to the United States academic
> community.
> 
> Interested corporate and government applicants, as well as applicants from
> academic institutions outside the United States, should contact Anne Marie
> Zellner at (412) 268-4960 for information on attendance fees.
> 
> The workshop will be held at Pittsburgh Supercomputing Center's site  in
> the Mellon Institute Building, located at 4400 Fifth Avenue, between the
> campuses of Carnegie Mellon University and the University of Pittsburgh.
> 
> Housing, travel, and meals are the responsibility of participants. A
> complimentary continental breakfast will be served each morning.
> 
> Our online information provides details on local accommodations and
> directions to our Center (see URL below).
> 
> HOW TO APPLY:
> 
> To apply for this workshop, please complete and return the registration
> form below by April 5, 1996 to:
> 
>               Workshop Application Committee,
>                ATTN: Anne Marie Zellner
>                Pittsburgh Supercomputing Center
>                4400 Fifth Avenue,
>                Pittsburgh, PA  15213.
> 
> You may also apply for this workshop by sending the requested information via
> electronic mail to workshop@psc.edu or via fax to (412) 268-5832.
> 
> All applicants will be notified of acceptance during the week of April 8, 1996.
> 
> For additional online information, please visit the workshop's homepage at
> http://www.psc.edu/training/Digital/welcome.html
> 
> ==============================================================================
>                                Registration Form
>        PSC/Digital Workshop on Optimized Medium Scale Parallel Programming
>                              April 29 - May 2, 1996
> 
> Name:
> 
> Department:
> 
> Univ/Ind/Gov Affiliation:
> 
> Address:
> 
> Telephone:  W (   )               H(   )
> 
> Electronic Mail Address:
> 
> Social Security Number:
> 
> Citizenship:
> 
> Are you a PSC user (yes/no)?
> If yes, please give your PSC username:
> 
> Academic Standing (please check one):
>  F - Faculty          UG - Undergraduate                   I - Industrial
>  PD - Postdoctorate    UR - University Research Staff      GV - Government
>  GS - Graduate Student UN - University Non-Research Staff   O - Other
> 
> Please explain why you are interested in attending this workshop and what
> you hope to gain from it:
> 
> 
> Briefly describe your computing background (scalar, vector, and parallel
> programming experience; platforms; languages) and research interests:
> 
> 
> All applicants will be notified of acceptance during the week of April 8.
> 
> 
> 
> ==============================================================================
> 
>             Workshop on Optimized Medium Scale Parallel Programming
>                              April 29 - May 2, 1996
> 
>                                -- presented by --
> 
>                           Digital Equipment Corporation
>                                       and
>                         Pittsburgh Supercomputing Center
> 
> ------------------------------------------------------------------------------
>                       APPLICATION DEADLINE:  April 5, 1996
> ------------------------------------------------------------------------------
> 
> PURPOSE
> The purpose of this workshop is to introduce application developers  to the
> technology and utilization of medium-scale parallelism.  The target
> architecture for this course will be a small cluster of Digital's Symmetric
> Multiprocessor Systems based on the latest Digital Alpha microprocessors.
> Each course attendee will be invited to work with the instructors on the
> optimal parallelization of their favorite application.
> 
> AGENDA
> To ensure that participants receive quality training, this workshop will
> incorporate both lectures and extensive hands-on lab sessions. Programming
> exercises will be carefully designed to reinforce concepts and techniques
> taught in class.
> 
> PSC staff members will present introductory lectures on the PSC  computing
> environment being used for this workshop. The remainder of the lectures
> will be presented by instructors from Digital.
> 
>         Monday April 29 (PM):
> 
>                 Welcome
>                 Introduction to Medium Scale Parallelism at PSC
>                 Introduction to PSC Facilities and Course Overview
>                 Digital's High Performance Hardware and Software
> 
>         Tuesday April 30 (AM and PM):
> 
>                 Principles of Shared and Distributed Parallelism
>                 Shared Memory Automatic Parallelization Introduction
>                 Lab: (a) Web tutorials on parallel computing
>                      (b) Automatic shared parallel examples (KAP) and serial
>                          optimizations
>                 Lecture and Lab on directed shared parallelism
> 
>         Wednesday May 1 (AM and PM):
> 
>                 A case study of a real scientific application
>                 Introduction to F90 and HPF
>                 Lab on HPF; work through some examples including use of
>                     memory channel
>                 Lecture and Lab examples on MPI
> 
>         Thursday May 2 (AM):
> 
>                Introduction to memory channel API
>                Lab Case study on transpose; MPI, memory channel, KAP; Also HPF
> 
> Anne Marie Zellner
> Pittsburgh Supercomputing Center
> zellner@psc.edu
> (412) 268-5131

Meeting on the Optimization of Codes for the CRAY MPP Systems

This past week, I received my copy of the proceedings of the "Meeting on the Optimization of Codes for the Cray MPP Systems," a collection of the presentation materials from each of the speakers. I am very happy that every speaker turned in materials and that PSC has distributed the collection to each attendee. There is a lot of T3D-specific material (the proceedings are 1 3/8 inches thick) that should be of interest to T3D users. I can send copies of selected presentations to those who request them. Below is the list of presentations given:
  • Performance of a Major Oil Reservoir Simulation, Olaf Lubeck, LASL
  • Parallel Sequence Analysis, Alexander J. Ropelewski, PSC
  • Software Package for Simulation of Electro-magnetic Fields, Daniel Katz, CRI
  • The Parallel Finite Element Method, Dave O'Neal, PSC
  • Mathematical Offsetting Scheme to Improve Alignment and Enhance Performance on MPP Systems, David Wong, NCSU
  • Parallel Simulated Annealing on the T3D, Carlos Gonzalez, PSC
  • AMBER 4.1 for the T3D, James Vincent, Penn State University
  • Parallel AMBER Enhancements and Particle-Mesh Ewald Electrostatics, Tom Darden, National Institute of Environmental Health Sciences
  • CHARMM, Bill Young, Pittsburgh Supercomputing Center
  • Implementing the Monte Carlo and Sparse Matrix Algorithms Needed for Lattice QCD on the T3D, Greg Kilcup, OSU
  • Lattice QCD Simulation Programs on the Cray T3D, S.P. Booth, Univ. of Edinburgh
  • High Performance MPP Codes, Nicholas Mark Hazel, Univ. of Edinburgh
  • Experience with Early Versions of the HPF Compilers for the T3D, Mike Ess, ARSC
  • F-- : A Minimalist's View of Parallel Fortran for Shared and Distributed Memory Computers, Robert Numrich, Cray Research, Inc.
  • Introduction to the CRAY T3E, Peter Rigsbee, Cray Research, Inc.
  • Design and Implementation of Efficient Bitonic Sorting Algorithms on the Cray T3D, Chua-Huang Huang
  • Optimizing the ARPS Model for Execution on MPPs, Adwait Sathye, Univ. of OK
  • I/O Optimization on the T3D, John Urbanic, PSC

T3E Software and Performance Overview

While at the Barcelona CUG, Tom Logan of the Alaska SAR Facility at UAF got these updated numbers on T3E performance:
  • 2 microsecond communication latency on 512 PEs
  • 480 MB/sec bandwidth
  • 300 MHz 1.2 GFLOP peak Alpha processors
  • Up to 2 GB RAM per PE (this gives up to 1 TB of RAM; see the quick check after this list)
  • Further optimization in compilers
  • GigaRing I/O speed
  • Ability to hot swap a module
  • More redundant PEs
  • Ability to mark network links as flawed and automatically reroute
  • Ability to restart a single PE
  • Overall price will be $65-$85 per MFLOP
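
A quick check of that memory total (my own arithmetic, not from the CUG material), using the 512-PE figure quoted above for the latency number:

  512 PEs x 2 GB/PE = 1,024 GB = 1 TB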

Historical Questions

Last week, I asked these questions:
  • Which was delivered first, the Cray X-MP or the Intel 80386?
  • Which was delivered first, the Cray Y-MP or the Intel 80386?
  • Which was delivered first, the Cray Y-MP or the Intel 80486?
Luckily, no one sent in any answers, so I didn't have to disturb the current NQS priorities. Here are the answers:

  Cray-1    delivery:  3/76*
  Cray-X-MP delivery:  6/83*
  Cray-2    delivery: 12/84*
  Cray-Y-MP delivery:  8/88*
  Cray-C90  delivery: 11/91#
  Cray-T90  delivery:  3/95%

  8086  availability:  1978**
  80186 availability:  1981**
  80286 availability:  1982**
  80386 availability:  1985**
  80486 availability:  1989**

  Sources:
    *   CRI Corporate Computer Museum (Chippewa Falls) handouts
    #   Cray Channels, Volume 15, #2
    %   Wet memory
    **  "Computer Architecture, A Quantitative Approach",
        Hennessy & Patterson

How Much Memory on a PE is Available for the User?

Over the past two years, the question above has been one of the most frequently asked by potential T3D users. Of course, the answer depends on both the code and the data that the user has in mind, and it varies from one software release to the next. One way to answer it is to run the program below and vary the value of the parameter NMAX:

c         Probe for the largest array that fits in user memory on a PE:
c         fill an array of NMAX 64-bit words, then sum it and check the
c         total against NMAX*(NMAX+1)/2.  If NMAX is too large, the run
c         dies with signal 11 instead of printing a result.
          parameter( NMAX = 7 215 600 )
          real a( NMAX )
          do 10 i = 1, NMAX
             a( i ) = i
  10      continue
          sum = 0.0
          do 20 i = 1, NMAX
             sum = sum + a( i )
  20      continue
          if( sum .ne. ( NMAX * ( NMAX+1 ) ) / 2 ) then
             print *, "error in summation",NMAX, sum,(NMAX*(NMAX+1))/2
          else
             print *, "ok in summation",NMAX, sum,(NMAX*(NMAX+1))/2
          endif
          end
For the program above (saved as main.f), we have:

          sed "s/7 215 600/7 215 604/" main.f > test.f
          cf77 test.f
          a.out
   ok in summation7215604,  26032474150210.,  26032474150210
          sed "s/7 215 600/7 215 605/" main.f > test.f
          cf77 test.f
          a.out
  
  Agent printing core file information:
  user exiting after receiving signal 11
  Exit message came from virtual PE 0, logical PE 0x12c
  Register dump
  
   pa0: 0x00000060fc8efd18        pa1: 0x0000000000001e01  pa2: 0x0000000000000001
  
    pc: 0x000000200008cba0         sp: 0x00000060fc8effe0   fp: 0x00000060fc8f0100
    v0: 0x0000960000000000         ra: 0x000000200008cd50   ci: 0x0000960000000181
  
    a0: 0x0001070000000000         a1: 0x0000000000000001   a2: 0x00000060fc8f1cd0
    a3: 0x00000060fc8f1cc8         a4: 0x00000060fc8f1cd8   a5: 0x00000060fc8f1ce0
  
    t0: 0x0000004000010000         t1: 0x42b7ad28cdef3700   t2: 0x00000060fc8f02b8
    t3: 0x000000400000ff10         t4: 0x0000000000000007   t5: 0x0000000000000004
    t6: 0x0000000000000001         t7: 0x0000000000000002   t8: 0x0000000000000007
    t9: 0x0000000000000006         t10:0x0000000000000000   t11:0x000000000000003f
    t12:0xfffffc000017b0d8         t13:0x00000060fc8f1ce0
  
    s0: 0x000000400000ffb0         s1: 0x0000000000000001   s2: 0x0000000000000008
    s3: 0x00000060fc8f14f0         s4: 0x0000000000000001   s5: 0x0000000000000004
  
    f00:0x42b7ad28cdef3700   f01:0x3fb999999999999a   f02:0x42b7ad28cdef3700
    f03:0x0000000000000000   f04:0x0000000000000000   f05:0x0000000000000000
    f06:0x0000000000000000   f07:0x0000000000000000   f08:0x0000000000000000
    f09:0x0000000000000000   f10:0x0000000000000006   f11:0x0000000000000006
    f12:0x0000000000000000   f13:0x0000000000000000   f14:0x0000000000000000
    f15:0x0000000000000000   f16:0x42b7ad28cdef3700   f17:0x0000000000000000
    f18:0x0000000000000000   f19:0x0000000000000000   f20:0x0000000000000000
    f21:0x0000000000000000   f22:0x0000000000000000   f23:0x0000000000000000
    f24:0x0000000000000000   f25:0x0000000000000000   f26:0x0000000000000000
    f27:0x0000000000000000   f28:0x0000000000000000   f29:0x0000000000000000
    f30:0x0000000000000000                           fpcr:0x8900000000000000


  Agent finished printing core file information.
  User core dump completed (./mppcore)
  Make: "a.out" terminated due to signal 11
So of the 8 MW (64 MB) of memory on each T3D PE, the user of the above program can have an array of 7,215,604 64-bit floating point numbers; the run with 7,215,605 elements fails.
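
As a rough follow-up (my own arithmetic, not part of the original experiment; the short program below is illustration only), the largest successful array can be compared against the full 8 MW of PE memory:

      program memfrac
c     Arithmetic check: compare the largest successful array size
c     (7,215,604 words) with the full 8 MW = 8 * 2**20 = 8,388,608
c     64-bit words on a T3D PE.
      integer nuser, ntotal
      parameter( nuser  = 7215604 )
      parameter( ntotal = 8 * 2**20 )
      print *, 'user array (words)     = ', nuser
      print *, 'PE memory  (words)     = ', ntotal
      print *, 'fraction for the array = ', real(nuser)/real(ntotal)
      end

That works out to roughly 86% of the PE's memory being available to this particular program's data array.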
Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.