ARSC T3E Users' Newsletter 184, Dec. 10, 1999

Report on Supercomputing Seminars

ARSC hosted four seminars at UAF this week. Here are notes from two:

"The State of HPF, OpenMP, and MPI" Barbara Chapman, University of Houston

This talk focused on HPF and OpenMP and spent less time on MPI, which is already well known. The speaker did mention that Co-Array Fortran is gradually expanding its toehold, and that vendors other than SGI/Cray, such as IBM, may be implementing it. Here are more detailed notes from the HPF and OpenMP portions of the talk:


  • HPF is struggling in the U.S., but in Europe, "HPF still sells machines."
  • Japanese vendors are working hard on HPF for the European market.
  • Programmers are getting excellent results with HPF, but people should realize that:
    1. porting applications to HPF is hard if performance is to be good
    2. you must often rethink the entire application
  • At the 1999 HPF User Group (HUG) meeting in LA, most participants were from Europe and Japan; the general level of satisfaction was very high; participation from the U.S. was low.
  • There's a large potential market for OpenMP on small SMPs and PCs.
  • The user base is growing rapidly and vendors are interested.
  • The standard is fairly stable.
  • OpenMP works well on small shared-memory systems.
  • Watch for extensions for SMP clusters.
  • It's easy to get started programming OpenMP, but hard to optimize.
  • The speaker cited some examples:
    • EPCC QMC Application
      • Has an arbitrary mix of MPI and OpenMP.
      • Designed for execution on clustered SMP systems.
      • The code began as an MPI code for the T3D.
      • Preliminary conclusions:
        1. development time is much shorter for reasonable performance
        2. optimization is harder
    • HLRS CFD Code
      • A novice tester was given a time limit, and asked to parallelize this short code. Using MPI, he couldn't do it in time. Using OpenMP, he made it.
      • Given ample time, he eventually got better performance from the MPI code than he could get from the OpenMP code.
    • NLOM NCOM Ocean Models
      • OpenMP outperforms MPI on HALO benchmark.
      • Now prefer OpenMP to MPI.

"Effective Parallel Programming in Advanced ZPL" Larry Snyder, University of Washington

ZPL is complete, is available for several platforms (including the T3D, T3E, SP-2, Intel Paragon, and Linux), and is in use by a number of researchers at several institutions worldwide.

"Advanced ZPL", or "A-Zpl", is a superset of ZPL that is in the final stages of development and is expected to be available soon.

The language is based on two abstractions:

  1. the array data type (it's an "array" language)
  2. the CTA model of the underlying parallel architecture

Programmers think and program in terms of arrays and regions of arrays (including sparse regions, in A-Zpl). Also, they should program with the CTA model (a common network shared by von Neumann machines) in mind, as the language uses this model to make the best translation to whatever actual architecture is being used.

"WYSIWYG Performance Evaluation" is a term coined by the ZPL group. As I understood this, since the array syntax is very tidy and relates directly to the CTA abstract machine, it:

  • describes data allocation,
  • describes parallelism, and
  • defines when communication will be generated.

Thus, the programmer can scan a program and make reasonably accurate performance predictions before running it.

The speaker gave a few examples of coding efficiency. For instance, the parallel NAS MG benchmark takes:

  • ~1000 lines in MPI,
  • ~500 lines in HPF, and
  • ~200 lines in A-Zpl.

In this comparison, the A-Zpl version also gets the best performance.

NOTE: ZPL can be made available to ARSC users wishing to experiment. (See: for more).

"qsub -a" -- For Submitting Independent Jobs

At ARSC, only one job per user per NQS queue is allowed at a time (see "news queue_policy" or for details).

This policy improves fairness but inconveniences users who would prefer to submit three jobs on Friday and spend the weekend ice fishing.

We believe that "chaining," as described in issue #176, is the best solution, but careful use of "qsub -a" can also work for some users. The "qsub -a" approach has two problems: the jobs might not run in the expected order (thus, they MUST be independent), and they might wind up violating the queue policy.

Given these pitfalls, here's how it works. From "man qsub":

     -a      Specifies the earliest date and/or time at which NQS should
             run the request.

Using "qsub -a", you could submit three jobs, to be queued on:

Friday @ (right now!)
Saturday @ 3pm
Sunday @ 3pm


The date/time argument to "-a" takes many forms. In "man qsub," look under:

   "Detailed Explanation of Options"

An easy form is "<DAY-OF-WEEK>, <TIME-OF-DAY>". For instance:

   -a "Saturday, 3pm"

qsub lets you combine command line options with #QSUB options inside the script. Assuming three otherwise complete scripts, you can add the "-a" specification at submit time. For example:

  yukon$  qsub mfd.qsub
  yukon$  qsub -a "Saturday, 3pm"  bfd.qsub
  yukon$  qsub -a "Sunday, 3pm"  cfm.qsub

Here's what happens:

  yukon$  qstat -a
  -----------   ------- -------- -------------- ---- ------ ------ ---
  25413.yukon   mfd.qsu myname   xlarge@yukon    599  32768  28800 Qge
  25414.yukon   bfd.qsu myname   xlarge@yukon    ---  32768  28800 W  
  25415.yukon   cfm.qsu myname   xlarge@yukon    ---  32768  28800 W  

The "ST" (status) column shows that only the first job is queued ("Q"). Thus, this isn't a queue violation. The other two jobs are in "W" state (waiting to be queued).
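If you want to check the number of waiting jobs from a script, here is a minimal sketch. It assumes the "qstat -a" layout matches the listing above, with the status as the last field on each job line. The function name count_waiting is ours, not part of NQS, and the sample text is copied from the listing rather than taken from a live system.

```shell
#!/bin/sh
# Sample job lines copied from the qstat -a listing above. On a real system
# you would pipe "qstat -a" into count_waiting instead (layout assumed to match).
sample='25413.yukon   mfd.qsu myname   xlarge@yukon    599  32768  28800 Qge
25414.yukon   bfd.qsu myname   xlarge@yukon    ---  32768  28800 W
25415.yukon   cfm.qsu myname   xlarge@yukon    ---  32768  28800 W'

# Hypothetical helper: count job lines whose last field (status) starts with "W".
# Job lines are recognized by a leading numeric request id such as "25414.yukon".
count_waiting() {
    awk '$1 ~ /^[0-9]+\./ && $NF ~ /^W/ { n++ } END { print n + 0 }'
}

waiting=$(printf '%s\n' "$sample" | count_waiting)
echo "waiting jobs: $waiting"
```

For the sample above this reports two waiting jobs; the "Qge" job is not counted.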

The time at which a "W" job is scheduled to be queued is available from "qstat -f":

  yukon$  qstat -f 25414

        Created: Fri Dec 10 1999     To be queued: Sat Dec 11 1999
                 14:59:14 AKST                     15:00:00 AKST  

Like queued jobs, waiting jobs can be removed with "qdel":

  yukon$  qdel 25414

What happens next?

On Saturday at 15:00, job 25414 will be released to its queue (moved into "Q" status), and if nothing is ahead of it and PEs are available, it will start to run.

If the Friday job is still running or queued at that time, you will have two jobs in the same queue, which violates the queue policy.

Predictability depends on system load, checkpointing, whether there was downtime, etc. Try to judge usage levels, and increase the lag between jobs during busy times.

Please limit the number of jobs in the "W" state to two.
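The approach above can be sketched as a small dry-run script. This is only an illustration: stagger_plan is a hypothetical helper of ours, it prints the qsub commands rather than submitting anything, and the only qsub option it uses is the "-a" form shown earlier. It refuses to plan more than two delayed jobs, in keeping with the limit on "W" jobs.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints qsub commands, submits nothing.
# Usage: stagger_plan FIRST.qsub ["WHEN" SCRIPT.qsub]...
stagger_plan() {
    first=$1
    shift
    # Each remaining pair is a "-a" time string followed by a script name.
    if [ $(( $# / 2 )) -gt 2 ]; then
        echo "too many delayed jobs: limit is two in W state" >&2
        return 1
    fi
    echo "qsub $first"
    while [ $# -ge 2 ]; do
        echo "qsub -a \"$1\" $2"
        shift 2
    done
}

# Script names and times are the ones used in the example above.
plan=$(stagger_plan mfd.qsub "Saturday, 3pm" bfd.qsub "Sunday, 3pm" cfm.qsub)
echo "$plan"
```

Review the printed commands, then run them by hand once you are satisfied with the lag between jobs.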

Longer Limits on Yukon Queues

On Dec. 8th, yukon's xlarge and xxlarge queues were expanded to permit 8-hour runs. From "news queues" on yukon:

 Name of        Number    Maximum jobs   Time limit    When queue is
 queue          of PEs     per queue      on queue      executable
 -----------  ---------  ------------   ----------   ----------------
 grand        161 - 256       1          4  hours        always
 xxlarge      101 - 160       2          8  hours        always
 xlarge        51 - 100       2          8  hours        always
 large         21 - 50        4          8  hours        always
 medium        11 - 20        4          8  hours        always
 small          2 - 10        4          8  hours        always
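
As a quick way to read the table, here is a small sketch that maps a PE count to the matching queue name. The function queue_for_pes is hypothetical (it is not an ARSC or NQS tool), and the ranges are simply copied from the table above, so it would need updating if the queue limits change.

```shell
#!/bin/sh
# Hypothetical helper (not an ARSC tool): print the yukon queue whose PE range,
# per the "news queues" table above, contains the requested PE count.
queue_for_pes() {
    pes=$1
    if [ "$pes" -lt 2 ] || [ "$pes" -gt 256 ]; then
        echo "no queue for $pes PEs" >&2
        return 1
    elif [ "$pes" -ge 161 ]; then echo grand
    elif [ "$pes" -ge 101 ]; then echo xxlarge
    elif [ "$pes" -ge 51 ];  then echo xlarge
    elif [ "$pes" -ge 21 ];  then echo large
    elif [ "$pes" -ge 11 ];  then echo medium
    else echo small
    fi
}

queue_for_pes 64
```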

Text of "Numerical Recipes" On-Line

Cambridge University Press has now given permission for the page images from "Numerical Recipes in Fortran 90" to be made available on the Web, along with the other Numerical Recipes books (C and Fortran 77) that were already available.

This means that anyone in the world can access the book for free (though it is hoped that they will want to buy the hardcopy, and there is still the obligation to purchase the software for more than very casual use).

The book pages can be found by starting at:

and clicking on Books-On-Line.

NOTE: ARSC's recommended reading list will be revamped for the next issue.

Quick-Tip Q & A

 A: {{ I'm in a predicament.  My colleague finally sent me some critical
   {{ files, but now he's off to Roswell for a millennium vigil, and I
   {{ can't untar them!
   {{ It looks like he gave absolute paths when he tar'ed the files, 
   {{ but the system won't let me recreate the paths:
   {{   c-yukon<36> tar tf progs.tar
   {{   /tmp/vgt_tmp/progs/ellips.c
   {{   /tmp/vgt_tmp/progs/genseq.c
   {{   c-yukon<37> tar xf progs.tar
   {{   cmd-3749 tar: mkdir '/tmp/vgt_tmp/progs' failed: Permission denied
   {{   cmd-1004 tar: Cannot create '/tmp/vgt_tmp/progs/ellips.c'.
   {{   cmd-3749 tar: mkdir '/tmp/vgt_tmp/progs' failed: Permission denied
   {{   cmd-1004 tar: Cannot create '/tmp/vgt_tmp/progs/genseq.c'.
   {{ How can I extract these files?

Thanks go to two readers:
  From Richard Griswold:
    Use GNU tar, which strips leading /'s by default.  In fact, you need
    to specify -P with GNU tar to avoid stripping leading /'s.
    More information on GNU tar is available at:

  From Mark Reed:
    On SGI systems the tar command has an "R" option to do a relative
    untar.  This will help your colleague who has a "millennium" problem :)

Editor's Note:
  gnutar is installed on all ARSC systems. For help, go to the URL 
  given above, use the GNU info reader by simply typing, "info", on 
  any system, or get help with the command, "gnutar --help".
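
To see the default behavior Richard describes, here is a minimal sketch you can run in a scratch area. It assumes the "tar" on your path is GNU tar (on ARSC systems the command is "gnutar"), and every path in it is a throwaway created with mktemp, not the colleague's real files.

```shell
#!/bin/sh
# Sketch: GNU tar strips the leading "/" from member names on extraction
# unless -P is given. Assumes "tar" here is GNU tar ("gnutar" at ARSC).
set -e
work=$(mktemp -d)              # scratch area, left in place for inspection
src="$work/src/progs"
mkdir -p "$src" "$work/dest"
echo 'int main(void){return 0;}' > "$src/ellips.c"

# -P keeps the absolute path in the archive, reproducing the problem archive.
tar -P -cf "$work/progs.tar" "$src/ellips.c"

# Extract without -P: the leading "/" is dropped, so the file lands under
# the current directory instead of at its original absolute location.
cd "$work/dest"
tar -xf "$work/progs.tar" 2>/dev/null   # hide the "Removing leading /" notice
extracted="$work/dest$src/ellips.c"
ls "$extracted"
```

The extracted copy sits under the directory you ran tar from, with the original path recreated beneath it.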

Q: Holiday fun... Share a tip on ANY SUBJECT!  Your favorite.  Anything! 
   For instance, what do you do well?

     Tie flies?  
     Hang glide?
     Win at poker?  
     Organize closets?
     Photograph the aurora?
     Make cheddar cheese blintzes?
     Travel the Serengeti?
     Entertain children?
     Shop on-line?  
     Wrap gifts?
     Tune bicycles?  

   All tips will appear anonymously... please try for 35 words or less.

[ Answers, questions, and tips graciously accepted. ]

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions: Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.