ARSC T3D Users' Newsletter 79, March 22, 1996

MPP NQS Queues

Recently the ARSC T3D has been so busy that several NQS jobs submitted to the same queue are waiting to run. This is a relatively new situation for ARSC's T3D, which previously had only a few users of the big queues. The practice at other MPP sites is for users to submit only one job to each queue at a time. If each job runs for the same amount of time, this "neighborly" approach divides the available machine cycles of that queue evenly among its users, and it prevents one user from shutting out all the others. Let's try this self-policing policy and see what happens.

PGHPF is Running at ARSC

The Portland Group has kindly extended ARSC's license to use their HPF compiler until July 4th, 1996. Please contact Mike Ess if you have any problems or results from using pghpf. Instructions for using pghpf at ARSC were described in newsletter #65 (12/15/95).

New PGI Newsletter

Below is a description of the Portland Group's new performance-oriented newsletter:

> The first electronic issue of Peak Performance is now available from The
> Portland Group, Inc.  This newsletter serves as a forum for the exchange 
> of ideas on products, research, programming, code porting, and hardware 
> for the field of High Performance Computing (HPC). 
> Contents of Spring 1996 Issue
>  *  About Peak Performance
>  *  Parallel Programming Tips
>  *  Application and Tool Profiles
>  *  Code Porting Information
>  *  HPC News
>  *  PGI News
> The first issue is available at the web site:
> Subscribing and Unsubscribing
> Anyone can subscribe to Peak Performance. With your subscription, you will
> receive e-mail notification when a new issue of the newsletter is placed on
> our web site. You can also subscribe to receive the text version of Peak
> Performance by e-mail.  These two lists are maintained as "newsletter" 
> and "enews" ("enews" for the full text in e-mail). 
> If you want to add or remove yourself from the newsletter or enews 
> mailing lists, you can use the form found at the above address, or
> send mail to "" with one of the following 
> commands in the body of your e-mail message (where email@site is 
> your e-mail address):
> To subscribe or unsubscribe, send one of the following
> to "":
>     subscribe newsletter email@site
>     unsubscribe newsletter email@site
>     subscribe enews email@site
>     unsubscribe enews email@site

Supercomputing Workshop at Pittsburgh

> The Pittsburgh Supercomputing Center (PSC) is offering a new kind of 
> supercomputing techniques workshop for biomedical researchers this May.  
> We invite you to review the description below and consider returning an
> application for it.
> Please contact me if you have any questions.
> Nancy Blankenstein
> Biomedical Assistant
> *******************************************************************************
>                   Pittsburgh Supercomputing Center
>                            May 5-9, 1996
> This newly developed Supercomputing Techniques workshop is aimed at 
> researchers who want to determine if they need supercomputing resources 
> to solve their biomedical research problem(s). It differs from both the
> Biomedical Applications workshops, which focus on science, and other PSC 
> Techniques workshops, which focus on programming details.  The emphasis here
> will be on practical concepts and assumes no prior supercomputing experience.  
> Applicants should have a working knowledge of either Fortran or C.
> Participants will learn what resources are potentially available to them 
> through  PSC's Biomedical Initiative, including hardware, software and PSC 
> staff expertise.  By participating in this course, you should be able to 
> answer the following questions as they pertain to your research computing 
> needs:
>         How can my application benefit from supercomputing?  What must I
>         examine to determine this?
>         Is my application massively parallel in nature?  Is it more vector 
>         oriented?
>         Should I consider a heterogeneous solution involving both vector and
>         massively parallel machines?
>         Are there ways to restructure my application to get more computing
>         power out of the machines already at my disposal?
>         How much effort would be involved to accomplish the necessary
>         modifications, and what are the potential payoffs?  Is it worth
>         pursuing at this time?
>         Where do I go from here?
> The workshop will include informal discussion times to encourage participants
> to collaborate with one another as well as PSC researchers and scientific
> support staff.  A panel discussion will be held on the final day to further
> promote discussions between participants and members of the Biomedical
> Supercomputing Initiative.
> This workshop is NOT intended to provide detailed information on the use of any
> one computer system.  Other techniques workshops are available that address the
> details of programming either the C90 or the T3D.
> Expenses/Registration Fees:
>         Researchers affiliated with U.S. academic institutions will have 
>         their hotel accommodations paid and receive complimentary breakfasts 
>         and lunches the days of the workshop.  No registration fee will be 
>         charged but participants are responsible for all other expenditures 
>         connected with attending the workshop, i.e., travel, meals outside the
>         workshop, ground transportation, parking, etc.
>      A few openings may be available for government and industrial researchers:
>         U.S. Government researchers will be charged a registration fee to 
>         cover their documentation and workshop meals.  They will be 
>         responsible for all expenses incurred in travel, accommodations,
>         other meals, etc.
>         Industrial researchers will be charged a registration fee to
>         cover their service units, documentation and the workshop meals.
>         They are responsible for all expenses incurred in travel,
>         accommodations, other meals, etc.
> This program is sponsored by a National Institutes of Health grant. Enrollment
> is limited to 20 participants.  Deadline for applications:  April 5, 1996
> A tentative agenda and an on-line application are below.
> *******************************************************************************
>                                May 5-9, 1996
>                                 TENTATIVE AGENDA (as of March 15, 1996)
> Sunday: May 5, 1996
>  Introduction to PSC and 
>  the Biomedical Supercomputing Initiative
>  Introduction to PSC Environment (Interactive)
>  Optional computer lab time (each evening of the workshop)
> Monday: May 6, 1996
>  Introduction to Supercomputing                     
>  Parallel Computing Concepts and Hardware             
>  C90 and T3D Architecture Overview             
>  Parallel Computing Paradigms
> Tuesday: May 7, 1996
>  PVM Basics
>  MPI Basics                                     
>  Heterogeneous Computing
>  Westinghouse Tour
> Wednesday: May 8, 1996
>  Performance Monitoring                        
>  Optimization Techniques                       
>  Practical Considerations
> Thursday: May 9, 1996
>  Heterogeneous Scientific Applications                
>  Panel Discussion                            
>  Collaborations/Optional Lab Time
> *******************************************************************************
>                           PITTSBURGH SUPERCOMPUTING CENTER
>                                BIOMEDICAL INITIATIVE
>                                        May 5-9, 1996
>                                      APPLICATION
> Name:               ________________________________________________________________
> Affiliation:   ________________________________________________________________
> Address:       ________________________________________________________________
>                (Business)
>                ________________________________________________________________
>                ________________________________________________________________
>                (Home)
>                ________________________________________________________________
> Telephone:  ____________________________         ______________________________
>                    (Business)                                     (Home)
> *Social Security Number:  _______-_____-_______        Citizenship:___________________
> Electronic Mail Address:_______________________________________________________
> Status: ___Graduate  ___Post-doctoral Fellow  ___Faculty  ___Other (specify)
> Please indicate specifically any special housing, transportation or dietary
> arrangements you will need: ___________________________________________________
> How did you learn about this workshop:_________________________________________
> Applicants must submit a completed application form and a cover letter.  The
> letter should describe, in one or two paragraphs, your current research, and
> how participating in the workshop will enhance this research.  Please include 
> a brief statement describing your level of experience with computers and with
> the FORTRAN or C programming language.  Faculty members, staff and post-docs
> should provide a curriculum vitae.  Graduate students must have a letter of 
> recommendation from a faculty member.
> Please return all application materials by April 5, 1996 to:
>   Biomedical Workshop Applications Committee
>   Pittsburgh Supercomputing Center
>   4400 Fifth Avenue, Room 230C
>   Pittsburgh, PA 15213
> For additional information:
> Direct inquiries to Nancy Blankenstein, or (412) 268-4960.
> *Disclosure of Social Security Number is voluntary.
> PSC does not discriminate on the basis of race, color, religion, sex, age, 
> creed, national or ethnic origin, or handicap.

NQS Queue and Fair Share Scheduler

As the ARSC T3D gets busier, we have more cases where two or more users are competing for the same NQS mpp queue. I have said in earlier editions of this newsletter that the NQS mpp queues operate on a first-come/first-served basis. I was wrong about that, as a situation this week illustrated. I'm still working out the current behavior; this is what I have so far.

Assume that an mpp queue is in use and that, while it is in use, two users submit additional jobs to that queue. When the current job completes, one of the two waiting jobs will take its place. Originally I thought the job submitted first would execute next; that would be first-come/first-served operation.

But while the two submitted jobs wait in NQS, each builds up a priority that is evaluated when the current job in that queue completes. This computed priority determines which of the two jobs goes first. The priority in the NQS queue is computed from four terms:

  1. The Y-MP time requested
  2. The Y-MP memory requested
  3. How long in the NQS queue
  4. Fair Share Scheduler priority
The first two terms come from the NQS script lines, or are taken from the UDB defaults if they are not specified in the NQS script. The third term is the one I thought was the only term used, which would have given first-come/first-served operation. The fourth term, the Fair Share priority, carries a weight factor 4 times that of the other terms. Because most jobs probably request identical Y-MP time and memory limits (the UDB defaults), the Fair Share term has the dominating effect on which of the two submitted jobs goes first.
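As a rough illustration of the weighting described above, the four terms can be sketched as a weighted sum. This is not CRI's actual scheduler code; the term values and their normalization are invented for illustration, and only the 4x Fair Share weighting comes from the text:

```python
# Hypothetical sketch of the four-term NQS priority described above.
# The numeric term values are invented; only the 4x Fair Share
# weighting is taken from the newsletter text.

def nqs_priority(time_term, memory_term, wait_term, fair_share_term):
    """Combine the four terms; the Fair Share term carries a
    weight factor 4 times that of the other terms."""
    fair_share_weight = 4.0
    return (time_term + memory_term + wait_term
            + fair_share_weight * fair_share_term)

# Two waiting jobs with identical requested Y-MP time and memory
# (the UDB defaults): the Fair Share term dominates, so the job
# with the better Fair Share priority goes first even though it
# has waited less time in the queue.
early_job = nqs_priority(1.0, 1.0, 3.0, 0.2)   # submitted first
late_job  = nqs_priority(1.0, 1.0, 1.0, 1.0)   # better Fair Share
```

With these made-up numbers the later job wins (7.0 vs. 5.8), which is the behavior the text describes: wait time alone does not decide the order.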

To improve a job's priority in the queue, a user can specify Y-MP memory and Y-MP time limits lower than those that normally come from the UDB. The UDB defaults are probably much larger than anything the mppexecs will ever need. These limits apply only to the mppexec programs that provide the communication and control between the Y-MP and the T3D. The combination of:

  ja        # start job accounting
  a.out     # t3d job
  ja -st    # report accounting summary
will report the actual Y-MP time and memory used by the mppexecs for the mpp job. The memory and time usage from a run may then be used as estimates for NQS script lines that specify limits lower than the UDB limits.
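A minimal NQS script sketch of this idea follows. The queue name and limit values are placeholders, not ARSC's actual settings; in practice the limits would be chosen from what `ja -st` reported for a prior run:

```
#QSUB -q mpp        # queue name is a placeholder
#QSUB -lT 60        # Y-MP CPU time limit (seconds), below the UDB default
#QSUB -lM 2Mw       # Y-MP memory limit (words), below the UDB default
ja                  # start job accounting
a.out               # the T3D job
ja -st              # report Y-MP time and memory the mppexecs actually used
```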

The Fair Share scheduler is something of a mystery to me (I'm not alone in this), but luckily, there is a sequence of papers in the CUG proceedings that give some explanation of it:

Spring, 1994
  "Priority-Based Memory Scheduling," Chris Brady, CRI; Don Thorp, CRI; and Jeff Pack, Grumman Data Systems, Inc.
Spring, 1995
  "What Every Administrator Should Know Before Trying To Set Up a Shared Hierarchy in the UDB," Kathlean Zinnel, CRI
Fall, 1995 (the Alaska CUG)
  "The UNICOS Fair Share Scheduler as a Feedback Control System," Cass Everitt and Terry Jones, Grumman Data Systems and Services, Inc.; Robert Knesel, Naval Oceanographic Office
I can mail copies of these papers to anyone who asks. There is also the CRI documentation intended for system administrators:

  UNICOS System Administration, Volume 2, SG-2113 
I'll describe more about the Fair Share Scheduler and how it affects T3D jobs in the NQS queues in future newsletters.

Historical Questions

  • Which was delivered first, the Cray X-MP or the Intel 80386?
  • Which was delivered first, the Cray Y-MP or the Intel 80386?
  • Which was delivered first, the Cray Y-MP or the Intel 80486?
Answers in next week's issue. (massive NQS priorities for correct answers)
Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.