ARSC T3E Users' Newsletter 191, March 16, 2000

Yukon is busy! Help Us Provide More Cycles!

There are several reasons why application (APP) PEs on a T3E may be idle even when NQS jobs are queued, dying to run. One of them is addressed by a simple addition to NQS scripts.

This change is so simple that we ask all ARSC users to make it. Here's the explanation:

When the last "mpprun" in your qsub script returns, the APP PEs that it was using become idle. Since this was the LAST mpprun, your script no longer needs the APP PEs. However, NQS will not release them to other jobs until your script either terminates or explicitly releases them.

Please explicitly release the PEs by adding the following two lines immediately after the last "mpprun" command in your qsub scripts:


  qalter -l mpp_p=0                    # release parallel processors
  echo | qsub -q mpp -eo -o /dev/null  # force NQS to rescan queues

This will release the APP processors and tell NQS to start someone else's job, while allowing your job to complete all of its normal single-processor work.

It's important to add the above lines before ANY "post-mpprun" commands. Even innocent looking commands, like "mv" or "cp", can take a very long time (if the file system is busy or the files big, for instance).

As an example, this script:


  #QSUB -q mpp
  #QSUB -l mpp_p=50
  #QSUB -l mpp_t=4:00:00
  
  cd /u1/uaf/morris/progs
  mpprun -n50 ./myprog data1
  mv restart.* ../RESTART/
  cat myprog.log >> run.log

would be modified to this:

  #QSUB -q mpp
  #QSUB -l mpp_p=50
  #QSUB -l mpp_t=4:00:00
  
  cd /u1/uaf/morris/progs
  mpprun -n50 ./myprog data1

  qalter -l mpp_p=0                    # release parallel processors
  echo | qsub -q mpp -eo -o /dev/null  # force NQS to rescan queues

  mv restart.* ../RESTART/
  cat myprog.log >> run.log
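
An existing script can be checked for this ordering mechanically. The following is only a sketch: "check_release" is a hypothetical helper, not an ARSC or NQS tool, and it assumes that each "mpprun" and "qalter" starts its own line (so a line such as "time mpprun ..." would be missed).

```shell
# check_release SCRIPT
# Succeeds if the first command after the last "mpprun" line in SCRIPT
# is a "qalter", or if nothing follows the last "mpprun" at all.
# (Hypothetical helper; the name and the line-matching rules are assumptions.)
check_release() {
    awk '
        { line[NR] = $0 }                          # keep every line for the END pass
        /^[ \t]*mpprun([ \t]|$)/ { last = NR }     # remember the last mpprun line
        END {
            if (!last) exit 0                      # no mpprun: nothing to release
            for (i = last + 1; i <= NR; i++) {
                if (line[i] ~ /^[ \t]*(#.*)?$/)    # skip blank and comment lines
                    continue
                exit ((line[i] ~ /^[ \t]*qalter[ \t]/) ? 0 : 1)
            }
            exit 0                                 # mpprun was the final command
        }
    ' "$1"
}
```

Run as, for example, check_release myjob.qsub && echo "PEs are released promptly" (the file name is made up). The first script above would fail this check and the modified one would pass.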

NOTES:
  1. Here's an explanation of the command,
    
        echo | qsub -q mpp -eo -o /dev/null  # force NQS to rescan queues
    
    It is required because "qalter" does not, by itself, tell NQS to check for waiting jobs. The command submits a job that will run but has no effect other than to awaken NQS. Here's what the options do:
    • "-q mpp", in the absence of any "-l" options, runs the job in yukon's single queue.
    • "-eo" causes the script's STDERR to be merged with its STDOUT.
    • "-o /dev/null" sends the STDOUT to /dev/null (instead of creating a ".o" file in your directory).

    The "echo |" pipes a new-line character to the STDIN of "qsub". This is needed because "qsub", if no script file is given, expects input from STDIN. Without the "echo |", the command will hang, waiting for input.

  2. For anyone who is "chaining" jobs, the above is an alternate method for submitting the required "do_nothing" job that we described in newsletter #176 ( /arsc/support/news/t3enews/t3enews176/index.xml ).

    Also, if you're chaining, please move the above pair of lines from the end of your script (if that's where they are now) to immediately following the last "mpprun".

  3. Please send us feedback on yukon throughput. We hope to see improvements if everyone honors the above request. Also, let us know if you have other ideas for improving efficiency.
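
The STDIN behavior described in note 1 can be tried on any Unix system, without NQS. In this sketch "sh" stands in for "qsub" (an assumption made only so the example runs anywhere): like "qsub", "sh" reads its script from STDIN when no script file is named. The helper name "feed_empty_script" is made up for the illustration.

```shell
# "echo" with no arguments writes exactly one new-line character.
nbytes=$(echo | wc -c | tr -d ' ')
echo "echo wrote $nbytes byte(s)"

# feed_empty_script COMMAND [ARGS...]
# Pipe that one-line, empty "script" to a command that reads STDIN.
# The reader sees end-of-file right after the new-line, so it returns
# at once instead of waiting on the terminal.
# (Hypothetical helper; "sh" below is a stand-in for "qsub".)
feed_empty_script() {
    echo | "$@"
}

feed_empty_script sh
echo "reader exit status: $?"
```

Typing "sh" alone at a terminal, with no pipe, waits for input until you press Control-D, which is the same waiting that "qsub" would do without the "echo |".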

Faster T3E Code

Several ARSC users have recently reported significant improvements by using the split2 option on the f90 compiler.

This option splits loops to make better use of the streams into memory hardware. In some cases there is no gain; in others, users report a 10-20% improvement. As with any higher level of optimization, you should run test cases, both to measure the change in performance and to verify that the results are still correct.

As discussed in the article, "CF90 Optimization Options," in newsletter #127,

/arsc/support/news/t3enews/t3enews127/index.xml

an aggressive set of options to try with the f90 compiler might be:

  -O3,aggress,unroll2,pipeline2,split2  
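
The correctness half of that testing can be scripted. What follows is a minimal sketch, not an ARSC procedure: "results_match" is a hypothetical helper, the file names (myprog.f, base.out, opt.out) are made up, and the f90 and mpprun lines are left as comments because they assume a T3E and your own source file.

```shell
# results_match FILE1 FILE2
# Succeeds only if the two output files are byte-for-byte identical.
# (Hypothetical helper; see the note after the block about tolerances.)
results_match() {
    cmp -s "$1" "$2"
}

# On yukon, the two builds and timed runs might look like this
# (assumed file names; the option string is the one quoted above):
#
#   f90 -o myprog.base myprog.f
#   f90 -O3,aggress,unroll2,pipeline2,split2 -o myprog.opt myprog.f
#   time mpprun -n50 ./myprog.base data1 > base.out
#   time mpprun -n50 ./myprog.opt  data1 > opt.out

# Compare the outputs only if both runs have actually been made.
if [ -f base.out ] && [ -f opt.out ]; then
    if results_match base.out opt.out; then
        echo "results identical"
    else
        echo "results differ: inspect with diff base.out opt.out"
    fi
fi
```

A byte-for-byte comparison is probably too strict for many codes, since aggressive optimization can change floating-point rounding; in that case compare the numbers within a tolerance that suits your application.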

Three good books for spring reading

The Clockwork Muse: A practical guide to writing theses, dissertations, and books. Eviatar Zerubavel. Harvard University Press. ISBN 0-674-13586-5.

The Nature of Mathematical Modeling. Neil Gershenfeld. Cambridge University Press. ISBN 0-521-57095-6.

Structured Adaptive Mesh Refinement (SAMR) Grid Methods. Scott B. Baden, Nikos Chrisochoides, Dennis Gannon, Michael Norman. Springer. ISBN 0-387-98921-8.

Reading the above set should give you ideas on how to improve your algorithms and tell people how you did it in time for the summer conference season.

CUG SUMMIT 2000 Preliminary Program Now On-Line

The complete Preliminary Program for the upcoming CUG SUMMIT 2000 in Noordwijk is on-line.

It is accessible from the CUG home page at http://www.cug.org/

from the European server at http://cug2000.sara.nl/ (Noordwijk CUG SUMMIT 2000 Home Page)

from the US server at http://www.fpes.com/cug2000/ (CUG Office CUG SUMMIT 2000 Home Page)

Arctic Climate Modelers to Meet in Fairbanks, September 2000

From an announcement we received recently:

> ARC-MIP, the Arctic Regional Climate Model Intercomparison Project will
> hold its first meeting at Fairbanks, Alaska, USA, on September 13-15.
> This meeting will coincide with the WCRP ACSYS Numerical
> Experimentation Group meeting which will be held the same week on
> September 11-12.
> 
> In ARC-MIP, models developed by research teams from Europe, Australia,
> USA, and Canada are invited to perform a common set of simulations over
> two common domains: one that covers much of the Arctic Ocean, and a
> second that concentrates at higher grid resolution over the western
> Arctic corresponding to the location of the SHEBA ice camp,
> 
>   http://sheba.apl.washington.edu/.
>
> Participation in the workshop and the ARC-MIP project is open. Modelers
> and observationalists are invited to attend this workshop. The first
> workshop will be rather exploratory. Difficulties encountered in
> modeling the Arctic (clouds, surface schemes, dynamics, etc.) will be
> discussed as well as observations taken during the SHEBA experiment.
> Issues of funding will be discussed. The common simulations to be
> performed will also be defined during the workshop. Oral presentations
> by workshop participants are welcome.
> 
> A preliminary ARC-MIP web site is located at:
> 
>   http://cires.colorado.edu/lynch/workshop/

Quick-Tip Q & A



A: {{ I want to share some directories with members of my group, giving
   {{ them write access. This command:
   {{
   {{    chmod -R g+rwX ~
   {{
   {{ would work, but I don't want anyone messing with my dot files.
   {{ How can I share a directory with my group?


    Make your home directory group-executable (only).  Group members
    will be able to "cd" into your home directory, but neither see nor
    change anything (thus, your dot files will be safe).  This command
    would do it:

      chmod g=x ~

    Next, give your group rx or rwx permission on the desired
    subdirectories and files.  You might create a subdirectory,
    "GROUP", and store in it everything needed by the group.  The
    following command adds group-read and -write permission to
    everything below GROUP and adds group-execute to everything below
    GROUP that was owner-execute.

      chmod -R g+rwX ~/GROUP

    With this set-up, group members can cd through the "blind" home
    directory into GROUP, assuming they know in advance of its
    existence.  Once in GROUP they have full access.

    (By the way, it's a security policy violation at ARSC to give write
    access to your home directory, except to yourself.)
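
    The scheme can be tried safely in a scratch directory before touching a real home directory. This is only a sketch: the scratch directory from "mktemp -d" stands in for "~", and the file names and the helpers "show_mode" and "demo_group_share" are made up for the illustration.

```shell
# show_mode PATH
# Print only the 10-character type-and-permission field of "ls -ld".
show_mode() {
    ls -ld "$1" | cut -c1-10
}

# demo_group_share
# Build a stand-in home directory, apply the two chmod commands from
# the answer above, and print the resulting modes. The body runs in a
# subshell so the umask change does not leak into the calling shell.
demo_group_share() (
    umask 077                                  # start from owner-only files
    home=$(mktemp -d) || exit 1                # stand-in for ~ (assumption)
    mkdir "$home/GROUP"
    touch "$home/.profile" "$home/GROUP/notes.txt" "$home/GROUP/run.sh"
    chmod u+x "$home/GROUP/run.sh"             # one owner-executable file

    chmod g=x "$home"                          # group may pass through only
    chmod -R g+rwX "$home/GROUP"               # open GROUP and its contents

    show_mode "$home"                          # the "blind" home directory
    show_mode "$home/.profile"                 # dot file, untouched
    show_mode "$home/GROUP"
    show_mode "$home/GROUP/notes.txt"
    show_mode "$home/GROUP/run.sh"
    rm -rf "$home"
)

demo_group_share
```

    The five lines printed should read drwx--x---, -rw-------, drwxrwx---, -rw-rw---- and -rwxrwx---: the home directory is searchable but not listable by the group, the dot file is unchanged, and only the directory and the file that was already owner-executable gain group-execute from the capital "X".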



Q:  I'm in four Unix permission groups:

      yukon$ groups 
      hyprfast bigfoot wulfdown marsbar

    Whenever I create a new file, it's in "hyprfast," but I've left
    those guys behind.  So I'm always running, or forgetting to run,
    chgrp:

      yukon$ chgrp wulfdown new.file.out

    How can I change my default group to "wulfdown"?

[ Answers, questions, and tips graciously accepted. ]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.