ARSC HPC Users' Newsletter 391, July 25, 2008

Tom Baring Passes the Editor's Torch

[ By: Tom Baring ]

Mike Ess emailed "ARSC T3D Users' Group Newsletter" issue #1 in August 1994. This was before ARSC had a web site.

The T3D, with 128 processors and total peak performance of, OMG, 19.2 gflops, was a source of pride, even if almost no one knew what to do with it.

T3D jobs were constrained to powers-of-two numbers of processors, and the machine lived as an unnamed appendage to "denali," the Y-MP M89, the largest memory shared-memory supercomputer at the time, with a whopping 8 GB!

This was only 14 years ago.

Then came the web edition of the T3D Newsletter, Mike left for Seattle, I became the editor (was beer involved?), then in 1997 the T3E (yukon) and Guy Robinson arrived. Guy became co-editor, and the "T3D Users' Group Newsletter" morphed into the "T3E Users' Newsletter".

The T3E was probably the best machine we ever had. Users loved the low latency, and even seven years later, in 2004, we had to chase them off with a broom before we could unplug it.

We'd had other machines to talk about during yukon's long life, though, including icehawk, iceflyer, iceberg, chilkoot, and rime (respectively, the SP3, P690+, P655+/P690+, J90/SV1/SV1e/SV1ex, and NEC, I mean Cray, SX-6) so, in 2000, with issue 201, we promoted the "T3E" to the "HPC" newsletter.

Guy left for Japan, but Don Bahls stepped in to co-edit, we installed klondike (the Cray X1) and midnight (Sun Cluster), and this fall comes pingo (Cray XT5).

It's been great researching topics for articles, dreaming up contests and sending sourdough starter to the lucky winners, mailing letters from Santa at Christmas time, debating English usage with Guy and Don, turning user problems into quick-tips, and getting to know you readers.

Certain people are always good for a ZSH answer. Others only seem to know python!

My interests have moved sideways to web technologies, though, so while I'm still at ARSC and will definitely miss editing this newsletter, I'm happy to announce that Don has a new co-editor, ARSC HPC Specialist and Group Leader, Ed Kornkven. I'm confident they'll keep it informal and keep the focus on prompt sharing of information for the folks in the trenches, optimizing the loops, moving terabytes, and queuing the jobs.

Another PBS Dependencies Article

[By: Don Bahls]

We have done a few articles on how to set up job dependencies in PBS (see newsletters 319, 320 and 322). Those articles gave an idea of the dependency options available in PBS and how to use them with job chaining. In this article we will take a look at a few ways you can script the setup of dependencies when you want to submit multiple jobs at once.

When you submit a job using the "qsub" command, PBS displays the jobid of the job.

E.g.


  mg56 % qsub myjob1.pbs
  520613.mpbs1

Using the bash/ksh command substitution operator (i.e. $( command )) we can save the jobid that is returned from "qsub".

E.g.


  mg56 % jobid=$(qsub myjob1.pbs) 
  # The variable $jobid is now set to the jobid.
  mg56 % echo $jobid
  520616.mpbs1

Both csh and tcsh support command substitution with the "backtick" operator.

E.g.


  mg56 % set jobid=`qsub myjob1.pbs`
  mg56 % echo $jobid
  520617.mpbs1
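
In bash, an assignment from command substitution also carries the exit status of the command inside it, so the same line that saves the jobid can tell you whether "qsub" succeeded. Below is a minimal sketch of that idea, assuming bash. The function name "submit_job" is hypothetical, made up for illustration, and is not part of PBS.

```shell
#!/bin/bash

# submit_job SCRIPT
# Hypothetical helper: prints the jobid on success, returns non-zero
# (with a message on stderr) if qsub fails.
submit_job() {
    # Declare the variables first; combining "local" with the assignment
    # on one line would hide the exit status of qsub.
    local script="$1" id
    if ! id=$(qsub "$script"); then
        echo "qsub failed for $script" >&2
        return 1
    fi
    echo "$id"
}
```

It could be used as "jobid=$(submit_job myjob1.pbs) || exit 1" at the top of a submission script.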

If you use the same type of dependency (e.g. "depend=afterok") for each job you can easily loop through the list of scripts to submit.

E.g.


  mg56 % more submit_jobs.bash
  #!/bin/bash

  jobid="";
  for job in myjob1.pbs myjob2.pbs myjob3.pbs; do
      if [ "$jobid" = "" ]; then
          jobid=$(qsub $job) || exit 1
          echo "Submitting $job --" $jobid;
      else
          oldid=$jobid
          jobid=$(qsub -W depend=afterok:$jobid $job) || exit 2
          echo "Submitting $job --" $jobid " (dependent on $oldid)";
      fi
  done

Notice, this short script uses the "afterok" dependency for each PBS script which makes things simpler. It also verifies that the job was successfully submitted. Running the script sets up the required dependencies for each job.

E.g.


  mg56 104% ./submit_jobs.bash 
  Submitting myjob1.pbs -- 520630.mpbs1
  Submitting myjob2.pbs -- 520631.mpbs1  (dependent on 520630.mpbs1)
  Submitting myjob3.pbs -- 520632.mpbs1  (dependent on 520631.mpbs1)

If you have more complicated dependencies, you might think about using a more sophisticated scripting language like perl or python, or abandon the complexity and list each script individually.

E.g.


  mg56 105% more submit_simple.bash
  #!/bin/bash

  jobid=$(qsub myjob1.pbs) || exit 1
  jobid=$(qsub -W depend=afterok:$jobid myjob2.pbs) || exit 1
  jobid=$(qsub -W depend=afterany:$jobid myjob3.pbs) || exit 1
  jobid=$(qsub -W depend=afterany:$jobid myjob4.pbs) || exit 1
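
As one example of a slightly more complicated dependency that still fits in a few lines of bash, the sketch below submits several independent jobs and then a final job that waits for all of them. It assumes your PBS accepts a colon-separated list of jobids after the dependency type (e.g. "depend=afterok:id1:id2"); check "man qsub" on your system to be sure. The function name "submit_fan_in" and the script names in the usage line are hypothetical.

```shell
#!/bin/bash

# submit_fan_in FINAL PRE1 [PRE2 ...]
# Hypothetical helper: submits each PRE script with no dependencies, then
# submits FINAL so that it runs only after all of them finish successfully.
# Prints the jobid of FINAL.
submit_fan_in() {
    if [ $# -lt 2 ]; then
        echo "usage: submit_fan_in final.pbs pre1.pbs [pre2.pbs ...]" >&2
        return 2
    fi
    local final="$1"
    shift
    local deps="" id script
    for script in "$@"; do
        # Stop at the first failed submission.
        id=$(qsub "$script") || return 1
        # Build up a string of the form ":id1:id2:..."
        deps="$deps:$id"
    done
    qsub -W "depend=afterok$deps" "$final"
}
```

For instance, "submit_fan_in postprocess.pbs run1.pbs run2.pbs run3.pbs" would queue three runs and one post-processing job that depends on all three.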

Midnight Software Updates

There have been a number of software updates on midnight in the last month. Here's a list of new and updated packages:

  • Git (version control package) version 1.5.6: available via the git or git-1.5.6 modules.
  • Integrated Data Viewer (IDV) version 2.5: available via the idv-2.5 module.
  • Matlab version 7.6.0: available via the matlab-7.6.0 module.
  • NCAR Command Language (ncl) version 5.0.0: available via the ncl-5.0.0 module.
  • PathScale Compiler Suite version 3.2: available via the PrgEnv.path-3.2 module.
  • Portland Group Compiler version 7.2.2: available via the PrgEnv.pgi-7.2.2 module.

See "news software" on midnight for recent software updates.

Challenges 2008

The 2008 edition of ARSC's Challenges publication is now available online. This year Challenges covers a wide variety of topics including: Arctic Ocean modeling, the ARSC Playstation 3 compute cluster, tsunami modeling, ice porosity research, virtual Mars and impact craters.

You can find the 2008 edition of Challenges here:

    http://www.arsc.edu/challenges/

Quick-Tip Q & A


A:[[ When I do an "ls -lt" to see what directories are available,
  [[ I *frequently* want to "cd" into the directory which sorts to the
  [[ top of the listing.
  [[
  [[ Is there a shortcut, so I can do this without typing the name of
  [[ the directory, or even worse, using a GUI interface?


#
# Ryan Czerwiec shared the following solution
#

Here's a command that could be aliased:
cd `ls -lt | grep "^d" | head -1 | awk '{print $NF}'`

This will break if the directory name has spaces in it, but it's a
good start and should cover most users.


#
# Rich Griswold offered this solution.
#

This appears to work:

  cd `ls -td */ | head -1`

However, it can give unexpected results if there are no
subdirectories.  It wouldn't be difficult to make this into a script
with appropriate error handling.
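
Along the lines Rich suggests, here is a rough sketch with some error handling added. It is written as a shell function rather than a script, because a "cd" inside a script would not change the directory of the calling shell. The name "cdnew" is made up for illustration. The sketch assumes bash, skips hidden directories, and will still misbehave on directory names that contain newlines.

```shell
#!/bin/bash

# cdnew: cd into the most recently modified subdirectory of the
# current directory (hypothetical name, for illustration only).
cdnew() {
    local newest
    # "ls -td */" lists only directories, newest first. If there are
    # none, ls complains on stderr (silenced here) and prints nothing.
    newest=$(ls -td -- */ 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "cdnew: no subdirectories here" >&2
        return 1
    fi
    # Quoting keeps directory names with spaces intact.
    cd -- "$newest"
}
```

Put the function in your .bashrc and type "cdnew" after the "ls -lt".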


Q: I have a directory (with subdirectories) that has a bunch of
   duplicate files.  These files could have the same contents, but
   not have the same filename.
  
   Is there a good way to identify files with identical contents
   without doing a diff on each pair of files?

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.