ARSC HPC Users' Newsletter 254, September 13, 2002

Review of Bioinformatics Codes Available at ARSC

[ Thanks to Jim Long for this contribution. ]

ARSC is building a bioinformatics infrastructure to support research in this important new area. To date, the following software has been installed on the SGIs and is planned for the IBM platforms:

BLAST (Basic Local Alignment Search Tool)

A heuristic algorithm for querying a protein or DNA sequence against the available databases. Documentation, and the NCBI web version of the tool, are at: http://www.ncbi.nlm.nih.gov/BLAST/
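
For example, a protein query against nr might look like this from the command line (a minimal sketch using the NCBI toolkit's blastall driver; the query file name and E-value cutoff are placeholders):

  # Hypothetical example: compare the protein query in query.fa against
  # the nr database, reporting matches with E-value <= 1e-5.
  blastall -p blastp -d nr -i query.fa -e 1e-5 -o query.out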

CLUSTALW and CLUSTALX

A heuristic algorithm, based on phylogenetic analysis, for multiple sequence alignment. CLUSTALX is a nice X Window interface to CLUSTALW, adding PostScript output and other features.
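
A typical batch run is a one-liner (a sketch; the input file name is a placeholder):

  # Hypothetical example: align the sequences in globins.fa; clustalw
  # writes the alignment to globins.aln by default.
  clustalw -infile=globins.fa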

FASTA & SSEARCH

FASTA is another heuristic algorithm similar to BLAST, while SSEARCH uses the Smith-Waterman algorithm. Documentation: http://www22.ncifcrf.gov/app/html/SeqAnaly/fasta/fasta_doc.html
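
Both take a query file and a library file on the command line. Here's a sketch (fasta3 and ssearch3 are the version 3 program names, which vary by release; the file names are placeholders):

  # Hypothetical examples: search library.fa with the query in query.fa.
  fasta3   query.fa library.fa    # fast heuristic search
  ssearch3 query.fa library.fa    # rigorous Smith-Waterman search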

HMMER

Protein sequence analysis using hidden Markov models. Documentation: http://hmmer.wustl.edu/
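
The usual HMMER workflow is to build a model from a multiple alignment and then search a sequence database with it. Here's a sketch (the file names are placeholders):

  # Hypothetical example: build an HMM from the alignment globins.aln,
  # then search a sequence database with it.
  hmmbuild  globins.hmm globins.aln
  hmmsearch globins.hmm sequences.fa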

PHYLIP

A collection of software for inferring phylogenies. Documentation: http://evolution.genetics.washington.edu/phylip.html

SEQALN

A collection of software based on a library of functions to align nucleotide and protein sequences using Smith-Waterman. Documentation: http://hto-13.usc.edu/software/seqaln/

---

Currently, about 10 GB of nucleotide and protein sequence data are maintained on ARSC systems and updated monthly. All data are additionally formatted for use by BLAST (see the sketch after the list below). The current base files are:

  ecoli.aa    E. coli genomic CDS translations (peptides).
  ecoli.nt    E. coli genomic nucleotide sequences.
  human.nr    All sequences from nr that have [Homo sapiens] in the comment field.
  mouse.nr    All sequences from nr that have [Mus musculus] in the comment field.
  nr          Non-redundant GenBank CDS translations + PDB + SwissProt + PIR (peptides).
  nt          Non-redundant GenBank + EMBL + DDBJ + PDB nucleotide sequences (no EST, STS, GSS, or HTGS).
  yeast.aa    Yeast (Saccharomyces cerevisiae) protein sequences.
  yeast.nt    Yeast (Saccharomyces cerevisiae) genomic nucleotide sequences.
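
As noted above, each base file is also formatted for BLAST. That step looks roughly like this with the NCBI toolkit's formatdb (a sketch; -p selects protein (T) or nucleotide (F) input, and -o T parses SeqIds):

  # Hypothetical examples: build BLAST indices from the FASTA files.
  formatdb -i ecoli.nt -p F -o T    # nucleotide database
  formatdb -i ecoli.aa -p T -o T    # protein database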

LoadLeveler, Who Am I?

MPI, PVM, SHMEM, and OpenMP all have mechanisms that allow a given task (or thread) to determine:

  1. how many tasks there are in total, and
  2. its own identity.

This is the most basic information needed to parallelize a workload (in MPI, for instance, MPI_Comm_size and MPI_Comm_rank provide it).

The same information can be obtained for processes started on different nodes of an IBM SP by poe (the Parallel Operating Environment) under LoadLeveler.

This can be used to parallelize some naturally parallel jobs by simply having poe launch multiple copies of a serial or threaded program, with each such process reading and crunching a different input file (a sketch of such a wrapper appears at the end of this article).

How can a process determine how many nodes there are, and which one it is?

One way is to use the LoadLeveler environment variable LOADL_PROCESSOR_LIST to get the complete list of nodes (by name), and the Unix "hostname" command to get each individual name.

What follows is a Perl subroutine, "getNodeInfo", which implements this idea, contained in a test Perl script.

"getNodeInfo" sets the four global variables,

  • LL_NUM_NODES
  • LL_MY_NODE_NUM
  • LL_MY_PROCESSOR_NAME
  • LL_PROCESSOR_LIST
for use by the calling Perl script.

There's one complication (which may not apply on your cluster): LOADL_PROCESSOR_LIST returns switch names rather than node names. On ARSC's SP, a simple "tr" converts these names, as shown in the script:


#!/usr/local/bin/perl -w

use strict;

my ($myHost, $myNode, $hostList, $nNodes, @procs);

&getNodeInfo;

print "LL_NUM_NODES=$nNodes;\n";
print "LL_MY_NODE_NUM=$myNode;\n";
print "LL_MY_PROCESSOR_NAME=$myHost;\n";
print "LL_PROCESSOR_LIST=$hostList;\n";

# print "export LL_NUM_NODES LL_MY_NODE_NUM LL_MY_PROCESSOR_NAME;\n";



######################################################################
sub getNodeInfo () {
  my ($n);

  # For testing: --------------------------------------------
#   $hostList = "i1s1 i1s11 i1s16 i1s2 i1s7 i1s8 i2s18 i2s19 i2s21 i2s30";
#   $myHost = "i1n1";

  # For real:    --------------------------------------------
  $hostList = $ENV{LOADL_PROCESSOR_LIST};
  $myHost = `hostname`;

  chomp $myHost;
  @procs = split (' ', $hostList);   # split on runs of whitespace


  $n = 0;
  $myNode = -1;

  foreach (@procs) {
    if ($myHost eq $_) {
      $myNode = $n;
    }
    else {
      #
      # Needed because "node" names in "LOADL_PROCESSOR_LIST"
      # appear as their corresponding "switch" names, which
      # are identical, except with the "n" translated into an "s".
      #

      tr/s/n/;  
      if ($myHost eq $_) {
        $myNode = $n;
      }
    }

    $n++ ;
  }

  $nNodes = $n;

  if ($myNode == -1) {
    print STDERR "ERR: $0 hostname $myHost not found in list $hostList\n";
  }
}

Here's a LoadLeveler script to test this on five nodes of icehawk:

  #!/bin/ksh
  #
  # @ output = $(Executable).$(Cluster).$(Process).out
  # @ error = $(Executable).$(Cluster).$(Process).err
  # @ notification  = never
  # @ wall_clock_limit=60
  # @ job_type = parallel
  # @ node = 5
  # @ tasks_per_node = 1
  # @ network.mpi = css0,not_shared,US 
  # @ class = large
  # @ node_usage = not_shared
  # # @ node_usage = shared
  # @ queue

  export POE=/usr/bin/poe

  cd /u1/uaf/baring/LoadLeveler

  echo "hostnames: "
  $POE  hostname

  echo "llNodeNum.prl: "
  $POE ./llNodeNum.prl

And here's the output from a run of this LoadLeveler script:

  ICEHAWK2$ cat test_llNodeNum.ll.4628.0.out
  hostnames: 
  i1n5
  i2n20
  i2n23
  i3n40
  i3n44
  llNodeNum.prl: 
  LL_NUM_NODES=5;
  LL_MY_NODE_NUM=2;
  LL_MY_PROCESSOR_NAME=i2n23;
  LL_PROCESSOR_LIST=i1s5 i3s44 i2s23 i3s40 i2s20 ;
  LL_NUM_NODES=5;
  LL_MY_NODE_NUM=3;
  LL_MY_PROCESSOR_NAME=i3n40;
  LL_PROCESSOR_LIST=i1s5 i3s44 i2s23 i3s40 i2s20 ;
  LL_NUM_NODES=5;
  LL_MY_NODE_NUM=1;
  LL_MY_PROCESSOR_NAME=i3n44;
  LL_PROCESSOR_LIST=i1s5 i3s44 i2s23 i3s40 i2s20 ;
  LL_NUM_NODES=5;
  LL_MY_NODE_NUM=0;
  LL_MY_PROCESSOR_NAME=i1n5;
  LL_PROCESSOR_LIST=i1s5 i3s44 i2s23 i3s40 i2s20 ;
  LL_NUM_NODES=5;
  LL_MY_NODE_NUM=4;
  LL_MY_PROCESSOR_NAME=i2n20;
  LL_PROCESSOR_LIST=i1s5 i3s44 i2s23 i3s40 i2s20 ;
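
Since "getNodeInfo" prints its results as shell-style assignments, one natural use is a ksh wrapper, launched by poe on every node, that evals those assignments and picks an input file by node number. Here's a sketch, with hypothetical program and file names; the LL_PROCESSOR_LIST line is filtered out because its space-separated value would need quoting before it could be eval'd:

  #!/bin/ksh
  # Hypothetical wrapper, launched on each node as: $POE ./wrapper.ksh
  # Pick up this task's node number from llNodeNum.prl.
  eval $(./llNodeNum.prl | grep -v LL_PROCESSOR_LIST)

  # Each copy crunches its own input file, e.g., input.0 ... input.4.
  ./crunch < input.$LL_MY_NODE_NUM > output.$LL_MY_NODE_NUM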

Let us know if you make use of this, and we'll have another article for the newsletter!

UAF / ARSC Courses

The following for-credit UAF courses are taught using ARSC hardware and with sponsorship from ARSC.

ART 472 Visualization and Animation Bill Brody

An introduction to visualization and animation with applications in fine and commercial art and science. Students will produce a series of three-dimensional animation projects that will introduce them to the tools and concepts used by animation and visualization professionals.

PHYS 693 Concepts in Parallel Scientific Computation Guy Robinson

This course will introduce the concepts of parallel scientific computation. Offered primarily in support of graduate students in the physical sciences, it is designed for students with research interests requiring the application of parallel computation techniques to specific science applications. Topics will include the basics of problem decomposition and how to identify the necessary communication, with particular attention to the scalability and portability of the algorithm. Techniques to assess the reliability, stability, and validity of large-scale scientific computations will also be covered. After successful completion of the course, students will be able to solve scientific problems on the parallel computers commonly found in the modern research environment.

BIOL F693/ CHEM F693 Special Topics in Bioinformatics Nat Goodman

This 2-credit course, brought to you by the Institute of Arctic Biology, the UAF Chemistry department, the Arctic Region Supercomputing Center, and the UAF Biology department, will introduce the concepts of bioinformatics through lectures, presentations, and student projects.

Topics:

  • Sequence comparison and database search: Smith-Waterman, FASTA, BLAST and similar methods.
  • Advanced sequence analysis methods: Hidden Markov Models, profile search, etc.
  • Microarray analysis: statistical issues, clustering, and other numeric methods.
  • Systems biology: pathway modeling and simulation, inference of regulatory networks.

ARSC Training, Fall 2002

All classes will be held at UAF in Butrovich Bldg., Room 007, starting at 2:00 p.m.

September 25th:

User's Introduction to ARSC Supercomputers.

  • Introduction to ARSC's supercomputers
    • Architectures and capabilities of the Cray SV1ex, Cray SX-6, Cray T3E, IBM SP cluster, IBM Regatta, and Linux cluster.
    • Programming models
  • Programming Environments:
    • Compilers
    • Debuggers
    • Performance analysis tools
  • Running jobs
    • Interactive and batch
    • Submitting batch jobs
    • Checking job status

October 2nd:

Introduction to using the ARSC IBM SP and P690 (Regatta)

This course is an introduction to the ARSC IBM systems, icehawk and iceflyer.

Topics to be covered include:

  1. Architecture and storage overview
  2. Compilers and options
  3. LoadLeveler
  4. Performance monitoring/profiling
  5. Mixed-mode programming (MPI and OpenMP)
  6. A few words on writing portable makefiles and code

The class is intended for those who already have some computing experience and are interested in running codes on the ARSC IBM systems. After attending this class, users will be able to determine which of the two ARSC IBM systems best suits their needs, and how to develop, optimize, and debug codes in the ARSC IBM environment.

More details and registration information will be available presently. Watch ARSC's "Hot-Topics" page.

AAAS Arctic Division Meeting, Next Week at UAF

The American Association for the Advancement of Science (AAAS) Arctic Division 2002 Meeting starts next Wednesday at UAF. For details, see:

http://arctic.aaas.org/meetings/2002/

Several ARSC users and affiliates are presenting research or otherwise involved.

Quick-Tip Q & A

A:[[ What's an "ulp"--a typo? a word? an acronym? I noticed it in issue
  [[ #250.  Should I care?


From the glossary of the CrayDoc manual "Cray T3E(TM) Fortran Optimization
Guide":


  ulp

      Unit of least precision. It is used to discuss the accuracy of
      floating-point data. It represents the minimum step between
      representable values near a desired number: the true result to
      infinite precision, assuming the argument is exact. For instance,
      the ulp of 1.977436582E+22 is 1.0E+13, since the least significant
      digit of the mantissa is in the 10^13 place. Within 0.5 ulp is the
      best approximation representable.

 

Thanks to those who pointed out that "ulp" didn't actually appear in
issue #250. IBM's vector intrinsics documentation, referenced in that
issue, has it. Sorry for the confusion... Here's the relevant section
-- "ulps" appears after the table:


  Mathematical Acceleration SubSystem (MASS)
  MASS Version 2.7

  [...]

  The following table provides sample accuracy data for the libm,
  libmass, libmassv, and libmassvp3 libraries. The numbers are based on
  the results for 10,000 random arguments chosen in the specified
  ranges. Real*16 functions were used to compute the errors. There may
  be portions of the valid input argument range for which accuracy is
  not as good as illustrated in the table.  Also, the user may
  experience accuracy which varies from the table when argument values
  are used which are not represented in the table.

  The Percent Correctly Rounded (PCR) column elements are obtained by
  counting the number of correctly rounded results out of the 10,000
  random argument cases. A result is correctly rounded if the function
  returns the IEEE 64 bit value which is closest to the
  infinite-precision exact result.

                          Math Library Accuracy

                         libm         libmass       libmassv    
  function range     PCR     MaxE   PCR     MaxE   PCR     MaxE
     exp     D       99.95    .50   96.55    .63   96.58    .63
    sexp     E      100.00    .50  100.00    .50   98.87    .52
     sin     B       81.31    .91   96.88    .80   97.28    .72
     sin     D       86.03    .94   83.88   1.36   83.85   1.27
     tan     D       99.58    .53   64.51   2.35   50.48   3.19
     [ ... 25 functions cut, for brevity ... ]


           * indicates hardware instruction was used

    Range Key        PCR  = Percentage correctly rounded
    A =    0,  1     MaxE = Maximum observed error in ulps
    B =   -1,  1
    C =    0,100
    D = -100,100
    E = - 10, 10

  [...]




Q: When I run my MPI program, some tasks start spitting error messages,
   which get all mixed up together, and then it stops.

   I'd like to know which message comes from which task, and, sure, I
   could fix the code so every message is prefaced with the task number,
   but I'd like an easier way.  Do you know one?

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives: Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.