ARSC HPC Users' Newsletter 282, December 5 2003

HLRS Wins Grid Contest at SC2003... with ARSC participation

ARSC contributed cycles from iceflyer to a global computational grid as part of the HPC Challenge at SC2003. The grid used a distributed version of MPI to run a bioinformatics code analyzing the evolution of arthropods.

The organizer of the project was HLRS, the Stuttgart High Performance Computing Center, and this project won the SC2003 HPC Challenge award.

A world map showing the participants, plus more descriptions of the software and project, is available at:

http://www.hlrs.de/news-events/2003/sc2003/HPC-CHALLENGE/

X1: PBS Limits Issues

TIME REQUESTS:

PBS scripts on the ARSC X1 should specify only one time limit, "walltime".

The T3E concept of "MPP time" is not supported. "CPU time" is supported, but it is cumulative, so a script for an N-CPU job would need to request the desired per-processor CPU time multiplied by N. We have rejected this limit as too confusing. "Per Process CPU time" is also supported, but a PBS job can run a series of processes as well as parallel processes, so this limit could be abused.

Thus, for now at least, we've settled on "walltime" as the only configured time limit for PBS scripts. The following would request 8 hours:

#PBS -l walltime=8:00:00

Note that "walltime" only accumulates while the job is in the "run" state. Thus, the job isn't cheated out of time by being migrated or checkpointed.

CPU REQUESTS:

If your X1 batch job needs more than the default of 1 MSP, then your PBS script must request the number of processors. Use either the "mppe" or "mppssp" limit, but not both, as follows:

#PBS -l mppe=<NUMBER OF MSPS>

--or--

#PBS -l mppssp=<NUMBER OF SSPS>

We recently discovered that if a script requests both "mppe" and "mppssp", PBS sums the two requests to determine the overall resource requirements of the job. A user was thus inadvertently requesting more processors than exist on the system, and PBS (appropriately) never started his job.
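
A quick check before submitting can catch this mistake. The following is only a minimal sketch, not an ARSC-provided tool: the function name "check_pbs_limits" is hypothetical, and it assumes the limits appear on "#PBS -l" directive lines as in the examples in this article.

```shell
# Hypothetical helper: warn if a PBS script requests both mppe and mppssp.
# Assumes each limit appears on a "#PBS ... -l" directive line.
check_pbs_limits() {
    script="$1"
    has_mppe=0
    has_mppssp=0
    if grep -q '^#PBS.*-l.*mppe=' "$script"; then has_mppe=1; fi
    if grep -q '^#PBS.*-l.*mppssp=' "$script"; then has_mppssp=1; fi
    if [ "$has_mppe" -eq 1 ] && [ "$has_mppssp" -eq 1 ]; then
        echo "$script: requests both mppe and mppssp; PBS will sum them" >&2
        return 1
    fi
    return 0
}
```

Calling "check_pbs_limits myjob.pbs" (a made-up file name) returns non-zero and prints a warning only when both limits are present.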

It's not required, but for clarity and consistency we suggest that your PBS script request MSPs if the executable code it will run was compiled in MSP mode. Similarly, request SSPs for executables compiled in SSP mode.

The system doesn't actually care. If your request is large enough, and you make the conversion 1 MSP == 4 SSPs, you can run an SSP-mode application in a script which requested MSPs and vice versa.
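
The conversion can be sketched in shell arithmetic. The function names below are hypothetical, and rounding a partial MSP up to a whole one is our assumption for making the request "large enough", not documented PBS behavior.

```shell
# Hypothetical helpers illustrating the 1 MSP == 4 SSPs conversion.

# Number of SSPs covered by a request for N MSPs.
msp_to_ssp() {
    echo $(( $1 * 4 ))
}

# Number of MSPs needed to cover N SSPs.
# Assumption: a partial MSP is rounded up so the request is large enough.
ssp_to_msp() {
    echo $(( ($1 + 3) / 4 ))
}
```

For example, "msp_to_ssp 4" prints 16, so a request for 4 MSPs covers a 16-SSP run.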

We don't anticipate much need for this, but if you execute both MSP and SSP applications from the same PBS script, you should still request only one of mppe or mppssp, not both.

E.g.,


#PBS -q default          # Submit to routing queue "default"
#PBS -l mppe=4           # Request 4 MSPs (16 SSPs)
#PBS -l walltime=8:00:00 # Request 8 hours walltime
#PBS -j eo               # Combine stderr/stdout in output file 
#PBS -S /bin/sh          # Use this shell

# cd back to the directory from which the script was submitted
cd $QSUB_O_WORKDIR

mpirun -np 4  ./a.out_msp   # run an MSP mode executable
mpirun -np 16 ./a.out_ssp   # run an SSP mode executable

Iceflyer Hardware Upgrade

The 32 processors of ARSC's IBM Regatta server have been upgraded to 1.7GHz (6.8 GFLOPS) p690+ processors.

As described in issue #277, iceflyer is partitioned into two nodes. Given this upgrade, these nodes now have the following attributes:

ferry (interactive node): 7 p690+ processors: 14 GB memory
umiak (batch node): 24 p690+ processors: 48 GB memory

A memory increase to a total of 256GB is scheduled for later.

EMBOSS installed on IBM p690+, iceflyer

From the EMBOSS web site, http://www.hgmp.mrc.ac.uk/Software/EMBOSS/


> EMBOSS is "The European Molecular Biology Open Software Suite".
> 
> EMBOSS is a new, free Open Source software analysis package specially
> developed for the needs of the molecular biology (e.g. EMBnet) user
> community. The software automatically copes with data in a variety of
> formats and even allows transparent retrieval of sequence data from the
> web. Also, as extensive libraries are provided with the package, it is a
> platform to allow other scientists to develop and release software in
> true open source spirit. EMBOSS also integrates a range of currently
> available packages and tools for sequence analysis into a seamless
> whole. EMBOSS breaks the historical trend towards commercial software
> packages.
> 
> The EMBOSS suite:
> 
>   - Provides a comprehensive set of sequence analysis programs 
>   - Provides a set of core software libraries (AJAX and NUCLEUS)
>   - Integrates other publicly available packages
>   - Encourages the use of EMBOSS in sequence analysis training
>   - Encourages developers elsewhere to use the EMBOSS libraries
>   - Supports all common Unix platforms
> 
> Within EMBOSS you will find around 100 programs (applications). 
> These are just some of the areas covered:
> 
>   - Sequence alignment
>   - Rapid database searching with sequence patterns
>   - Protein motif identification, including domain analysis
>   - Nucleotide sequence pattern analysis, for example to identify CpG 
>       islands or repeats
>   - Codon usage analysis for small genomes
>   - Rapid identification of sequence patterns in large scale sequence 
>       sets
>   - Presentation tools for publication
>   - And much more

A Java GUI for the EMBOSS suite, called "jemboss", is available at ARSC. Before running jemboss, copy the following file to your home directory:

/usr/local/pkg/jemboss/jemboss-1.1/jemboss.properties

Quick-Tip Q & A


A:[[ Sorry, I can't convert decimal to hex in my head.  Anyone have a
  [[ handy way to convert numbers between bases?  A Unix utility or
  [[ something better, perhaps?


#
# Five completely different answers... something for everyone...
#

#
# From Jed Brown
# 

A quick answer is "bc," the POSIX arbitrary precision calculator.  There
are two special variables, ``ibase'' and ``obase'' which specify the
base of input and output respectively. There is no restriction to
integers. The following command returns, in base 16, the value of the
Bessel function J_5(7.3).

% echo 'obase=16; j(5,7.3)' | bc -l
.504F0C308EAD23048


#
# From Kurt Carlson
# 

While I prefer to use my handy HP calculator, nawk is your friend:

% echo 100 | nawk '{printf("%d=0x%x\n",$1,$1)}'
100=0x64


#
# From Brad Chamberlain
# 

Here's my quick-and-dirty cheat:  fire up gdb.  To get hex->decimal,
just ask to print a hex expression, C-style:

(gdb) p 0x1234abcd
$1 = 305441741

Cast your base-10 integer as a pointer to go back from decimal->hex:

(gdb) p (void*)305441741
$2 = 0x1234abcd

Not the most appropriate tool for the job perhaps, but since it's one I
already use and am comfortable with, it tends to be easier for me to
remember how to use than any other...


#
# From Greg Newby
# 

If you just want a table of values for letters and numbers, use "man
ascii" which lists values from 0-255 and their ASCII equivalent, in hex,
octal and decimal.

Another way is the Unix "printf" command (not to be confused with the
functions in C and other languages with the same name).

Print a decimal value in its hex equivalent (to two places):  
% printf "%02x\n" 255 
ff

Print a hex value in its decimal equivalent:
% printf "%03d\n" 0x100
256
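
As a sketch building on the printf approach (not part of the original reply), the conversions can be wrapped in small shell functions. The function names are made up for illustration. The hex input is given without the "0x" prefix, which the wrapper adds.

```shell
# Hypothetical wrappers around the printf command (names are illustrative).

# Decimal to hex, e.g. 255 -> ff
dec2hex() { printf '%x\n' "$1"; }

# Hex (without the 0x prefix) to decimal, e.g. 100 -> 256
hex2dec() { printf '%d\n' "0x$1"; }

# Decimal to octal, e.g. 8 -> 10
dec2oct() { printf '%o\n' "$1"; }
```

Once these are defined, "dec2hex 255" prints ff and "hex2dec 100" prints 256, matching the two examples above.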


#
# From the editor
# 

Korn shell users can define the base of an input literal with the
syntax, "[base]#value".  

Set the output base of a variable with the command, 
"typeset -i[base] variable". (The default base is 10.)

% let h=2#1000000 
% echo $h        
64
% 
% let j=1023
% typeset -i2 j 
% echo $j      
2#1111111111
% typeset -i16 j
% echo $j
16#3ff
% let j=2#1000000
% echo $j       
16#40



Q: I'm finally appreciating the benefits of the "find" command, but
   here's a problem.  

   When I use grep from a find command, grep doesn't tell me the names
   of the files!  Sure I've got hits, but what good is it if I can't
   tell what files they're in?

     % find . -name "*.f" -exec grep -i flush6 {} \;
                  include(flush6)
                     !!dvo!! include(flush6)
               !!dvo!! include(flush6)
                  include(flush6)
              include(flush6)
              include(flush6)

   Any suggestions?

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.