ARSC HPC Users' Newsletter 356, February 23, 2007

IBM - 5 Tips for Building Applications on AIX

A few weeks ago, I was helping a researcher port his code to iceberg. As we were going through the process, I started thinking about all of the little tricks that make the process easier. Moreover, I remembered how difficult it could be to build some applications before I was aware of these tricks.

Tip 1: If you are building a 64 bit application, use the OBJECT_MODE environment variable.

The OBJECT_MODE environment variable is understood by a number of AIX tools, such as xlc, xlf, ar, and nm. Set the OBJECT_MODE environment variable before running configure:

e.g.

   iceberg2 1% export OBJECT_MODE=64
   iceberg2 2% ./configure
   

Be aware that using 64 bit addressing does not alter the size of REAL or INTEGER values (see Tip 3). However, for C programs, -q64 does change the size of "long" values and also increases the size of pointers from 32 bits to 64 bits.

Tip 2: Use the MP_HOSTFILE environment variable or LoadLeveler when running configure scripts for applications which require MPI compilers.

When using MPI compilers during the configure process, it's not uncommon for the configure script to attempt to run the executable generated by the compiler to make sure the compiler works. However, MPI applications need to know, at a minimum, which host to run on. Without a hostfile, you might get an error like this:


    
   iceberg2 3% export CC=mpcc_r
   iceberg2 4% export F77=mpxlf_r
   iceberg2 5% ./configure --enable-mpi
   checking for a BSD-compatible install... ./install-sh -c
   ...
   ...
   checking for C compiler default output... a.out
   checking whether the C compiler works... configure: error: cannot run C compiled programs.

A simple way to get around this error is to create a hostfile and set the environment variable MP_HOSTFILE to point to that file.

e.g.

   iceberg2 6% cat > host << EOF
   `hostname`
   EOF
   iceberg2 7% cat host
   b1n1
   iceberg2 8% export MP_HOSTFILE=`pwd`/host

With the variable MP_HOSTFILE set, the executable created by the configure script will now run.

Alternatively, if the compile process takes a long time or requires a large number of processors, the configure script can be run through LoadLeveler. e.g.

   
   iceberg2 9% cat build.ll
   #!/bin/bash
   # @ error   = $(executable).$(jobid).err 
   # @ output  = $(executable).$(jobid).out 
   # @ notification  = never
   # @ job_type = parallel
   # @ node = 2   
   # @ tasks_per_node = 8 
   # @ network.MPI = sn_all,shared,us
   # @ node_usage = not_shared
   # @ class = standard 
   # @ wall_clock_limit=3600
   # @ queue
   
   export CC=mpcc_r
   export F77=mpxlf_r
   ./configure --enable-mpi 

   gmake
   exit

Tip 3: The -q64 flag and OBJECT_MODE environment variable have no effect on the size of REAL or INTEGER values in Fortran programs.

The -qrealsize and -qintsize compiler options can be used to promote REAL and INTEGER values to 64 bits. Keep in mind that these flags will not affect the precision of explicitly typed variables (e.g. a REAL*4 will not be promoted to REAL*8 by using the -qrealsize=8 flag).

e.g.

   iceberg2 10% xlf90 -qrealsize=8 check_sys.f90 -o check_sys
   iceberg2 11% more check_sys.f90 
   program check_sys
       IMPLICIT NONE
       REAL d
       REAL*4 e
       REAL*8 f
       REAL*16 g
       print *,"Floating Point Size:"
       print *, " sizeof(REAL) =          ", SIZEOF(d)
       print *, " sizeof(REAL*4) =        ", SIZEOF(e)
       print *, " sizeof(REAL*8) =        ", SIZEOF(f)
       print *, " sizeof(REAL*16) =       ", SIZEOF(g)

   end program
  
   iceberg2 12% ./check_sys 
   Floating Point Size:
     sizeof(REAL) =           8
     sizeof(REAL*4) =         4
     sizeof(REAL*8) =         8
     sizeof(REAL*16) =        16

Tip 4: Use the -WF,-Ddefn Fortran option to set C preprocessor definitions on the command line.

Many Fortran codes use preprocessor directives for conditional compilation of code. If a preprocessor macro is defined on the command line, the -D definition must be preceded by "-WF,".


   e.g.
   iceberg2 13% xlf90 -WF,-DMYMACRO=2.5 preproc.F90 -o preproc

This replaces MYMACRO throughout preproc.F90 with 2.5.

Tip 5: Be aware that using the IBM MPI compilers to link non-MPI applications adds Parallel Operating Environment (POE) runtime requirements to the executables created.

In some cases it may seem easier to use the MPI compilers to build an executable; however, be aware that executables created with the IBM MPI compilers will require a hostfile to run even if the application is serial.


   e.g.    
   iceberg2 14% mpxlf90 preproc.o -o preproc
   iceberg2 15% ./preproc 
   ERROR: 0031-808  Hostfile or pool must be used to request nodes
 
   Codes linked with the non-MPI compilers do not have this requirement.
   
   iceberg2 16% xlf90 preproc.o -o preproc
   iceberg2 17% ./preproc 
    Hello World

Spring Training at ARSC

It has been cold again this week in Fairbanks, so it seems odd to be talking about Spring, but it's nearly here. Once again, ARSC has a variety of training opportunities available. Registration is required for all classes.

Title: Emacs as a powerful Integrated Development Environment
Date: Friday March 2, 2007
Time: 1:00 - 5:00 PM
Location: ARSC Classroom, WRRB 009
Instructor: Anton Kulchitsky

Title: The basics of MATLAB
Date: Friday March 9, 2007
Time: 1:00 - 5:00 PM
Location: ARSC Classroom, WRRB 009
Instructor: Chris Fallen

Title: Introduction to IDL (Interactive Data Language)
Date: Friday March 23, 2007
Time: 1:00 - 5:00 PM
Location: ARSC Classroom, WRRB 009
Instructor: Sergei Maurits

Title: Introduction to IDL iTools
Date: Friday March 30, 2007
Time: 1:00 - 5:00 PM
Location: ARSC Classroom, WRRB 009
Instructor: Sergei Maurits

Title: Using MEX and PDE in MATLAB
Date: Friday April 6, 2007
Time: 1:00 - 5:00 PM
Location: ARSC Classroom, WRRB 009
Instructor: Chris Fallen

For a complete description of these training opportunities along with registration information see:

    http://www.arsc.edu/support/training/SpringTraining2007.html

Quick-Tip Q & A


A:[[ I'd like to pipe "ls -l" into "cut" to extract certain fields,
  [[ like group, size, and name, and I'd like to use spaces as delimiters,
  [[ because it's easier to count the fields.  So with this "ls -l" output:
  [[
  [[  -rw-------  1 robert jstme  3183 2007-01-23 14:57 file1
  [[  -rw-------  1 robert  them   973 2006-09-15 08:19 file3
  [[  -rw-------  5   bert justm 15096 2006-11-16 13:04 file2
  [[
  [[ I tried:  
  [[  ls -l | cut -f4,5,8 -d' '
  [[
  [[ but it gave me this garbage:
  [[  robert jstme 2007-01-23
  [[  robert  
  [[    15096
  [[
  [[ Am I doing something wrong?

# 
# Holy Toledo, 11 Answers!  Who needs "cut" when you've got awk,
# perl, uals, stat, tr, and read.
# 
# Thanks to all who replied, and forgive me for trimming answers:
# 

#
# Sean Ziegeler  [ awk ]
# 

Unfortunately, cut treats multiple spaces as multiple delimiters and
therefore counts different numbers of columns.  I am not sure how to
force cut to interpret several spaces as a single separator.  I prefer to
use awk to extract a column of information.  It automatically groups
all whitespace as a single separator (unless told to do otherwise).
The appropriate command for you in awk would be:

  ls -l | awk '{print $4,$5,$8}'

Awk has innumerable other capabilities and I would highly recommend
googling for tutorials (there are several) or getting a book.  You may
be surprised at how many other useful things it can do for you.


#
# Bill Homer [ perl or tr ]
#

  % ls -l | tr -s ' ' | cut -f4,5,8 -d' '
  jstme 3183 file1
  them 973 file3
  justm 15096 file2

Or, to line up the columns, use perl to split the lines and print the
desired fields (counting from 0 instead of 1) using a format:

  % ls -l | \
    perl -nle '@x=split; printf "%20s%10s  %s\n",$x[3],$x[4],$x[7];'
               jstme      3183  file1
                them       973  file3
               justm     15096  file2

#
# Ryan Czerwiec  [ awk ]
#

[ ...trim... ]
Some systems have an improved version 'nawk,' and some gnu-based
systems call it 'gawk.'  In any case, there should be some type of
'awk' in /bin/.  Then, just use:

  ls -l | nawk '{print $2,$5,$7}'

for example to print the 2nd, 5th, and 7th fields.  If a field you
specified doesn't exist, you don't get an error; it just leaves that
part blank.


#
# Kurt Carlson  [ uals ]
#

First off, you are playing with fire expecting 'ls -l' fields to be
consistent.  The output above looks like it came from Linux, whose
'ls -l' allows date formatting; on most UNIX implementations it would
look like this (or worse, on older implementations):

  -rw-r-----   1 kcarlson mygrp         15189 Jan 23 13:54 Makefile.in
  -rw-r--r--   1 kcarlson mygrp          1438 May 14 2006  abend.c

Note that this has two different date formats: one for files older than
6 months and another for newer files.  What a pain for parsing,
especially if you want to parse out the file date.  This exact problem
led to 'uals', a University of Alaska GPL extension of ls, over a decade
ago.  To get the group, size, and filename as above:

  # uals --fields gsf
  mygrp       15189 Makefile.in
  mygrp        1438 abend.c

There are many extensions in uals and it is installed on all ARSC
systems under /usr/local/adm/bin/uals.  For details, see

  man -M /usr/local/adm/man uals
  
http://people.arsc.edu/~kcarlson/software/man/uals.html
ftp://ftp.alaska.edu/pub/sois/


You can format the date, get both access and modify date at the same
time, request size in human readable form (k or K), run recursively and
get full path to file, filter for files (name, newer, older, ...), ...
[ ...trim... ]


#
# Brad Havel [ awk ]
#

It sounds like "awk" might be more useful for the parsing than cut.


  [=> ls -l | awk -e '{ printf "%s %s %s\n",$4,$5,$8 }'

  mygrp 553 16:29
  mygrp 512 11:08
  mygrp 386150912 11:01
  mygrp 7680 15:35
  mygrp 56 12:13
  mygrp 307 08:21
  mygrp 512 20:42
  mygrp 3496 12:20
  mygrp 512 13:25


Also, with awk and the printf formatting you can get fancier on the
output, such as "0" padded numerics, the number of significant digits
on floats, and justified strings.


#
# Erick Winston  [ awk ]
#

Try:

  ls -l | awk '{print $4"\t"$5"\t"$8}'


#
# Jim Williams  [ tr ]
#
Because ls -l output is formatted for readability, the columns are separated by
a variable number of spaces.  The cut command isn't smart enough to view
multiple contiguous occurrences of the delimiter as a single occurrence.  The 
'tr -s' command can be used to squeeze the white space into a single
character.  This should be done before piping into cut:

  $ ls -l
  total 4
  -rw-------  1 usernm   mygrp 20 2007-01-29 09:55 file1
  -rw-------  1 usernm   mygrp  0 2007-01-29 09:29 file2
  -rw-------  1 usernm   mygrp  0 2007-01-29 09:29 file3
  $ ls -l | tr -s ' ' | cut -f4,5,8 -d' '

  mygrp 20 file1
  mygrp 0 file2
  mygrp 0 file3
  $


#
# Derek Bastille  [ stat ]
#

[ ...snip... ]  not all unices format the dates like your example.
In OS X, there are no spaces between the date parts, so field 8 is not
the file name.  So, really, using spaces can lead to lots of problems.
Thus, it may be safer to use the stat command to show the details
about the file.  For example:

The information via ls:

  $ /scratch >ls -l
  total 833296
  -rw-r-----   1 usernm    mygrp  426385408 Jan  7 21:38 ADZE_Jobs_Data.fp7
  drwxr-xr-x   2 usernm    mygrp         68 Jun 23  2006 CCAC Documents
  -rw-r--r--   1 usernm    mygrp      35066 Dec  5 09:24 Nov-EF.pdf
  -rw-r--r--   1 usernm    mygrp       4108 Oct  9 13:31 Untitled.tab
  -rw-r--r--   1 usernm    mygrp      55808 Jan  7 19:19 Untitled.xls
  -rw-r--r--   1 usernm    mygrp     155648 Jan 12 17:59 passinv20061103.fp7
  -rw-r--r--   1 usernm    mygrp       3259 Dec 12 16:13 show_usage.php

using stat with a format specifier to get group, size and name:

  $ /scratch >stat -f "%Sg %Uz %SN" *
  mygrp 426385408 ADZE_Jobs_Data.fp7
  mygrp 68 CCAC Documents
  mygrp 35066 Nov-EF.pdf
  mygrp 4108 Untitled.tab
  mygrp 55808 Untitled.xls
  mygrp 155648 passinv20061103.fp7
  mygrp 3259 show_usage.php

Lots of other options are available with stat including things that
you cannot get with the ls command.


#
# Rich Griswold  [ perl ]
#

cut is treating strings of spaces as multiple delimiters instead of
a single delimiter.  Instead of using cut, you can use awk:

  ls -l | awk '{print $4,$5,$9 }'

This doesn't handle filenames with spaces, though, which are common
in the Windows and Mac worlds.  You could use this (replacing ... with
as many columns as you need):

  ls -l | awk '{print $4,$5,$9,$10,$11,$12,... }'

but this doesn't correctly handle filenames that have two or more
spaces in a row.  The following appears to work in bash and zsh:

  ls -l | while read perm links owner group size t1 t2 t3 name;
    do
    echo $group $size $name
    done

However, in bash it doesn't correctly handle filenames that have two
or more spaces in a row.  The following seems to work in all cases:

  ls -l | perl -e   \
    'foreach(<>){chomp;@f=split/ +/,$_,9;print"$f[3] $f[4] $f[8]\n"}'



#
# Alec Bennett  [ perl ]
#

  ls -l | perl -pi -e 's/\s+/ /g;s/$/\n/' | cut -f4,5,9 -d" "



#
# Don Bahls  [ read ]
#

I believe cut tokenizes on each space.  You can handle this with
"tr", but generally in cases like this, I use the "read" command in
bash (or ksh), which splits strings nicely regardless of how much
whitespace is present.

  nelchina% cat ls_info
  -rw-------  1 robert jstme  3183 2007-01-23 14:57 file1
  -rw-------  1 robert  them   973 2006-09-15 08:19 file3
  -rw-------  5   bert justm 15096 2006-11-16 13:04 file2

  nelchina% cat ls_info | while read f1 f2 f3 f4 f5 f6 f7 f8; do echo $f4 $f5 $f8; done
  jstme 3183 file1
  them 973 file3
  justm 15096 file2

Here each string is set to a variable: $f1 is the first string, $f2
is the second string, etc.  The read command is probably overkill for
this simple example, but depending on what you want to do with these
fields, it could be a useful alternative to cut.


### 
### BONUS ANSWER to "safe rm *" question. From Orion Lawlor: 
### 
As a former Mac user I like using a little 'rm' script like this:

#!/bin/sh
# Safe remove: move files to ~/trashcan directory.

target="$HOME/trashcan/"`date '+%Y_%m_%d__%k_%M_%S'`
mkdir -p "$target"
mv -i "$@" "$target"


This allows me to undo removals by just finding the appropriate
directory under ~/trashcan, and moving the files back out.  You can
even write a cron job to clean out the trashcan once a week.  As a
bonus, moving a big directory (within the same filesystem) is faster
than deleting it!



Q: I've just extracted a tar file into my home directory.  It turns
out the contents of the tar file were not contained within a single
directory, so now I have a bunch of random files and subdirectories
scattered all over my home directory amidst other, unrelated files.
How can I sort the recently-untarred files from the rest and move
them into a new subdirectory?  Sorting by modification date doesn't
work since the tar file preserved timestamps.

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.