ARSC HPC Users' Newsletter 316, May 20, 2005

ARSC Summer tours: Wednesdays at 1 pm

ARSC Summer Tours start on June 1st. They are scheduled for every Wednesday at 1 pm, June 1 through Aug 31.

If you're visiting Fairbanks this summer, you might add us to your list of destinations. If you live here, wait for a rainy day and bring your family!

The tours are actually "virtual tours," held in the Discovery Lab (which is in the Rasmuson Library). For more information:

http://www.arsc.edu/news/summer_tours.html

"First Friday" Art Walk to include ARSC DLab

As in communities around the country, many Fairbanks art galleries and museums stay open after-hours the first Friday of every month. Instead of "bar hopping," art lovers "gallery hop."

On the first Friday in June, you may add ARSC's Discovery Lab to your list of "galleries" and experience the latest in 3D immersive computer art and music.

First Friday Art Walk
Miho Aoki, ARSC/Art Joint Appointee
Bill Brody, ARSC Visualization Specialist
5-8 p.m. Friday, June 3, 2005
375C Rasmuson, ARSC Discovery Lab

Gnu Make, Part III of III

[ Many thanks to Kate Hedstrom for the culmination of the gmake series.]

In our journey with make, one friend sent me to this web site:

http://make.paulandlesley.org/rules.html

It contains Paul Smith's Rules of Makefiles. He also has these web pages:

http://make.paulandlesley.org/vpath.html
http://make.paulandlesley.org/multi-arch.html

Last time we covered using conditionals and include files to clean up our Makefile. So far we have been assuming that everything is in one directory. Over time, our code has grown to encompass over a hundred source files. This was getting unwieldy, so the time has finally come to put things in subdirectories. As in many things, there's more than one way to do it. Some of the choices include:

  1. recursive versus nonrecursive make?
  2. the destination of object files: in the source directories, the top directory, or into a build directory?

We have chosen to go with nonrecursive make (for discussion, see http://aegis.sourceforge.net/auug97.pdf) and to honor Paul's third rule--life is simpler if the objects go into the current (top) directory.

Some Details

The top directory has the master Makefile, which includes Makefile instructions from each subdirectory, in files called Module.mk. The top Makefile starts with an empty list of sources. In each subdirectory, we find local sources by simply listing all the .F files and appending this to the master list:

in Makefile:


sources :=
include somedir/Module.mk

in somedir/Module.mk:


local_src  := $(wildcard $(subdirectory)/*.F)
sources    += $(local_src)

Here, we are using the wildcard function to search the subdirectory for its local sources. Each subdirectory resets the local_src variable, but that's OK because we're saving the values in the global sources variable. The other sneaky thing here is the user-defined subdirectory function, from the book Managing Projects with Gnu Make by Robert Mecklenburg:


subdirectory  = $(patsubst %/Module.mk,%, \
                $(word $(words $(MAKEFILE_LIST)),$(MAKEFILE_LIST)))

This does the right thing to figure out which subdirectory we are in from make's internal list of the Makefiles it is parsing. It depends on all the subdirectory include files being called Module.mk.
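To make the mechanics concrete, here is a rough shell analogue of what the subdirectory function computes. It is only a sketch: make does this internally, the helper name subdirectory_of is made up, and the sample list stands in for whatever MAKEFILE_LIST holds at the moment a Module.mk is being read.

```shell
# Sketch: mimic $(subdirectory) for a sample MAKEFILE_LIST value.
# The helper name and the sample list are illustrative assumptions.
subdirectory_of() {
    # $1: space-separated list of makefiles parsed so far
    last=${1##* }                # last word, like $(word $(words LIST),LIST)
    echo "${last%/Module.mk}"    # strip the suffix, like $(patsubst %/Module.mk,%,...)
}

subdirectory_of "Makefile Adjoint/Module.mk Utility/Module.mk"
```

The most recently included file is always the last word of the list, which is why the function must be evaluated while that Module.mk is still being parsed.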

Library Details

The directory structure we are using has the top directory, an include directory, several directories which contain sources for libraries, and the directory for the main program. There is also a directory for the compiler-specific Makefile components, which we talked about last time.

Here is a complete example of a library Makefile component:


local_lib  := libNLM.a
local_src  := $(wildcard $(subdirectory)/*.F)
path_srcs  += $(local_src)

local_src  := $(patsubst $(subdirectory)/%.F,%.F,$(local_src))
local_objs := $(subst .F,.o,$(local_src))

libraries += $(local_lib)
sources   += $(local_src)

$(local_lib): $(local_objs)
        $(AR) $(ARFLAGS) $@ $^

The only thing that changes from one to the next is the name of the library to build. I'm actually keeping track of the sources with and without the subdirectory part of their name. The objects will go into the top directory, so they shouldn't have the directory in their list. I only need the path_srcs for creating the dependency information; make itself knows to look in the directories because of a vpath command. We are also updating a libraries variable, adding the local library to the global list.
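As a rough illustration of the two name lists, the shell sketch below turns one source path into the object name that lands in the top directory. The path somedir/foo.F and the helper name to_object are made up, and the shell expansions only approximate the make functions: patsubst removes just the $(subdirectory)/ prefix, while this sketch drops every leading directory.

```shell
# Sketch: turn a source path into the object name used in the top directory.
# "somedir/foo.F" is a made-up example path.
to_object() {
    src=${1##*/}          # drop the directory part (the patsubst step)
    echo "${src%.F}.o"    # replace the .F suffix with .o (the subst step)
}

to_object "somedir/foo.F"
```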

Main Program

The main program is in a directory called Drivers and its Module.mk is similar to the library one:


local_src  := $(wildcard $(subdirectory)/*.F)
path_srcs  += $(local_src)

local_src  := $(patsubst $(subdirectory)/%.F,%.F,$(local_src))
local_objs := $(subst .F,.o,$(local_src))

sources    += $(local_src)

$(BIN): $(libraries) $(local_objs)
        $(LD) $(FFLAGS) $(LDFLAGS) $(local_objs) -o $@ $(libraries) $(LIBS)

Instead of a rule for building a library, we have a rule for building a binary. In this case, the name of the binary depends on whether it's parallel or not and is defined elsewhere. The binary depends on the libraries getting built first, as well as on the local objects. During the link, the $(libraries) are compiled from the sources in the other directories, while $(LIBS) are external libraries such as NetCDF and mpich.

Top Level Makefile

Now we get to the glue that holds it all together.

First, initialize some of the global lists (note that, depending on your build process, you may not want to copy this clean_list, which includes "*.f90"!):


#-------------------------
#  Initialize some things:
#-------------------------

    clean_list := core *.o *.mod *.f90 lib*.a
    sources    := 
    path_srcs  := 
    libraries  :=

    objects = $(subst .F,.o,$(sources))

Second, define the subdirectory function (as presented above):


subdirectory = $(patsubst %/Module.mk,%, \
                  $(word                \
                     $(words $(MAKEFILE_LIST)),$(MAKEFILE_LIST)))

Third, as discussed in the previous issue, add the user-defined switches and the compiler-dependent includes. Then we have the pattern rules we described in Part I. Finally we get to the meat of the includes:


.PHONY: all

all: $(BIN)

modules  := Adjoint Ice Modules Nonlinear Representer \
                Support Tangent Utility Drivers

includes := Include Adjoint Nonlinear Tangent Drivers

vpath %.F $(modules)
vpath %.h $(includes)

include $(addsuffix /Module.mk,$(modules))

CPPFLAGS += $(patsubst %,-I%,$(includes))

.PHONY: clean

clean:
        $(RM) $(clean_list) $(BIN)

"all" is the first target that gets seen by make, making it the default target. In this case, we know there is only the one binary, whose name we know - the book shows how to do more than one binary. The modules are the list of subdirectories containing a Module.mk we need to include. "clean" is the target that removes all the cruft we don't want to keep. Both "all" and "clean" are phony targets in that no files of those names get generated - make has the .PHONY designation for such targets.

Conclusion

I've presented bits of a make system that works for us in a portable, parallel ocean circulation model written in Fortran. I've probably glossed over things - feel free to contact me for details or clarifications. The Gnu make book has more of a Java slant, plus presents a Makefile for making the book itself.

Specifying Projects in Loadleveler Scripts

ARSC has recently made some changes to the way in which projects are specified in Loadleveler scripts on iceberg and iceflyer. This change allows the Loadleveler keyword "account_no" to be used to specify which project the CPU hours for a job should be charged against. If no "account_no" keyword is specified in a job script, the account number will default to the user's primary project (i.e. primary group).

The syntax is as follows:

  # @ account_no = projecta

(Note: the "account_no" keyword must appear above the "queue" keyword in the Loadleveler job script.)
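A minimal job script sketch showing the placement follows. Only the account_no and queue keywords come from the text above; the output and error file names are hypothetical placeholders, and a real script will also need whatever other keywords your job and site require.

```shell
#!/bin/sh
# Minimal sketch of a Loadleveler job script; file names are placeholders.
# @ output     = myjob.out
# @ error      = myjob.err
# @ account_no = projecta
# @ queue

# Commands below run once the job starts.
date
```

Here the CPU hours would be charged to projecta rather than to the user's primary project.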

Each project is assigned a Unix group when the project is created. The "groups" command will show your available projects and, for some users, additional Unix groups which are not associated with a project. You may only charge against a group name associated with a project, and not against any of the non-project Unix groups.


E.g.:
  iceberg2 1% groups
  projecta projectb projectc thrdprty

In this case, the user could run jobs under any of the first three groups, but not under "thrdprty," which might be a Unix permission group designed to restrict access to some third-party software product.

Call ARSC consulting (907-450-8602) if you're not sure which groups are associated with projects or if this raises any other questions.

German Shepherd Bite Increases Capacity of Compact Flash Disk

ARSC staff member Carol Falcetta discovered that a quick bite from her black shepherd, Phoenix, to a compact flash disk increased its capacity from 256 MB to 2 TB... as reported by Macintosh peripherals.

Unfortunately, the canine upgrade technology also rendered the photographs on the disk unreadable, and the disk, unformattable.

Quick-Tip Q & A



A: [[ I am using Loadleveler and want to get the number of nodes and
   [[ processors that my job requests to set two environment
   [[ variables: NODES and PROCS.  After some searching, I found that
   [[ LoadLeveler sets the environment variable LOADL_PROCESSOR_LIST
   [[ when my job is run.  This lists the node that each task is
   [[ running on.  Below is an example of what I got when running a 9
   [[ task, 3 node job.
   [[
   [[ LOADL_PROCESSOR_LIST=b7n1 b7n1 b7n1 b7n4 b7n4 b7n4 b7n2 b7n2 b7n2
   [[
   [[ There must be a way to get the information I want from this
   [[ list.  Can you help me figure out a way to set the value of NODES
   [[ and PROCS from the value of this environment variable?



#
# Thanks to Derek Bastille of ARSC for a straightforward Korn shell
# solution.
#

In ksh, I can get those values by doing:

PROCS=`echo $LOADL_PROCESSOR_LIST | tr ' ' '\n' | wc -l`
NODES=`echo $LOADL_PROCESSOR_LIST | tr ' ' '\n' | sort | uniq | wc -l`
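The ksh approach can be tried outside of a job by feeding it the sample list from the question. This is only a sketch: the helper names count_procs and count_nodes are made up here, and the trailing tr -d ' ' is an addition, since some wc implementations pad their output with blanks.

```shell
# Hypothetical helpers wrapping the pipelines; each takes the
# processor list as its single argument.
count_procs() {
    echo "$1" | tr ' ' '\n' | wc -l | tr -d ' '
}
count_nodes() {
    echo "$1" | tr ' ' '\n' | sort | uniq | wc -l | tr -d ' '
}

# Sample value from the question (9 tasks on 3 nodes).
LIST="b7n1 b7n1 b7n1 b7n4 b7n4 b7n4 b7n2 b7n2 b7n2"
PROCS=`count_procs "$LIST"`
NODES=`count_nodes "$LIST"`
echo "PROCS=$PROCS NODES=$NODES"
```

In a real job, "$LOADL_PROCESSOR_LIST" would be passed in place of the sample LIST.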


#
# Thanks to Alan Wallcraft for csh variants and a reminder about the
# finite length of environment variables.
#

If you just wanted the number of processors, the following 1-liner will
do (C-shell):

setenv PROCS `echo $LOADL_PROCESSOR_LIST | wc -w`

For the number of nodes as well, use:

echo $LOADL_PROCESSOR_LIST | xargs -n 1 >! list_$$
setenv PROCS `cat list_$$ | wc -l`
setenv NODES `sort -u list_$$ | wc -l`
/bin/rm list_$$

The first line produces a file with one node-name per line and one line
per processor.  If we can guarantee that all references to a single node
are consecutive, then sort -u could be replaced by uniq, but sort with
the unique option is safer.
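The difference is easy to see with a list in which one node shows up in two separate runs. The sketch below uses a made-up list and plain sh syntax rather than csh; uniq only collapses adjacent duplicates, so it overcounts.

```shell
# Sketch: a made-up list in which node "a" appears in two separate runs.
LIST="a a b a"
with_uniq=`echo "$LIST" | xargs -n 1 | uniq | wc -l | tr -d ' '`
with_sort=`echo "$LIST" | xargs -n 1 | sort -u | wc -l | tr -d ' '`
echo "uniq alone: $with_uniq   sort -u: $with_sort"
```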

Note that LOADL_PROCESSOR_LIST can fail to include all processors if the
list of processors is too long to be stored as a string in an
environment variable.  The maximum number of processors that can be
handled by LOADL_PROCESSOR_LIST depends on the length of the node
names.  In this example they are very short, but at other sites I have
seen fully qualified domain names in LOADL_PROCESSOR_LIST and this
severely limits the usefulness of LOADL_PROCESSOR_LIST.


#
# And last but not least thanks to Jesse Niles for an obnoxious 
# one-liner which sets both PROCS and NODES.
#

It ain't pretty, but it seems to work (only in sh/ksh/bash):

$(echo $LOADL_PROCESSOR_LIST | awk '/ /{ print "export PROCS="NF""; system("echo \"export NODES\=\\c\"; echo "$0" | tr \" \" \"\\n\" | uniq | wc -l | sed -e \"s/[[:space:]]//g\""); }')
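For comparison, here is a sketch of a single-pass alternative that lets awk count the distinct node names itself, so it does not depend on the names being grouped together. It is not Jesse's method; the helper name count_both is made up, and it assumes a POSIX sh, ksh, or bash.

```shell
# Sketch: count tasks and distinct node names in one awk pass.
# count_both prints two shell assignments, which eval then applies.
count_both() {
    echo "$1" | awk '{
        for (i = 1; i <= NF; i++)
            if (!($i in seen)) { seen[$i] = 1; nodes++ }
        printf "PROCS=%d NODES=%d\n", NF, nodes
    }'
}

# Sample value from the question; use "$LOADL_PROCESSOR_LIST" in a job.
LIST="b7n1 b7n1 b7n1 b7n4 b7n4 b7n4 b7n2 b7n2 b7n2"
eval "$(count_both "$LIST")"
export PROCS NODES
echo "PROCS=$PROCS NODES=$NODES"
```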




Q: I always keep two terminal windows open to klondike (or whatever
   remote system I'm using).  

   Are there any tricks for exchanging information between the two
   windows?  For instance, I may type a command in one session, but want
   the same thing to happen in both sessions. E.g., "module switch
   PrgEnv PrgEnv.new" or  "cd ~/blah/blah/blah".  Or I may want to pass
   an environment variable between sessions.

   Yes, I know how to "cut" and "paste" between windows using the
   so-called "clipboard"... but it's a pain in the guru.  Any other
   solutions?

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.