ARSC HPC Users' Newsletter 421, November 11, 2011

ARSC at SC11

The Arctic Region Supercomputing Center will once again be part of the exhibit floor at the annual IEEE/ACM Supercomputing conference, in Seattle from November 12-18.  Visit http://sc11.supercomputing.org for conference details.

ARSC will be in booth #2400.  Our booth this year will feature a rotation of videos on various topics.  Of particular interest is a new series of videos that highlight science in Alaska, with a wide range of field research from archeology to grizzly bears.  These videos were made through a partnership with www.frontierscientists.com, with support from the National Science Foundation.

Visit the booth and you can pick up a reusable hand warmer and other tchotchkes.  The hand warmers have a small metallic disk that creates a chemical reaction, producing warmth for those cold Arctic nights of standing on the back of your dog sled.  After use, boil them in water to reset them for another use.

Also in the booth will be limited quantities of the 2012 ARSC Calendar.  New photos this year include scenes from Alaska's great outdoors.

Labeling MPI Output

[By: Don Bahls]

A few months ago a user asked me if it was possible to label the output from each MPI task with a task number.  The code being ported had been developed on an IBM AIX system running poe, which conveniently offers the MP_LABELIO environment variable to enable labeling of the output from each MPI task.

Reviewing the OpenMPI man page I couldn't find an obvious way to label the output from each MPI task in that environment, so I took a stab at writing a simple Perl script to intercept the stdout from each task and label it using the task number from the environment variable set by OpenMPI for each task.

Here's the Perl script:

pacman % more labelio
#!/usr/bin/perl
#
# Script:  labelio
#
# Purpose: Labels stdout with MPI rank
#
# get the rank value from the OMPI_COMM_WORLD_RANK environment variable
# this should work for OpenMPI.
# $rank stays empty when the script is run outside of mpirun.
my $rank = '';
$rank = $ENV{OMPI_COMM_WORLD_RANK} if ( defined $ENV{OMPI_COMM_WORLD_RANK} );

if ( scalar(@ARGV) == 0 and ( $rank == 0 or $rank eq '' ) ) {
   print STDERR "$0 exe [args]\n";
   print STDERR "Purpose:\n";
   print STDERR "   Prepends the MPI rank to stdout and/or stderr of an MPI task.\n";
   print STDERR "\n";
   print STDERR "Example Usage:\n";
   print STDERR "  Example 1:  Add labels for MPI ranks to stdout only\n";
   print STDERR "    mpirun -np 16 $0 ./a.out\n";
   exit(0);
}

# Join the arguments into the command line to run.
my $cmd = join(' ', @ARGV);
# Open a pipe that reads the stdout of that command.
open(CMD, "$cmd | ") or die "Could not open command = $cmd\n";

while ( <CMD> )
{
   my $line=$_; chomp($line);
   printf "%03d:: %s\n", $rank, $line;
}

Here's an example of the normal output from a hello world program.  This particular program includes the node and the task number for each task, but you could imagine it being challenging to tell which task wrote which output for a larger application.

n70 % mpirun -np 4 ./hello
n70: hello world- 0
n70: hello world- 1
n70: hello world- 2
n70: hello world- 3

Here's a second run including the "labelio" script above.  Notice that the output from each task is now prepended with the task number, and that lines from different tasks can arrive in any order.

n70 % mpirun -np 4 ./labelio ./hello
000:: n70: hello world- 0
002:: n70: hello world- 2
003:: n70: hello world- 3
001:: n70: hello world- 1

The script above labels only stdout.  It's possible to label stderr as well by redirecting stderr to stdout in the open command:

open(CMD, "$cmd 2>&1 | ") or die "Could not open command = $cmd\n";

We found this to be a convenient way to get additional diagnostics from this particular program without recompiling.
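
For readers who would rather not use Perl, the same idea can be sketched in Python.  This is only a minimal sketch, not a replacement for the script above: the function names label_lines, run_labeled and main are made up for illustration, and it assumes, as the Perl script does, that OpenMPI sets OMPI_COMM_WORLD_RANK for each task.

```python
import os
import subprocess
import sys


def label_lines(lines, rank):
    """Yield each line prefixed with a zero-padded rank, e.g. '003:: text'."""
    for line in lines:
        yield "%03d:: %s" % (rank, line.rstrip("\n"))


def run_labeled(cmd, rank, out=sys.stdout):
    """Run cmd (a list of arguments), label its stdout, return its exit status."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    for labeled in label_lines(proc.stdout, rank):
        out.write(labeled + "\n")
    return proc.wait()


def main(argv, environ):
    """Hypothetical entry point: argv[1:] is the command to wrap."""
    if len(argv) < 2:
        sys.stderr.write("usage: %s exe [args]\n" % argv[0])
        return 1
    # Assumption: the rank comes from OpenMPI; default to 0 outside mpirun.
    rank = int(environ.get("OMPI_COMM_WORLD_RANK", "0"))
    return run_labeled(argv[1:], rank)
```

To use it as a wrapper, one would end the file with sys.exit(main(sys.argv, os.environ)) and launch it the same way as labelio.  Unlike the Perl version, this sketch passes the arguments to the program directly rather than through a shell, so the "2>&1" trick would need stderr=subprocess.STDOUT on the Popen call instead.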

Tripole Grid Tales

[By: Kate Hedstrom]

In the world of climate modeling, one builds a model of the atmosphere to cover the whole globe. The atmospheric scientists have long had to think about the "pole problem". If you build a grid that's uniform in latitude/longitude space, how do you deal with the poles? One solution is spherical harmonics, another is a stretched cube made of six square patches. There have been others I know less about.

The ocean modelers have taken advantage of the continents to simply place one pole in Antarctica and the other pole in Greenland. If the grid spacing gets finer near Greenland, that's OK. Well, it was OK for a while, but then as the average resolution got better, it got just a bit too fine off Greenland.  That's when someone came up with the tripole grid, with the usual pole in Antarctica, a pole in Russia, and a pole in North America. The grid comes north in the usual way, then meets in a sort of zipper across the north pole.  This allows for a reasonably uniform grid across the Arctic Ocean.

My colleagues and I are in the process of setting up a global run using a model that has traditionally been considered a regional model. In going to a global domain, we have chosen to try a tripole grid. Some fun aspects involve having different fields in the model at different places on the grid:

     P------V------P
     |             |
     |             |
     U      T      U
     |             |
     |             |
     P------V------P

I would have put the pole at a P-point, but no, we obtained a grid with the northern poles at T-points. What does it matter, you ask? We didn't want to redo the land mask, where the T-point mask is king, making the whole square into a land cell. Actually, I would have been wrong to place it at the P-point because on our grid, the northernmost P-point is part of the boundary condition and neither side would have been timestepping it. As it is, both sides are timestepping the northernmost (overlapping) T-point. They need to communicate to ensure that both sides have the same answer.

The next bit of fun involves parallel programming. If all of the northern points are on one process, communication is pretty straightforward. If you want to run on many processes, it makes sense to tile in both horizontal directions. I'm going to insist on an even number of tiles around the equator, half being on one side of the zipper, half on the other.
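
To see why an even tile count helps, here is a small sketch of which tile talks to which across the zipper.  It rests on assumptions that are not from our model: equal-width tiles numbered 0 to ntiles-1 around the globe, and a fold that pairs column i with column n-1-i.  The function name zipper_partner is made up for illustration.

```python
def zipper_partner(tile, ntiles):
    """Return the top-row tile that sits across the zipper from `tile`.

    Assumes equal-width tiles numbered 0..ntiles-1 around the globe and
    a fold that pairs column i with column n-1-i (illustrative only).
    """
    if ntiles % 2 != 0:
        raise ValueError("need an even number of tiles around the equator")
    return ntiles - 1 - tile


# With four tiles, tile 0 pairs with tile 3 and tile 1 pairs with tile 2.
pairs = [(t, zipper_partner(t, 4)) for t in range(4)]
```

Under these assumptions an odd count would leave the middle tile as its own partner, straddling the zipper; an even count guarantees that every tile exchanges with a different tile.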

On the face of it, the communication across the zipper isn't so very different from any other tile boundary. In fact, the communication is just a little different for each of the P, T, U, and V points and has to be handled separately. Also, a positive U on one side is heading from Russia to Canada, but needs to be passed as a negative U to the other side. The V values also get rotated, with positive flow on one side being from the Bering to Greenland and the other side considering that as a negative V. However, not everything on a U-point is a velocity, so the sign flip can't be applied blindly to every field stored there.
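
The sign business can be sketched in a few lines of Python.  This is an illustration only, not our model's code: the function name across_zipper is made up, and the plain index reversal is an assumption, since the real offsets differ for P, T, U, and V points.

```python
def across_zipper(row, is_vector):
    """Return one row of values as the other side of the zipper sees it.

    Assumption: the fold simply reverses the row.  Velocity components
    change sign; scalar fields are copied unchanged.
    """
    sign = -1.0 if is_vector else 1.0
    return [sign * value for value in reversed(row)]


temperature = across_zipper([1.0, 2.0, 3.0], is_vector=False)
u_velocity = across_zipper([1.0, 2.0, 3.0], is_vector=True)
```

The flag is attached to the field rather than to the grid location, which is the point of the last sentence above: a scalar that happens to live on a U-point must be passed with is_vector=False.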

We're still working the bugs out of this beast....

Cutting and Pasting with Google Docs

[By: Ed Kornkven]

I enjoy having two workstations on my desk.  One is a MacBook Pro laptop that gets a lot of heavy use and the other is a Linux box made by, let's see, Penguin.  Both have decent-sized monitors and I've settled into a rhythm of purposes for each machine that works pretty well for me.  Except... that occasional moment when some tidbit on my Mac screen would be really handy to have over on my Penguin screen.  That copy-and-paste operation that has become such an essential part of the interaction of man and machine is nullified by little things like architecture and data paths.

Reclaiming that capability in my environment is important enough to me that I once cobbled together some cooperating scripts that would copy the clipboard into a file and do a remote copy to the other machine, where I would run another script to copy the file into that machine's clipboard.  Yeah, I know -- ugh.

Imagine my simple delight then, when I discovered that Google Docs provides a workaround that perhaps doesn't rise to the level of elegance, but is quite usable, free, and requires no additional software beyond a web browser.  The key capability it offers is maintaining transparent consistency between two views of a document.  Here is how it works for me.

I have a Google Doc that I cleverly named "Clipboard".  That document can be opened more than once -- say, on my Mac and on my Linux box.  And whatever is typed into one window shows up in the other, pretty much instantly.  So I leave a "clipboard" browser window open on each machine and copy-and-paste to my heart's content.  Whatever I put on one machine appears on the other, and I can have clipboard history as well, no extra charge.

Quick-Tip Q & A

A:[[ I have some Fortran 90 code I want to call from inside Python.
  [[ I have a setup.py file that can be used to build it, but I need to
  [[ invoke it with "python setup.py build --fcompiler=gnu95". Is there
  [[ any way to put the fcompiler option into the setup.py file, at least
  [[ as a default?

No answers were forthcoming. We put the need for --fcompiler=gnu95 into the README file.
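
For what it's worth, here is one possible workaround, offered as an untested sketch rather than an answer.  It assumes setup.py is an ordinary Python script that reads its options from sys.argv, and the helper name add_default_fcompiler is made up for illustration.

```python
def add_default_fcompiler(argv, default="gnu95"):
    """Return a copy of argv with --fcompiler=<default> added after 'build'.

    The argument list is returned unchanged if the user already passed
    an --fcompiler option or is not running the build command.
    """
    if any(arg.startswith("--fcompiler") for arg in argv):
        return list(argv)  # the user already chose a compiler
    result = []
    for arg in argv:
        result.append(arg)
        if arg == "build":
            result.append("--fcompiler=" + default)
    return result


example = add_default_fcompiler(["setup.py", "build"])
```

Near the top of setup.py, before the call to setup(), one would then write sys.argv = add_default_fcompiler(sys.argv).  An explicit --fcompiler on the command line still wins.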

Q: I need to copy some large files between computers.  What is the fastest way to do that?


[[ Answers, Questions, and Tips Graciously Accepted ]]


E-mail Subscriptions:

Archives:

    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.