ARSC HPC Users' Newsletter 369, September 07, 2007



Multicore and New Processing Technologies Review - Part II

[[ Editors Note:
[[ Part I of this series gave an overview of the ARSC Multicore and New
[[ Processing Technologies Symposium, which was held August 13th and 
[[ 14th.  In Part II, we present the relatively subjective observations 
[[ and recommendations of the attendees.
[[ Part I can be found here:


[ by Greg Newby ]

Symposium attendees agreed that sub-linear performance increases will only worsen as the number of cores continues to grow. Caching, memory access, and many shared resources such as the system bus and interconnect will not scale as easily as the number of processor cores. This will have major implications for HPC users, who will often need a detailed understanding of the processor layout and other system characteristics in order to get maximum performance. Most HPC developers have experienced this before, with technologies such as vector processors, and during transitional periods such as the move from 32-bit to 64-bit processors.

Today, most HPC applications use MPI to parallelize. Symposium attendees recognized the need for new types of processors to continue to support MPI. At the same time, they see better prospects for programming next-generation systems emerging. The NSF's recent PetaApps grant program, for example, specifies the need for new programming paradigms and languages for using tens or hundreds of thousands of processors for a single job. Several languages are under development, including Unified Parallel C (UPC), Chapel (from Cray), Co-array Fortran, and X10 (from IBM). These allow the programmer to consider a large-scale computational cluster as a more unified system, with shared memory.

Attendees made the following major recommendations, directed at the computing industry, HPC users, and agencies such as the Department of Defense, National Science Foundation, and Department of Energy:

  • Today's large-scale applications will generally be able to function on next-generation computers, but will often operate less efficiently until additional effort is put into adapting them to the underlying hardware. We recommend that planning take place to evaluate and address the performance gap that will otherwise result if applications are not appropriately tuned.
  • For newly developed applications or major updates to software, we advocate new programming languages and models, notably partitioned global address space (PGAS) languages. These will help to hide the complexity of the underlying system from the developer.
  • Lack of uniformity is a major trend in next-generation computers, and we recommend that the HPC community take this non-uniformity into account in all software planning and projection. Hierarchical memory, processor asymmetry, and bandwidth limitations for interconnects and memory can have major impacts on performance as applications scale out to more processors in heterogeneous environments. GPUs and Cell processors are stark examples, but multicore CPUs also create asymmetry that needs to be better accounted for.
  • We recommend that double precision computations be re-evaluated where feasible, in order to gain speedup from single precision floating point operations. This is in reaction to the greatly enhanced performance for single precision on FPGAs and GPUs. For some applications, software libraries can use single precision math, yet deliver double precision results of adequate utility.
  • We recommend focusing on sustained FLOPs per Watt as a major criterion for choosing new development platforms. Scaling out to petascale computing will require very large systems. Focusing effort on the utilization of more power-efficient or higher-performing CPUs can have many benefits. Ensuring that the FLOPs are usable requires additional attention to the programming environment and other software elements of a system.
  • Libraries are key. We recommend concerted focus by funding agencies, vendors and developers on general-purpose software libraries for utilization of new processor technology and heterogeneous computers. These must include BLAS, BLACS and LAPACK, as well as libraries for particular scientific domains. Such libraries are non-trivial to implement, and must make intelligent run-time decisions about when or how to use available hardware elements.
  • Not all of today's HPC applications will scale to new processing technologies and heterogeneous environments. We recommend assessing performance limitations in algorithms, addressing communication overhead, and looking to new application areas -- particularly those for information processing (versus data processing) -- as champion applications.

Additional recommendations include:

  • Encourage changes to operating systems, especially the memory interfaces and process schedulers, to make better use of new processing technologies.
  • Adopt or develop benchmarks that will better measure real-world application performance in heterogeneous environments.
  • Track further developments in new processor technologies, and adapt to them as needed. These will include processor-in-memory, larger scale multicore CPUs, different caching methods, in-socket accelerators, and more.
  • Look for ways to develop and increase expertise in computational science and computer engineering that is ready for the increasingly heterogeneous world. Provide strong multidisciplinary academic training, and recruit the best and brightest graduate students.
  • Support development of emulation environments, to be able to better understand application performance on current and next-generation processors.
  • Measure the ratio of compute operations to I/O, in order to evaluate where 50x or greater speedups are possible with new processor technology. This will help select applications to focus on for acceleration.
  • Create balanced systems, with attention to programmability, interconnect and memory latency, sufficient memory size, storage, and so forth.

Subversion Usage at ARSC

  [ by Anton Kulchitsky ]

This article will describe:

  1. Why Subversion should be used
  2. Working with Subversion
  3. Setting up a Subversion repository for yourself
  4. Setting up a Subversion repository for an ARSC project

1. Why use it?

Epigraph: "[Use it] Always. Even if you are a single-person team on a one-week project. Even if it's a "throw-away" prototype. Even if the stuff you're working on isn't source code. Make sure that everything is under source code control - documentation, phone number lists, memos to vendors, makefiles, build and release procedures, that little shell script that burns the CD master - everything. ... Even if we're not working on a project, our day-to-day work is secured in a repository." ("The Pragmatic Programmer" by Andrew Hunt and David Thomas, Addison-Wesley 1999)

There are many issues that Subversion intends to address.

  • Reliability. How can you keep programs and other documents as safe as possible while not archiving terabytes of automatically generated files? How can you keep the structure of your code clean while not spending much time keeping things in order?
  • Reversion. If you make a change, and discover it's not viable, how can you revert to a version of the code that is known to be good? This is why you have an undo option in your editor. How can you get a proper undo for all documents?
  • Change/Bug Tracking. You know your code has been changed, but do you know who changed it, when, and why? Or when and where a new bug was introduced?
  • Branches. How can you introduce a completely new feature or concept and not mess up the working code?
  • Merging branches. If you divided the code, how can you merge new code into the old code and not mess it up?

A version control system is more reliable, easier to back up, safer to use, and harder to destroy than loose copies of files. It also improves self-organization, helps developers communicate and discuss issues, can prevent conflicting access to the same document, can be integrated into an IDE or editor, and so on.

See the HPC Users' Newsletter issue 358 Quick-Tip for more information: > /arsc/support/news/hpcnews/hpcnews358/index.xml#article3 .

2. Working with Subversion

For details on using Subversion, please see the excellent documentation available online at:

To understand how to work with Subversion, it is most important to understand some basic principles.

  • You have a centralized "repository" with all your files stored there. That repository remembers all changes you have done to the files including any changes in the directory structure or renames.
  • You never work with that repository directly. Instead, you "check out" (create) one or more "working copies", a snapshot of some state of the repository, usually the latest state. You modify files in that working copy and when they are ready, you "commit" your work back to the repository.
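
As an illustration of this cycle, a first session might look like the following hypothetical transcript (the repository URL, file name, and log message are all made up for the example):

```
$ svn checkout file://$ARCHIVE/svnrepo/model/trunk model
$ cd model
$ vi solver.f90                  # edit a file in the working copy
$ svn status
M      solver.f90
$ svn commit -m "Fix sign error in the time step"
```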

Subversion cares about permissions, so you may run into problems if the permissions on the repository were not set up properly (see below).

When you make changes to your working copy, they are not visible to any other working copy (either yours or others') until you successfully commit them to the repository. You and others can simultaneously work on the same code in different working copies, and may even make changes to the same file. Usually these changes do not conflict, and Subversion can automatically merge them. If your changes do conflict with those in the repository, Subversion will not allow you to commit until you resolve the conflict.

In some cases you may want to "lock" a file you are working on, so that Subversion will not allow anyone else to commit a modification to that file until you "unlock" it again. You should lock only binary files you are changing, because conflicts in them can be very hard to resolve.

Here are some basic commands you need to know when using Subversion:

  • svn help - Lists all available svn commands.
  • svn help COMMAND - Provides a description of the specified command.
  • svn import PATH URL - Recursively commits an unversioned path to the destination URL. At ARSC the destination URL can take one of two forms: an svn+ssh:// URL if the repository is not mounted on the current machine, or file://path_to_the_repository if the repository is available as a local directory. For example, if you have your repository svnrepo in $ARCHIVE, you can reach it by setting the URL to file://$ARCHIVE/svnrepo
  • svn checkout URL - Initial checkout of a repository from URL (the creation of a new working copy). You need the URL only for the checkout command; Subversion will remember it, and other commands will use the same repository whenever they need to contact it.
  • svn status - Provides a status overview of the working copy - lists added, modified, deleted and unknown files. The status of a working copy is displayed without accessing the repository (unless you specify the URL). Local changes are displayed in five columns, with the first one indicating the status:
     ' ' No changes.
     'A' Object is marked for addition.
     'D' Object is marked for deletion.
     'M' Object was modified.
     'C' Object is in conflict.
     'I' Object is being ignored.
     '?' Object is not being maintained by version control.
     '!' Object is reported missing. This flag appears when the object was deleted or moved without using the svn command.
     '~' Object was being maintained as a file but has since been replaced by a directory, or the opposite has occurred.
  • svn update - Updates the local working copy to the latest version of the repository.
  • svn add PATH... - Schedules the specified path, and all files under it recursively, for addition to the repository - the contents will be transferred to the repository the next time svn commit is called.
  • svn rm PATH... - Each item specified by a PATH is scheduled for deletion upon the next commit. Files, and directories that have not been committed, are immediately removed from the working copy. PATHs that are, or contain, unversioned or modified items will not be removed unless the --force option is given.
  • svn mv FROM TO - Renames/relocates a file/directory from FROM to TO.
  • svn revert FILE... - Discards any changes made to the specified files since the last "svn update" (or "svn checkout").
  • svn commit - Commits the changes in your working copy to the repository.

    Please follow the principles provided below for committing to the repository:

    Specify an appropriate log message every time you commit so that other developers (including you later) know what you have changed.

    Before you commit resources into a repository, run "svn update" and check that everything still works as expected: your sources compile cleanly and the test cases execute without errors. If some new code does not compile but you need to commit it anyway, consider using branching (see documentation).

  • svn resolved FILE - Tells Subversion that the conflict has been resolved and the file can now be committed.
  • svn diff - Display the differences between two versions - the modifications since the last "svn update" by default.
  • svn cleanup - Recursively cleans up the working copy, removing locks, resuming unfinished operations, etc. Call this if requested by the output of another svn command, or if you are uncertain whether your working copy is clean.

Two further tips: you might appreciate the -m option for commands that require a log message; otherwise Subversion will run your editor so you can type the message, which may be a little bit annoying (when you quit the editor, the operation continues). And if you accidentally delete a file, simply run "svn update filename", where filename is optional; all accidentally deleted files will be restored.
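
Putting a few of these commands together, a conflict might be handled like the following hypothetical session (the file name and log message are made up for the example):

```
$ svn update
C    solver.f90
$ vi solver.f90                  # fix the conflict markers by hand
$ svn resolved solver.f90
$ svn commit -m "Merge my change with the latest revision"
```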

The structure of your repository can be as complex as you like. However, there are some standard Subversion conventions you should follow. Every sub-project in your repository should first have three subdirectories: trunk, branches, and tags. Trunk is the main development line. Branches are copies of your sub-project, usually reflecting major new features or large modifications to the main line. Tags usually correspond to major releases. Subversion smartly does not save every copy, but rather stores and manages the changes between copies; thus, creating a new branch or tag does not actually duplicate the code.
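
Because branches and tags are cheap copies, creating one is a single command against the repository. This hypothetical example (the URL and branch name are made up) branches a sub-project's trunk:

```
$ svn copy file://$ARCHIVE/svnrepo/model/trunk \
           file://$ARCHIVE/svnrepo/model/branches/new-solver \
           -m "Create a branch for the new solver"
```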

Suppose your project OUR_PROJECT has a website, a model, utilities, and scripts associated with it. In this case your repository structure can be the following:

   OUR_PROJECT/
       website/
           trunk/
           branches/
           tags/
       model/
           trunk/
           branches/
           tags/
       utilities/
           trunk/
           branches/
           tags/
       scripts/
           trunk/
           branches/
           tags/

It is very easy to start working with Subversion and to get many important benefits right away. However, for even more benefits it is worth reading the SVN Book and starting to use more advanced features of Subversion. Please consider branching and propset as your next topics after you become familiar with the basic commands.

3. Setting up Subversion for yourself

Epigraph: "I always work in a team of at least two - myself now, and myself later. This other guy is often the hardest to work with - I can't convince him that he doesn't know what I'm talking about." (Jaime Metcher, 20/07/07 01:15, somewhere on the Internet)

The way to set up Subversion for your personal work at ARSC is to use the $ARCHIVE directory. It is highly reliable storage with backups, and thus probably the best place to keep your repository. To set it up:

  • Create a repository:
     svnadmin create $ARCHIVE/myrepository --fs-type fsfs
    Replace "myrepository" with anything you prefer.
  • Import your files into your repository. Be sure you are not importing automatically generated files, backup files and so on. You need to version only the text files and binaries that you create, not compiler or model output, etc. Suppose you would like to import a program that you keep in the directory Program1. You might copy Program1 to some temporary directory, clean out all generated files, leaving only the program, documentation, logo, or anything else that you created, and type:
     svn import temporary_Program1 file://$ARCHIVE/myrepository/Program1/trunk -m "Initial import"
    (Here temporary_Program1 is the cleaned copy.)  The -m message will be the first log entry for every imported file.
  • Check out your first working copy. You might want to do this in $WRKDIR. There is no danger anymore of your program being lost, so $WRKDIR is a good place for it.
     cd $WRKDIR
     svn checkout file://$ARCHIVE/myrepository/Program1/trunk Program1
    You will get a working copy of your program. Please read the previous section and the Subversion documentation to see how to work with it. Check your working copy carefully to verify that you imported everything you need. Then you can delete both the original Program1 (you should still back it up, just in case you forgot to import something) and the temporary directory.

4. Setting up Subversion for an ARSC project

Suppose you need a repository for your project. In this case you should do the following.

  • By request of the project PI, ARSC can create a project directory under /projects; e.g., for the project "MY_PROJECT," the directory would be /projects/MY_PROJECT. Its owner will be the project PI and its group the project group. Only members of the project group will be able to work with the repository. For a small project, you may also create a repository in your $ARCHIVE directory and follow the steps below. You need to run the following commands on a machine where the /projects/MY_PROJECT directory is mounted locally.
  • Create a repository:
     svnadmin create /projects/MY_PROJECT/svnrepo --fs-type fsfs
  • Change the group of your repository to your common group (YOUR_COMMON_GROUP in the example below), which need not be your primary group:
     chgrp -R YOUR_COMMON_GROUP /projects/MY_PROJECT/svnrepo
  • Change the permissions of the repository so your group has the same permissions as you:

    for perm in 4 6 7; do
        find /projects/MY_PROJECT/svnrepo -perm ${perm}00 \
            -exec chmod ${perm}${perm}0 {} \;
    done
  • Set the set-group-ID bit on all directories so that all new subdirectories and files will automatically belong to the appropriate group:
     find /projects/MY_PROJECT/svnrepo -type d -exec chmod g+s {} \;

    ** WARNING: It's against ARSC's security policy to set the SGID bit
    ** on any file.  You can set it on a directory as described here, but
    ** don't set it on anything else.
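
To see what the permission loop above actually does, here is a self-contained sketch that runs the same loop against a throwaway directory; /tmp/permdemo and its contents are made-up stand-ins for /projects/MY_PROJECT/svnrepo, so you can run it safely anywhere:

```shell
# Create a scratch tree that mimics a fresh repository:
# a mode-700 directory containing a mode-600 file.
rm -rf /tmp/permdemo
mkdir -p /tmp/permdemo/db
touch /tmp/permdemo/db/current
chmod 700 /tmp/permdemo /tmp/permdemo/db
chmod 600 /tmp/permdemo/db/current

# The loop from the step above: for each user permission level
# (r, rw, rwx), grant the group the same access.
for perm in 4 6 7; do
    find /tmp/permdemo -perm ${perm}00 \
        -exec chmod ${perm}${perm}0 {} \;
done

# Directories are now mode 770 and the file is mode 660.
ls -ld /tmp/permdemo/db/current
```

The exact-match semantics of find's -perm (400, 600, or 700 with no group or other bits) are what let the loop copy the user permissions onto the group without widening anything for other users.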

All developers in the project group are now ready to work with the repository, and can import files (see above).

Unlike $ARCHIVE, the project directory might not be available from all systems. In that case, you need to use svn+ssh type URLs to access the repository.


Introduction to strace, Part II

  [ by Craig Stephenson ]

In Part I of Introduction to strace, I used several examples to show how strace can help locate some of the more elusive problems you might encounter during program execution.

> /arsc/support/news/hpcnews/hpcnews361/index.xml#article1

All of the examples used in Part I dealt with diagnosing problems in the execution of a serial code. Chances are this tool would be just as useful, if not more so, for diagnosing problems in the execution of parallel code. This article illustrates how to use strace with MPI, as well as how to use strace to identify bottlenecks in a code.

To many of us, it is a bit of a mystery what happens to our jobs once they are submitted to the internal nodes of a cluster. Suppose you submit a job to run an executable that has already worked several times in the past. The executable has not changed, but for one reason or another, your new job ends prematurely and produces no useful output. What can you do? You can start by trying to piece together the implications of the vague, often ambiguous, output for the job. Unfortunately, this may not be sufficient to diagnose the problem.

You already know that the problem does not originate in your code because your code has already proved successful during previous runs. But don't despair... you have just stumbled upon an ideal use for strace!

strace can be used to monitor MPI applications by means of an intermediary script, such as the following shell script:

   #!/bin/sh
   # Run each MPI rank under strace, one trace file per rank.
   # $MPIRUN_RANK is set by the MPI environment (see below).
   STRACE_SUFFIX=`hostname`.${MPIRUN_RANK}
   exec strace -tt "$@" 2> strace.${STRACE_SUFFIX}

In this article, this script is referred to as mpi_strace. To use mpi_strace, include it on the mpirun line of your PBS script. For example:

   mpirun -np 4 ./mpi_strace ./a.out

The script distinguishes each rank of an MPI job by referencing the $MPIRUN_RANK environment variable, set by the MPI environment. Depending on which MPI stack you are using, there might be a different environment variable set (or none at all). MVAPICH (used on midnight) and MPICH (used on nelchina) both set the $MPIRUN_RANK environment variable for each task.
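
The suffix construction itself can be sketched in a few lines of portable shell. Here $MPIRUN_RANK is set by hand purely to simulate what mpirun would export for one of the ranks, and uname -n is used as a portable equivalent of hostname:

```shell
# Simulate the rank number that the MPI environment would export.
MPIRUN_RANK=2

# Build a per-rank suffix from the node name and the rank number.
STRACE_SUFFIX=$(uname -n).${MPIRUN_RANK}

# Each rank's trace lands in its own file, e.g. strace.mt003.2.
echo "strace.${STRACE_SUFFIX}"
```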

Each core involved in the execution of the job will have a unique rank identifier associated with it. The script utilizes these unique rank identifiers to produce separate strace output files for each rank of the MPI job, named after the rank that produced them. The mpirun example above, which uses four cores, might produce the following files (depending on the node it runs on):

   strace.mt003.0
   strace.mt003.1
   strace.mt003.2
   strace.mt003.3
In the script, "$@" holds the original command passed to the rank, including all arguments. This allows the original command issued by mpirun to pass through mpi_strace unchanged. Thus, mpi_strace is essentially transparent to mpirun. strace output is written to stderr, so 2> attempts to redirect strace output to a file while leaving the original command's output, written to stdout, unchanged. One unfortunate consequence of this approach is that if the original command writes anything to stderr, it will get written to the strace output file, not to the PBS output file.

Note that the node host names are included in the strace output file names. If your job spans multiple nodes, such as two four-core nodes, your output files might be named:

   strace.mt003.0
   strace.mt003.1
   strace.mt003.2
   strace.mt003.3
   strace.mt004.4
   strace.mt004.5
   strace.mt004.6
   strace.mt004.7
This is quite beneficial, as the MPI errors that accompany mysterious job behavior are often as cryptic as this:

  Some rank on 'mt003' exited without finalize.

In this case, your investigation should most certainly start by examining the last few lines of each of the strace.mt003.* files. I recommend using the tail command to print the last 20 lines of each strace file for the node:

  tail -n 20 strace.mt003.*

Also included in the above mpi_strace script is an option supplied to the strace command itself. The -tt option prefixes each line of strace output with the time of day, including microseconds. This can be useful to pinpoint oddities in your code's execution.

Additionally, if you would like to know how much time is spent in each kernel interaction, you can use the -T strace option, which appends the total amount of time spent in the system call for each line. However, if you are concerned about system bottlenecks as they apply to your code, the -c strace option may be even more useful. The -c option generates a summary of system calls, with timing information, after the execution of the original command has completed.

For example, the output of the -c strace option may look similar to this:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 61.40    0.536680        6793        79           write
 25.11    0.219472        1814       121           read
 12.38    0.108171         778       139       107 open
  0.50    0.004341          96        45           ioctl
  0.16    0.001407         469         3           mlock
  0.15    0.001282          85        15           munmap
------ ----------- ----------- --------- --------- ----------------
100.00    0.874034                   700       126 total

This summary output was generated by running the same dataProcessor program used for Part I of this article. In this particular example, I used dataProcessor to process a 617 MB file to and from $WRKDIR on midnight. Since strace tracks only the interactions between a process and the kernel, all time percentages are relative to the total time spent in system calls as tracked by strace, not accounting for CPU time. Another downside of the -c option is that it replaces all of the typical strace output with this short summary histogram.

In all honesty, strace is my last resort tool. When I have run out of ideas, whether I think it will help or not, I run strace. This usually proves fruitful. If strace does not give me the outright solution to the problem, it helps me brainstorm. Suspicious strace output is also incredibly valuable information to include in help requests to User Support.


Fall 2007: ARSC User Training

This fall, ARSC will once again be providing training in conjunction with Physics 693 - Core Skills for Computational Science. The Core Skills class is taught jointly by ARSC and the UAF Physics department and provides an introduction to the basic skills needed to work in a high performance computing environment. Individual lectures are open to ARSC users and will be held Tuesdays and Thursdays in West Ridge Research Building room 009, except where noted.

Lectures Include:

 September 11   Introduction to Unix
 September 13   Introduction to Linux Systems
 September 18   Data Management / Unix Scripting
 September 20   Viz 1: Integrated Data Viewer, Part 1
 September 25   Introduction to Fortran Part 1
 September 27   Introduction to Fortran Part 2
 October 2      IBM Introduction, Compilers and Makefiles
 October 4      IBM Loadleveler
 October 9      Viz 2: Integrated Data Viewer, Part 2
 October 11     Midnight SUN Introduction, Compilers & PBS (WRRB 010)
 October 16     Performance Programming, Part 1    
 October 18     Performance Programming, Part 2
 October 23     Viz 3: Importing Data and Graphics Formats
 October 25     Viz 4: Animation 101

For a complete schedule see:

What Like a Pirate?

September 19th is "International Talk Like a Pirate Day." The web site does have instructions to aid the land-locked:

Seems a reasonable excuse ter make a fool a yerself in public this month.


Quick-Tip Q & A

A:[[ Let's help out those university students heading to off-campus
  [[ housing.  What's your tip for impoverished students?

# Jim Williams
If you have roommates, don't put the phone in your name unless you
turn off long distance calling....

# Derek Bastille
My big tip is:  milk crates.  Yep, milk crates are your best buddy
as a student.  They can be used for shelves, storage, a TV stand,
computer desk, bicycle basket and many other things (I've still got
several from my old student days ;-)

... ahh milk crates ...

# Greg Newby
Resist the temptation to use a credit card unless you pay it off
every month.  Since college is a time many people get loans to pay
for their education, credit cards sometimes appear like another loan
towards academic and career success.  However, the very high interest
rate and deluge of credit card applications for students can leave you
a slave to monthly payments, without any more money in your pocket.
Perhaps even less money, since so much will go to making payments.

Instead, avoid debt other than needed student loans.  Any sort of loan
that you have to make payments on while still in school -- especially
credit cards -- will necessitate additional out-of-pocket expenses
while in school, to pay them off.

# Rich Griswold
I recently attended one of Dave Ramsey's Financial Peace University
seminars, and I learned a lot about managing money that I wish I had
known years ago when I was still an impoverished student.  Even if
you aren't making much money, he has great tips for making the money
you have work for you.  If you can't attend a seminar, listen to his
program on the radio or check out one of his books, such as "Financial
Peace Revisited".

# Suzanne Noll
A HUGE bag of rice!

And another tip: do NOT boil hot dogs in Boone's Farm..... 

# Don Morton
Best bargain in any town - Macaroni & Cheese and Bisquick.  You can
live for ages off that stuff at very low prices.

Also, y'all probably have rules about printing this sort of thing
in the ARSC Newsletter, but I should note that in ANY of this stuff,
you can use beer rather than water for cooking, and it's very good,
especially bisquick.

# Nathan Bills
Top Ramen is still 8 packages for a dollar :)

# Patrick Webb
Note that getting some frozen (or fresh) vegetables and throwing
them into ramen turns an ultra-cheapo and nutritionally dubious meal
into something much more satisfying and tasty at not much extra
cost. If you're feeling really adventurous, cook up some chicken
beforehand and throw that in. Ramen makes an excellent noodle soup base.

# John Chappelow
When I first moved to Fairbanks from UVermont (to do my PhD) I had
the idea I could exist as a grad student without a car (like I could
"back East"). Bad plan. Get a car. Or a Space-suit.


Seriously, the best advice I can give an incoming grad student (here
or elsewhere) is to (1) take enough loan to survive a year; (2) once
there, ASK FOR ADVICE from grad students that're already there; (3)
learn how to raid buffet tables, early and often; (4) do some serious
research before you rent an apartment or cabin--some places are so
disgusting (or dangerous) they're just not worth it. You might contact
current grad students in your department ahead of moving--some of
them might be willing to "take you in" for a week or two while you
look around.

# Ed Kornkven
(1) If you elect to put up a dartboard, be advised that the accuracy
    of the players goes down the farther one has to throw the darts,
    and the number of doorways the dart has to pass through.
(2) Your share of the damages at the end of the year will depend
    on how many people shared the apartment, not on what percentage
    of the darts you actually threw.
(3) Most people aren't very good at darts.

# Daniel Pringle
Add veggies to ramen. Ramen is cheap, but scurvy isn't. 


Student loans bite, but using cash rather than plastic helps. Take
out your cash for the week, and when it's done, it's done.

# Editor:
Crack an egg into "ramen with veggies," and you have egg-drop soup.
Double the recipe, and you have a meal.

# Martin King
Some people solved this problem by lateral thinking---they stopped
being students and later became some of the richest and most successful
people on Earth.  Think Bill Gates, Steve Jobs, Tiger Woods, Richard
Branson... In any case, you don't necessarily need a degree in order
to have a good life.

Q: i have a whole bunch of files with uppercase letters in the names.
   unfortunately, i do not care for uppercase letters.  could you please
   tell me how to rename the files in an automated fashion such that
   there are no uppercase letters in my filenames?

   here are some of the files i have to rename along with the name i 
   would like to use:
     EARTH001.out  is renamed to earth001.out
     Wind001.out  is renamed to wind001.out
     wATER001.out  is renamed to water001.out
     fiRE001.out  is renamed to fire001.out 

   i have no idea who came up with these filenames, but i do not like them.

[[ Answers, Questions, and Tips Graciously Accepted ]]

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions: Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.