ARSC HPC Users' Newsletter 368, August 24, 2007

HYSPLIT Modeling System, Seminar and Workshop, Sept 11-13

A public science seminar and hands-on workshop at ARSC will provide an in-depth introduction to HYSPLIT.

HYSPLIT is used to model the movement of dust, smoke and volcanic ash, as well as other types of trajectory and dispersion problems.

The seminar is for a general scientific audience, and open to the public. The hands-on workshop is geared towards computational scientists seeking first-hand experience with HYSPLIT. Both are free, but the workshop requires pre-registration.

For details and registration:

ARSC AccessGrid Remote Control Device Project - Part I

[ By Paul Mercer ]

A new ARSC undergraduate student project that uses the AccessGrid technology to control devices over IP is now operational. The ARSC Remote Control Device Project allows people from anywhere in the world to collaboratively manipulate bots and cameras to complete a predetermined task.

Visit the project's web page for more information, including a video of the bots in action:

The goals of the project include:

  • Expand and demonstrate the capabilities of the AccessGrid
  • Create a tool to practice collaboration skills
  • Create opportunities for students in Alaska and worldwide to learn about each other
  • Increase prospective students' interest in UAF as a great place to attend college
  • Expand ARSC's outreach
  • Create an ARSC student project

The current "task" is simple to state: users must move balls from one level to another. To test the users' skill and ingenuity, some obstacles have been built in. For instance, there is no ready-made way to get between the levels, so users must construct ramps from the resources on the table, and the absence of ramps is only the beginning. The bots have very limited functionality but can complete some very sophisticated maneuvers.

The project is in its first stages. Interested parties are encouraged to download and install the software and give it a try. A "how to" web page is at:

Later, organizations can become involved in task planning and bot building/programming, or host their own project.

The project's lead programmer is Devin Jones, a junior Computer Science student at UAF. Devin has expanded and improved the original AG Device Control Service, developed by the Australian National University. Other students contributing to the project are Beau Berryman, Farrin Reid, Kevin Galloway and James Halliday. ARSC has had a tradition of hiring graduate and undergraduate students since 1994. Since then, over 60 UAF students have exercised their work skills while making valuable contributions to ARSC.


In the second part of this series, I'll give you a taste of the programming that makes it all work.

Multicore and New Processing Technologies Review - Part I

[ Thanks to Greg Newby for this two-part review ]

On August 13-14, 2007, ARSC hosted a Symposium on Multicore and New Processing Technologies. This event brought together experts from the academic and HPCMP communities to focus on new and next-generation technologies. The main goal of the symposium was to review and assess these technologies and their promise for high performance computing (HPC) work.

Attendees came from ARSC, George Washington University (GWU), the Air Force Rome Laboratory, U.S. Army Engineer Research and Development Center, and Cray. The symposium schedule and presentation materials are online at:

The symposium included briefings on ARSC's recent studies and analysis of FPGAs, multicore processors, and new programming languages.

ARSC is a charter member in the NSF Center for High Performance Reconfigurable Computing (CHREC) and has an ongoing research collaboration with GWU, one of CHREC's partner institutions. Work with GWU for the past three years has included a focus on the use of field programmable gate arrays (FPGAs) for HPC.

This year, several GWU graduate students spent the summer at ARSC, continuing work on FPGAs, and adding emphasis on multicore processors, including the CELL processor. FPGA work has taken place on Nelchina, ARSC's Cray XD1 supercomputer. One of the goals for the summer was to work on software libraries that would make it easier for programmers to benefit from the faster performance that FPGAs can offer. Symposium attendees agreed on the need for libraries to ease adoption of new processing technologies.

Another new technology discussed was the use of graphics processing units (GPUs), as found in high-end graphics cards, for HPC work. Some of the most important libraries for HPC work (including BLAS and LAPACK) are already available for GPUs, which can offer impressive speedups for single precision floating point operations. With 128 processing cores, today's high-end GPUs can greatly outpace general-purpose CPUs, which have only one, two or four processing cores. However, there are bottlenecks in the use of GPUs, and they cannot handle the breadth of a typical HPC workload.

Another important limitation on the use of GPUs for HPC is their physical size. These are full-length cards, often taking two physical slots to accommodate their on-card fan and other components. Such cards are difficult or impossible to fit in typical HPC form factors, such as Midnight's x2200 compute nodes (each is under 2" tall). Nevertheless, the symposium attendees thought that the performance of the GPUs (in terms of FLOPS per Watt) will make them appealing for workstation-class computation. The future of GPUs for HPC work will largely depend on whether the limitations above are addressed before other solutions (FPGAs and larger-scale multicore processors) can outperform GPUs.

Thanks to the large market for high-end gaming and graphics work, there remains a likelihood of continued progress in GPUs. One of the most exciting developments is the emergence of new languages to make better use of GPUs, such as Nvidia's CUDA. These will help bring the power of GPUs to mainstream programmers.

The symposium had several presentations and discussions on today's multicore processors. Many computers today, from notebook computers to supercomputers, have dual core processors. ARSC's Midnight system is one such example. In the near future, quad core processors will become more prevalent. This trend is expected to continue for at least several years, as processor technology changes from increases in the clock speed of CPUs to increases in the number of processor cores on a single physical socket.

More processors on a socket generally means less power consumption per processor, and more processors in the same physical space -- these are of keen interest to supercomputing centers. But ARSC and others have found that doubling the number of processors on a socket doesn't mean you can get twice the computing power, at least for most HPC applications and benchmarks.

At the symposium, ARSC's Ed Kornkven presented findings of comparison studies of Midnight's smaller x2200 nodes to the larger x4600 nodes. The x2200 nodes have four processor cores on two sockets, while the larger nodes feature sixteen cores on eight sockets. Memory is consistent at 4GB per processor core. Across numerous HPC applications and benchmarks, the x2200 nodes were found to be more efficient per core.

There are a few exceptions to this finding, though. First, some applications really need the full 64GB of memory offered by the x4600 nodes. Such applications couldn't fit on the smaller node at all. Second, some applications have a more randomized pattern of access to memory and the InfiniBand interconnect, which seems to even out the discrepancies between the smaller and larger node types.
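As a quick check of the node figures above, per-node memory is simply the core count multiplied by 4GB per core. The shell sketch below works through that arithmetic; the function names are illustrative only and are not part of any ARSC tool.

```shell
# Memory per processor core on Midnight, in GB (figure quoted above)
GB_PER_CORE=4

# node_memory_gb CORES -> total memory per node, in GB
node_memory_gb() {
    echo $(( $1 * GB_PER_CORE ))
}

# cores_per_socket CORES SOCKETS -> cores on each socket
cores_per_socket() {
    echo $(( $1 / $2 ))
}

echo "x2200: $(node_memory_gb 4) GB per node, $(cores_per_socket 4 2) cores per socket"
echo "x4600: $(node_memory_gb 16) GB per node, $(cores_per_socket 16 8) cores per socket"
```

The x4600 total comes out to the 64GB quoted above, the x2200 works out to 16GB, and both node types have two cores per socket.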

The GWU graduate students were able to identify a key reason for the difference. GWU doctoral student Abdullah Kayi found that cache coherency checks, which happen for every access to main memory, needed to traverse additional 'hops' on the HyperTransport bus for the x4600s, but only needed one hop on the x2200s. ARSC is continuing to investigate these findings, to offer advice to users on which node type is best suited for particular jobs.


In the next issue, I'll present conclusions and recommendations made by symposium attendees.

Web Search Demystifies Pathscale Fortran Error Message

[ By Oralee Nudson ]

A user recently reported receiving the following PathScale Fortran error message when compiling on Midnight:

  pathf95 INTERNAL: Unable to open message system.

Unfortunately this error message didn't offer many clues as to its cause. Therefore, we stepped through the typical diagnostic sequence to identify the problem:

  • Is the correct pathscale module loaded? Yep, "module load pathscale-2.5".
  • Are other users able to replicate the same error message? Nope.
  • Does the error message persist after terminating the current shell session and logging back in? Yes.

Hmmm. It appeared that we had reached a dead end. Where else could we look? Lo and behold, a Google search dug up the solution we were looking for.

A search on the exact error message returned a link to the PathScale documentation. The "Installing the PathScale Compiler Suite" document presented the error message along with a simple way to fix it. In this case, the user had set his NLSPATH environment variable while working on an unrelated issue. Because the compiler uses the NLSPATH environment variable to locate its message catalogs, the user's setting had overridden the compiler's default and blocked it from reaching its "message system". By unsetting NLSPATH within his shell, the user was able to continue compilation and move forward.
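For readers who hit the same message, here is a minimal sketch of the check and the fix in a Bourne-style shell (sh, bash, ksh). The function names are made up for illustration. In csh or tcsh, the equivalent of "unset NLSPATH" is "unsetenv NLSPATH".

```shell
# check_nlspath: report whether NLSPATH is set in the current shell
check_nlspath() {
    if [ -n "${NLSPATH+x}" ]; then
        echo "NLSPATH is set to: ${NLSPATH}"
    else
        echo "NLSPATH is not set"
    fi
}

# without_nlspath CMD [ARGS...]: run one command with NLSPATH removed.
# The unset happens in a subshell, so the calling shell keeps its value.
without_nlspath() {
    ( unset NLSPATH; "$@" )
}

# Example usage (assumes pathf95 is on your PATH):
#   check_nlspath
#   without_nlspath pathf95 slv_part1.f90
```

Running a single command through the wrapper is handy when NLSPATH is still needed for the unrelated issue; if it is not, a plain "unset NLSPATH" in the shell does the job.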

Moral of the story? Don't forget to try a good old-fashioned web search when you receive an ambiguous error message.

Quick-Tip Q & A

A:[[ I can't seem to remember what optimization flags I used to compile
  [[ my program.  Is there any way for me to tell by just looking at the
  [[ executable?

  The answer to this question depends on the compiler that you
  are using.  The PathScale compiler comes with a utility called
  "pathhow-compiled" to let you see compile information in object files
  or executables.  I couldn't find any other compiler suites that have
  something like this, but that's not to say other examples don't exist.

  Here's an example of how pathhow-compiled works:

  # Compile options for an object file
  mg56 % pathhow-compiled slv_part1.o 
  EKOPath Compiler Version 2.5 compiled slv_part1.f90 with options: -TENV:PIC -O2 -march=opteron -msse2 -mno-sse3 -mno-3dnow -m64
  # Compile options for an executable
  mg56 % pathhow-compiled slv_part1   
  EKOPath Compiler Version 2.5 compiled slv_part1.f90 with options: -TENV:PIC -O2 -march=opteron -msse2 -mno-sse3 -mno-3dnow -m64
  EKOPath Compiler Version 2.4.99 compiled ../../libfoobar/fnord.c with options: -O3 -OPT:fast_stdlib=off -march=anyx86 -msse2 -mno-sse3 -mno-3dnow -m64
  EKOPath Compiler Version 2.4.99 compiled ../../libf/fio/main.c with options: -O3 -OPT:fast_stdlib=off -march=anyx86 -msse2 -mno-sse3 -mno-3dnow -m64
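If an executable was built from many source files, the output above can get long. The sketch below condenses each line to "source file: options" with sed. It assumes the output keeps the exact "compiled FILE with options: FLAGS" layout shown above, and the function name is hypothetical.

```shell
# summarize_flags: read pathhow-compiled output on stdin and print
# "source-file: options" for every line in the format shown above.
# Lines that do not match that format are silently skipped.
summarize_flags() {
    sed -n 's/^.* compiled \(.*\) with options: \(.*\)$/\1: \2/p'
}

# Example usage (assumes pathhow-compiled is on your PATH):
#   pathhow-compiled slv_part1 | summarize_flags
```

Piping the result through "sort -u" would additionally collapse duplicate entries.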

Q: Let's help out those university students heading to off-campus
   housing.  What's your tip for impoverished students?

[[ Answers, Questions, and Tips Graciously Accepted ]]

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.