| Title: | Creating and Maintaining Effective Large-Scale Scientific Computing Applications. Balancing Ease-of-Use, Extensibility, and Performance Requirements |
|---|---|
| When: | Tuesday, January 13, 2004, 2:00-4:00pm |
| Location: | Butrovich 109, UAF Campus |
| Speaker: | Richard Barrett, Los Alamos National Laboratory |
| Abstract: | On October 2, 1992, President Bush signed into law the FY1993 Energy and Water Authorization Bill that established a moratorium on U.S. nuclear testing. President Clinton extended the moratorium on July 3, 1993. These decisions ushered in a new era in which the U.S. ensures confidence in the safety, performance, and reliability of its nuclear stockpile. The Advanced Simulation and Computing Program (ASCI) is an integral and vital element of our nation's Stockpile Stewardship Program. ASCI provides the integrating simulation and modeling capabilities and technologies needed to combine new and old experimental data, past nuclear test data, and past design and engineering experience into a powerful tool for future design assessment and certification of nuclear weapons and their components. These computational physics simulations consist of hundreds of thousands of lines of code, written using multiple programming languages. These codes must execute accurately, consistently, and efficiently on a variety of computing platforms throughout their multi-decade lifetimes. They must withstand the participation of many code developers, each of whom brings different skill sets to the project. They must adapt to dynamic user requirements, and thus be amenable to the inclusion of new algorithms and other improvements. Barrett's focus is on abstracting the necessary complexities of the distributed-memory parallel processing environment in a way that is natural to the code developer, yet enables the incorporation of sophisticated computer science ideas under the hood. Barrett will illustrate how these requirements have been managed by describing a variety of specific applications and computational kernels. These applications include hydrodynamic algorithms operating on unstructured and semi-structured dynamic meshes, various radiation transport approaches (Sn and Monte Carlo), and an approach to solving linear systems when the system properties are poorly understood. |
| About The Speaker: | Richard Barrett is co-author of "Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods" and a contributing programmer for the ASCI Stockpile Stewardship Program. Barrett has been a technical staff member at Los Alamos National Laboratory for the past ten years, contributing to a variety of physics simulation code development projects, mainly within the ASCI program. Prior to going to Los Alamos, he was a charter member of the Innovative Computing Laboratory at the University of Tennessee. |
[ Thanks to Kate Hedstrom for this article. ]
I am in the process of setting up a large model run using the ROMS model. One of the initialization steps is to interpolate a friend's model fields onto my grid. He supplied the Matlab routines for doing the job, but my grid is so large that Matlab ran out of memory, due to the 2 GB limit for 32-bit executables. (Let's not get into how I feel about depending on Closed Source tools, but concentrate instead on my solution.)
All of our grid, forcing, and initial conditions files are in the NetCDF format (references below). The model grid is Arakawa-C, meaning that variables are not all co-located on the grid, but the grid is a structured rectangle. A look at the header (ncdump -h) will show the dimensions:
netcdf NPAC_grid_10 {
dimensions:
        xi_psi = 2113 ;
        xi_rho = 2114 ;
        xi_u = 2113 ;
        xi_v = 2114 ;
        eta_psi = 1345 ;
        eta_rho = 1346 ;
        eta_u = 1346 ;
        eta_v = 1345 ;
Here, the four xi dimensions are all in the i direction, while the four eta dimensions are in the j direction. I figured that I could create smaller grids and process each one in turn, then merge them all together at the end.
My first attempt was to split it in half in both the i and j directions, making each a quarter of the original in size. It was still too big! I would have to try smaller pieces. Because the grid is staggered, I realized it would be easier to cut in just one direction. My favorite cutting tool is "ncks" from the NetCDF operators package. Here's the command to create one grid strip (I made 11 subgrids, from a to k).
ncks -d xi_rho,600,801 -d xi_psi,600,800 -d xi_u,600,800 \
-d xi_v,600,801 NPAC_grid_10.nc NPAC_grid_10d.nc
The last two arguments are the original and output NetCDF files. I'm splitting on all four of the i dimensions.
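The other ten commands follow the same pattern. As a rough sketch (not the script actually used for this run), the Python below generates all 11 ncks commands. It assumes that every strip starts at a multiple of 200, that interior strips keep the same one-point overlap as the "d" example above, and that the last strip runs to the end of each dimension. The stride and the treatment of the final strip are inferred from that single example, and the function name strip_commands is hypothetical.

```python
import string

# Sizes of the four xi dimensions, taken from the ncdump header above.
XI_SIZES = {"xi_rho": 2114, "xi_psi": 2113, "xi_u": 2113, "xi_v": 2114}
# Order in which the dimensions appear in the example command.
DIM_ORDER = ["xi_rho", "xi_psi", "xi_u", "xi_v"]


def strip_commands(stem="NPAC_grid_10", nstrips=11, stride=200):
    """Return one ncks command per strip (assumed layout, see text)."""
    commands = []
    for n in range(nstrips):
        start = n * stride
        hyperslabs = []
        for dim in DIM_ORDER:
            size = XI_SIZES[dim]
            # rho and v points have one more column than psi and u points.
            extra = 1 if dim in ("xi_rho", "xi_v") else 0
            if n == nstrips - 1:
                end = size - 1          # last strip: run to the final index
            else:
                end = start + stride + extra
            hyperslabs.append(f"-d {dim},{start},{end}")
        suffix = string.ascii_lowercase[n]  # a, b, ..., k
        commands.append(
            f"ncks {' '.join(hyperslabs)} {stem}.nc {stem}{suffix}.nc"
        )
    return commands


if __name__ == "__main__":
    for cmd in strip_commands():
        print(cmd)
```

The fourth command printed reproduces the "d" strip shown above, which is a quick check that the assumed pattern matches the one example given.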
The 11 subgrids thus created were finally small enough, and I processed them successfully with Matlab. This created 11 subgrids of the initial conditions, which then needed to be glued back together.
NCO can concatenate multiple records, but I don't think it can handle our case of patching these slightly overlapping grids. Instead, I turned to one of the many other packages with NetCDF support, the NCAR Command Language (NCL). First, I let Matlab create the large file with the right dimensions, although it didn't write the values into it. Here is part of the NCL code to glue "ubar", a variable on the xi_u points:
begin
; Create the file handles
ininame = "NP_10_ini_jan00.nc"
ncout = addfile(ininame, "rw")
in_1 = "NP_10_ini_jan00a.nc"
ncin_1 = addfile(in_1, "r")
:
in_11 = "NP_10_ini_jan00k.nc"
ncin_11 = addfile(in_11, "r")
; Read in the fields, including the empty big one
ubar = ncout->ubar
tmp_a = ncin_1->ubar
:
tmp_k = ncin_11->ubar
; Copy the subgrids into the big grid
ubar(:,:,0:200) = tmp_a
ubar(:,:,200:400) = tmp_b
:
ubar(:,:,2000:) = tmp_k
; Write out the resulting ubar
ncout->ubar = ubar
end
This needed to be repeated for each variable.
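The index bookkeeping is the same for every variable on the xi_u points. Here is a minimal sketch of that bookkeeping, using plain Python lists rather than NCL or NetCDF, with one row standing in for the last dimension of ubar. The helper name glue_row and the default stride of 200 are assumptions for illustration. Neighboring strips share one column, and the later strip simply overwrites it, as in the NCL assignments above.

```python
def glue_row(strips, total, stride=200):
    """Copy overlapping strips into one row of length `total`.

    Strip n is written starting at index n * stride. Where two strips
    overlap, the later strip wins, mirroring the order of the
    assignments in the NCL script.
    """
    row = [None] * total
    for n, strip in enumerate(strips):
        start = n * stride
        if start + len(strip) > total:
            raise ValueError(f"strip {n} does not fit in the row")
        row[start:start + len(strip)] = strip
    if any(value is None for value in row):
        raise ValueError("strips do not cover the whole row")
    return row


if __name__ == "__main__":
    # Toy example: stride 4, strips of 5 points sharing one boundary point.
    strips = [[0, 1, 2, 3, 4], [40, 5, 6, 7, 8], [80, 9, 10]]
    print(glue_row(strips, total=11, stride=4))
    # prints [0, 1, 2, 3, 40, 5, 6, 7, 80, 9, 10]
```

The two checks (a strip that does not fit, or a gap left uncovered) are the kind of mistake that is easy to make when the ranges are typed by hand for each variable.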
Finally, to make sure that the glue worked, I tried plotting the resulting fields with ncview. It failed to allocate enough memory for the 3-D fields, since it too was compiled 32-bit. The 2-D fields look reasonable, although they are too large for the whole grid to be plotted on my monitor. People working with small grids are so lucky!
In conclusion, NetCDF is a wonderful file format and there are many tools which can not only read and write these files, but also slice and dice them.
References:
http://www.unidata.ucar.edu/packages/netcdf/index.html
http://nco.sourceforge.net/
http://ngwww.ucar.edu/ncl/
http://meteora.ucsd.edu/~pierce/ncview_home_page.html
(Note, the tools mentioned in this article are not available on all ARSC platforms. Contact consult@arsc.edu if you can't find what you need.)
Beginning with the X1, Cray has adopted the generally accepted mechanisms for interlanguage communication (i.e., Fortran calling C functions or C calling Fortran routines). If you're porting to the X1 from another vendor's system, you may get away without making changes, but you'll have to modify your code if migrating from an earlier Cray. Interlanguage calls are a common porting issue, regardless of platform.
If porting from an earlier Cray to the X1, here are some specific concerns:
(However, unlike earlier Cray vector platforms, the X1 supports 32-
and 64-bit data types so you must verify that C and Fortran types are
compatible. If you casually switch between different ftn "-s
Migrating Applications to the Cray X1 System - S-2378-51
Chapter 4. Interlanguage Communications
(Note, the Cray manuals are available to current ARSC users only. Read "news documents" on klondike for the current URL, login, and password.)
It's handy to know what macros are predefined on a given system. Using pre-processor logic and cpp, you can use them to compile different bits of code based on operating system, architecture, vendor, etc.
Typically, predefined macros are tested using pre-processor #ifdef's. For instance, the code might print a message identifying the system on which it's running:
#ifdef _UNICOSMP
write (*,'(A)') "Good morning. I'm a Cray X1."
#endif
There are many pre-defined cpp macros, but the most important is probably that which simply identifies the system. For ARSC's current systems, here they are:
Macro Name     Defined as "1" on These Systems
============   ===============================
_AIX           IBM systems running AIX
_UNICOSMP      Cray X1
_CRAYSV1       Cray SV1 series
_CRAYT3E       Cray T3E series
_SX            Cray SX series
__sgi          SGI systems
Exhaustive lists of predefined macros are available as follows:
IBM, SGI:
  Execute this command:
  $ cpp -dM /dev/null
Cray (including the SX-6):
  Search the on-line manuals for the term "predefined macros".
(Note, these manuals are available to current ARSC users, only. Read "news documents" on klondike, chilkoot, yukon, or rimegate for the current URL, login, and password.)
For more, see Kate Hedstrom's recent 2-part series of articles, "Conditional Compilation", in issues #274 and #275.
No, this is true. A user called last week complaining about a buffer overflow that crashed his MPI performance analysis tool. The sys-admins have been debugging the core file, and they're starting to suspect it was the work of Buffer, the VAMPIR slayer.
A:[[ Are data written from a Fortran "implied do" incompatible with a
[[ regular "read"? If so, is there a way to make them compatible,
[[ without rewriting the code?
[[
[[ I just want to read data elements one item at a time from a
[[ previously written file. Here's a test program which attempts
[[ to show the problem:
[[
[[ iceflyer 56% cat unformatted_io.f
[[
[[ program unformatted_io
[[ implicit none
[[
[[ integer, parameter :: SZ=10000, NF=111
[[ real, dimension (SZ) :: z
[[ real :: z_item, zsum
[[ integer :: k
[[
[[ zsum = 0.0
[[ do k=1,SZ
[[ call random_number (z(k))
[[ zsum = zsum + z(k)
[[ enddo
[[ print*,"SUM BEFORE: ", zsum
[[
[[ open(NF,file='test.out',form='unformatted',status='new')
[[ write(NF) (z(k),k=1,SZ)
[[ close (NF)
[[
[[ zsum=0.0
[[ print*,"SUM DURING: ", zsum
[[
[[ open(NF,file='test.out',form='unformatted',status='old')
[[ do k=1,SZ
[[ read(NF) z_item
[[ zsum = zsum + z_item
[[ enddo
[[ close (NF)
[[
[[ print*,"SUM AFTER: ", zsum
[[ end
[[
[[ iceflyer 57% xlf90 unformatted_io.f -o unformatted_io
[[ ** unformatted_io === End of Compilation 1 ===
[[ 1501-510 Compilation successful for file unformatted_io.f.
[[ iceflyer 58% ./unformatted_io
[[ SUM BEFORE: 5018.278320
[[ SUM DURING: 0.0000000000E+00
[[ 1525-001 The READ statement on the file test.out cannot be completed
[[ because the end of the file was reached. The program will stop.
[[ iceflyer 59%
#
# Thanks to Jim Ott:
#
They are incompatible. When the write statement is used with the do
inside, the values are written out one after the other, as one line.
write(NF) (z(k),k=1,SZ) : one line of data
When the read statement is within a do loop, each read starts from a
new line. The file test.out has 1 line of data with SZ entries, not SZ
lines with 1 entry each. I am not sure how to correct the problem w/o
rewriting the code. The simplest method would be to read them in as
one line, taking the read outside the do loop.
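
To see why the loop of reads runs off the end of the file, it helps to look at how a sequential unformatted file is laid out. Each write statement produces one record (the "line" in the answer above), and each read consumes one whole record, even if it asks for only one value. The Python sketch below imitates this behavior. It assumes the common, but compiler-dependent, layout in which every record is bracketed by 4-byte length markers; the Fortran standard does not specify the layout, the little-endian byte order is chosen arbitrarily, and the helper names are hypothetical.

```python
import io
import struct


def write_record(stream, values):
    """Write one unformatted record of 4-byte reals (assumed layout)."""
    payload = struct.pack(f"<{len(values)}f", *values)
    marker = struct.pack("<i", len(payload))
    stream.write(marker + payload + marker)


def read_record(stream):
    """Read one whole record; return its reals, or None at end of file."""
    head = stream.read(4)
    if len(head) < 4:
        return None
    (nbytes,) = struct.unpack("<i", head)
    payload = stream.read(nbytes)
    stream.read(4)                      # skip the trailing length marker
    return list(struct.unpack(f"<{nbytes // 4}f", payload))


if __name__ == "__main__":
    data = [0.5, 1.5, 2.5]
    f = io.BytesIO()
    write_record(f, data)               # like: write(NF) (z(k),k=1,SZ)

    f.seek(0)
    first = read_record(f)              # first read(NF) consumes the record
    second = read_record(f)             # second read(NF): nothing left
    print(first[0], second)             # prints: 0.5 None
```

In the test program, the first read(NF) z_item picks up z(1) and skips the rest of the record, so the second read finds no record and fails. Reading with the same implied do that was used for the write, read(NF) (z(k),k=1,SZ), takes the whole record in one statement.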
#
# ... BONUS ANSWER ...
#
# Here's yet another slick solution to the grep+find question.
# Thanks to Dale Clark:
find . -name "*.f" -exec grep -i flush6 {} \; -print
# The trick is putting "-print" last...
#
# When "grep" matches something, it returns 0. When the value
# returned to "-exec" is 0, then "-exec" returns TRUE and find must
# evaluate its next expression, "-print". Thus, the file name is
# only printed when the file contains a match.
Q: My overworked nephew, the elementary school teacher, emailed yet
another arithmetic assignment to his 6th graders without knowing the
answers first. I'd like to help him out. Is there a quick way I can
compute the answers for him? The problems all have the same basic
format, as shown in this sample from his ASCII email:
myworkstation$
myworkstation$ tail -n 12 mathproblems.txt
14) Division problems:
56 / 2.34 = ________
2.34 / 56 = ________
15) Multiplication:
33.3 * 12.3 = ________
12 * 22 * 8 = ________
33.3 * 12.3 * 12 * 22 * 8 = ________
Extra Credit:
1 + 2 + 3 + (4) * 2 = ________
1 + 2 + (3 + 4) * 2 = ________
1 + (2 + 3 + 4) * 2 = ________
[[ Answers, Questions, and Tips Graciously Accepted ]]
Contact:
Thomas J. Baring, ARSC Web Specialist, ph: 907-450-8619
Donald Bahls, ARSC User Consultant, ph: 907-450-8674
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Arctic Region Supercomputing Center
PO Box 756020, Fairbanks, AK 99775
voice: 907-450-8600