ARSC HPC Users' Newsletter 209, December 1, 2000
- UAF Colloquium Series: Jon Genetti, Dec. 7
- Book Review: Parallel Programming in OpenMP
- Updated Reading List on High Performance Parallel Processing
- CUG SUMMIT 2001 Call For Papers: Deadline Dec. 8
- Quick Tip
UAF Colloquium Series: Jon Genetti, Dec. 7
The UAF Department of Mathematical Sciences and ARSC are jointly sponsoring a Mathematical Modelling, Computational Science, and Supercomputing Colloquium Series.
The schedule and abstracts for the '00-'01 academic year are available at:
The next presentation:
Bringing Space Into Focus
Dr. Jon Genetti
Computer Scientist
San Diego Supercomputer Center

Date: Thursday, December 7, 2000
Time: 2:00-3:00 PM
Location: Natural Sciences 202

ABSTRACT
The San Diego Supercomputer Center collaborated with the American Museum of Natural History to produce a visualization of the Orion Nebula for the new Hayden Planetarium.
During the Space Show, viewers are transported 1500 light years to the heart of the nebula on an 87-foot digital dome consisting of 9 million pixels. An alternate version was produced for flat displays and was recently shown at Siggraph in the Electronic Theater.
The talk will cover the following topics:
- overview of the project and technical challenges
- description of the visualization process that began with Hubble Space Telescope imagery and ended with simulated Orion Nebula images from any viewpoint
- development of the rendering software
- content generation for alternate (non-flat) displays
- current work of visualizing cells for cancer research
- future work of visualizing the Aurora Borealis
Jon Genetti is a Computer Scientist in the SDSC Scientific Visualization group. His current interests are designing and implementing algorithms for out-of-core visualization, volume rendering, and medical imaging.
Jon received his Ph.D. degree in Computer Science from Texas A&M University in 1993. Prior to arriving at SDSC in 1996, Jon spent three years as a Visiting Assistant Professor at the University of Alaska Fairbanks.
Book Review: Parallel Programming in OpenMP

[ Review by Tom Baring. ]

Parallel Programming in OpenMP
Chandra, Dagum, Kohr, Maydan, McDonald, Menon
ISBN 1-55860-671-8
Academic Press
Copyright 2001
230 pages
This is a great introduction to parallel programming; it provides both introductory and in-depth matter on the OpenMP API. It's well-written, enjoyable, and succeeds in the goal it sets in the Preface (p xiii):
"...the main information available about OpenMP [was] the OpenMP specification (available from the OpenMP Web site at www.openmp.org). Although this is appropriate as a formal and complete specification, it is not a very accessible format for programmers wishing to use OpenMP for developing parallel applications. This book tries to fulfill the needs of these programmers."
Unlike the authors of many programming texts, these authors don't bury the reader in code. There are examples throughout, but the big sections of code that dumb machines require and smart people find annoying are generally left out. In Chapter 2, we meet our first parallelized loop (p23):
        subroutine saxpy(z, a, x, y, n)
        integer i, n
        real z(n), a, x(n), y

  !$omp parallel do
        do i = 1, n
           z(i) = a * x(i) + y
        enddo
        return
        end
And we are treated to a clear, informative description (p24):
"These ... threads divide the iterations of the do loop among themselves, with each thread executing a subset of the total number of iterations. There is an implicit barrier at the end of the parallel do construct."
Did you know about the implicit barrier? The OpenMP standard is small, but there are important details like this on perhaps every topic.
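For readers who work in C, here is a minimal sketch of the same loop, offered as an illustration rather than as code from the book. The function names and signatures (saxpy, saxpy_sum) are assumptions made for this example; y is kept as a scalar to match the Fortran version above. The second function shows why the implicit barrier matters: the serial sum is only safe because every thread has finished the loop before execution continues.

```c
#include <assert.h>

/* C rendering of the saxpy loop quoted above: z[i] = a * x[i] + y,
 * with a scalar y as in the Fortran version.
 * The name and signature are illustrative assumptions. */
void saxpy(float *z, float a, const float *x, float y, int n)
{
    int i;

    assert(n >= 0);

    /* The iterations are divided among the threads of the team.
     * An implicit barrier follows the loop, so every z[i] has been
     * written before execution continues past the construct. */
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        z[i] = a * x[i] + y;
    }
}

/* Hypothetical helper: runs saxpy and then sums z serially.
 * Reading z here is safe only because of the implicit barrier
 * at the end of the parallel loop in saxpy(). */
float saxpy_sum(float *z, float a, const float *x, float y, int n)
{
    float sum = 0.0f;
    int i;

    saxpy(z, a, x, y, n);
    for (i = 0; i < n; i++) {
        sum += z[i];
    }
    return sum;
}
```

If the compiler is not given its OpenMP option, the pragma is ignored and the loop simply runs serially with the same result.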
There are chapters on loop-level parallelism, parallel regions, and synchronization. Each chapter describes the generic concepts (applicable in any programming model) and then focuses on OpenMP solutions for the applications programmer.
The final chapter, of interest to those with production codes, covers performance issues. It starts with Amdahl's law and factors (like load balance) that affect performance, and proceeds to issues specific to different architectures (cache vs vector, for instance). To whet your appetite (p203):
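As a reminder of what Amdahl's law says, here is a small sketch in C. It is not taken from the book; the function names (amdahl_speedup, amdahl_limit) are made up for this illustration, and the model assumes that a fraction p of the serial run time parallelizes perfectly with no overhead.

```c
#include <assert.h>

/* Amdahl's law: if a fraction p of the serial run time can be
 * parallelized perfectly over n processors, the best possible
 * speedup is 1 / ((1 - p) + p / n).
 * Function name is an illustrative assumption. */
double amdahl_speedup(double p, int n)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(n >= 1);
    return 1.0 / ((1.0 - p) + p / (double)n);
}

/* Upper bound on the speedup as n grows without limit: 1 / (1 - p).
 * The caller must pass p < 1. */
double amdahl_limit(double p)
{
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

With p = 0.9, sixteen processors give a speedup of about 6.4, and no number of processors can push it past 10. Load imbalance and the architecture-specific effects the chapter goes on to discuss only lower these figures.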
"We have given reasons why using more threads than the number of processors can lead to performance problems. Do the same problems occur when each parallel application uses less threads than processors, but the set of applications running together in aggregate uses more threads than processors?"
Perhaps 10% of the examples are in C/C++, and attention is paid to particular C++ issues. The Fortran concepts translate to C, however, so this shouldn't be a problem.
My biggest complaint is that the book ignores the existence of Fortran 90/95. I noted a total of one (1) example of Fortran 90 array syntax, nothing on modules or Fortran 90 intrinsics, and no discussion of Fortran 90 issues.
I give this book a strong recommendation. Co-editor Guy Robinson must like it too, as he's assigned it as a required text for his graduate course, "Parallel Programming for Scientists," next semester at UAF.
Updated Reading List on High Performance Parallel Processing

[ Many thanks to Guy Robinson for periodically sharing his reading list via this newsletter. Guy requests additional suggestions and reviews. ]
MPI information sources.
MPI: The Complete Reference. Snir, Otto, Huss-Lederman, Walker and Dongarra. MIT Press. ISBN 0 262 69184 1
(*) MPI: The Complete Reference, volume 2. Gropp et al. MIT Press. ISBN 0262571234
Using MPI. Gropp, Lusk, Skjellum. MIT Press. ISBN 0 262 57104 8
OpenMP information sources.
Parallel Programming in OpenMP. Chandra, Kohr, Menon, Dagum, Maydan, McDonald. Morgan Kaufmann. ISBN: 1558606718
Parallel Programming Skills/Examples.
Practical Parallel Programming. Gregory V. Wilson. MIT Press. ISBN 0 262 23186 7
Designing and Building Parallel Programs. Ian Foster. Addison-Wesley. ISBN 0 201 57594 9
  http://www.mcs.anl.gov/dbpp/
Parallel Computing Works! Roy D. Williams, Paul C. Messina (Editor), Geoffrey Fox (Editor), Mark Fox. Morgan Kaufmann Publishers; ISBN: 1558602534
  http://www.npac.syr.edu/copywrite/pcw/
An interesting review of programming languages can be found at
  http://www.cacs.usl.edu/~mccauley/survey/report1998/
Fortran, C, HPF, and other languages.
Fortran 90/95 Explained. Metcalf and Reid. Oxford Science Publications. ISBN 0 19 851888 9
Fortran 90 Programming. Ellis, Philips, Lahey. Addison-Wesley. ISBN 0-201-54446-6
Programmer's Guide to Fortran 90. Brainerd, Goldberg, Adams. Unicomp. ISBN 0-07-000248-7
The High Performance Fortran Handbook. Koelbel, Loveman, Schreiber, Steele, Zosel. ISBN 0-262-11185-3/0-262-61094-9
Parallel Programming using C++. G.V. Wilson and P. Lu. MIT Press. ISBN 0 262 73118 5
A Programmer's Guide to ZPL. Snyder. MIT Press. ISBN 0-262-69217-1
Scientific Visualisation, Overviews, Methodologies and Techniques. Nielson, Hagen and Muller. ISBN 0-8186-7777-5, IEEE order number BP07777.
Visual Explanations: Images and Quantities, Evidence and Narrative. Edward R. Tufte. ISBN: 0961392126
Envisioning Information. Edward R. Tufte. ISBN: 0961392118
The Visual Display of Quantitative Information. Edward R. Tufte. ISBN: 096139210
Numerical Recipes in Fortran 77 and Fortran 90: The Art of Scientific and Parallel Computing. William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery. Cambridge Univ Pr (Pap Txt); ISBN: 0521574404
Numerical Recipes Example Book (Fortran). William T. Vetterling, Saul A. Teukolsky, William H. Press. Cambridge Univ Pr (Pap Txt); ISBN: 0521437210
Numerical Recipes in C: The Art of Scientific Computing. William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery. Cambridge Univ Pr (Short); ISBN: 0521431085
Numerical Recipes Example Book (C). William T. Vetterling, Saul A. Teukolsky, William H. Press. Cambridge Univ Pr (Pap Txt); ISBN: 0521437202
Numerical Recipes in Fortran 90: The Art of Parallel Scientific Computing, Volume 2 of Fortran Numerical Recipes. Press, Teukolsky, Vetterling and Flannery. Cambridge U. Press, ISBN 0-521-57439-0, 1996.

Code can be downloaded (purchased) from http://nr.harvard.edu/nr/store. See also http://www.nr.com/nronline_switcher.html for web versions of the Numerical Recipes series of books for browsing.
A two-volume set: (1) High Performance Cluster Computing: Architecture and Systems, (2) High Performance Cluster Computing: Programming and Applications, (R. Buyya editor)...Prentice Hall PTR, Upper Saddle River, 1998.
  http://www.dgs.monash.edu.au/~rajkumar/cluster/index.html
In Search of Clusters, 2nd Edition. Gregory F. Pfister. Prentice Hall PTR, Upper Saddle River, 1998, ISBN 0-13-899709-8.
How to Build a Beowulf. Sterling, Salmon, Becker and Savarese. The MIT Press, 1999, ISBN 0-262-69218-X.
Techniques and Applications Using Networked Workstations and Parallel Computers. Wilkinson and Allen. Prentice Hall, Upper Saddle River, 1999, ISBN 0-13-671710-1.
Building Linux Clusters. Spector, O'Reilly. ISBN 1-56592-625-0.
Debugging and Performance Tuning for Parallel Computing Systems. Simmons et al.
Foundations of Parallel Programming, A Machine-independent Approach. Lewis.
The Clockwork Muse. Zerubavel. Harvard University Press. ISBN 0-674-13586-5
Background information/Fun Reading.
Hal's Legacy: 2001's Computer as a Dream and a Reality. ISBN 0 262 19378 7
High Performance Compilers for Parallel Computing. Michael Wolfe. Addison-Wesley. ISBN 0-8053-2730-4
Supermen. C.J. Murray. Wiley. ISBN 0 471 04885 2
The Victorian Internet: The Remarkable Story of the Telegraph and the Nineteenth Century's On-Line Pioneers. Tom Standage. Berkley Pub Group; ISBN: 0425171698.
Chocolat, Bean Trees, Candide, Children of God, and Troublesome Offspring of Cardinal Guzman. (Guzman and Children both have pre-requisite reading.)
CUG SUMMIT 2001 Call For Papers: Deadline Dec. 8
Final Call for Papers:
The form for submitting an abstract is at:
Quick-Tip Q & A
A:[[ According to hpm, my SV1 code gets 30 million cache hits/second. How
  [[ can I tell if this is really improving its performance?

  Disable caching, do another run, and compare.

  The "/etc/cpu" command can be used to disable data and/or instruction
  caching for a run of any program (see "man cpu"). For example, if your
  executable were named "a.out", the following SV1 command would run it
  with data caching disabled:

    /etc/cpu -m ecdoff ./a.out

  You can still use hpm to measure performance:

    hpm /etc/cpu -m ecdoff ./a.out

Q: My MPI code doesn't know until runtime how many messages the PEs
   will be exchanging. Given this problem, how can I match a receive
   to every send, as required by MPI?
[[ Answers, Questions, and Tips Graciously Accepted ]]
Ed Kornkven
ARSC HPC Specialist
ph: 907-450-8669

Kate Hedstrom
ARSC Oceanographic Specialist
ph: 907-450-8678

Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.