ARSC T3E Users' Newsletter 131, November 26, 1997
Why a 1-PE NQS Queue
As noted in the last newsletter:
/arsc/support/news/t3enews/t3enews130/index.xml
the ARSC T3E now possesses a single-PE queue. The purpose of this queue is to run non-MPP tasks, such as creating tar files, thus freeing application PEs for parallel work.
In order to prevent NQS from locking application PEs unnecessarily, users should schedule large serial tasks to the single queue and not to blocks of application PEs. As an example of the problem, if a user requests a large number of processors and then, in the same qsub script, performs some post-processing activity on one command PE, NQS keeps a lock on the entire application PE block until the command PE job completes. The application PEs remain unnecessarily idle.
In the first example below, the user requests 42 PEs and uses them in a parallel application. Unfortunately, the 42 PEs are still held by the job, and remain idle, while the post_process_data script runs and a tarfile is created:
############################################################
#QSUB -q mpp -l mpp_t=14000 -l mpp_p=42
#!/bin/csh
echo ' started '
cd /the/working/directory
mpprun -n 42 ./a.out < input > output
./post_process_data < output    # WRONG !!!
tar cf results_my_run output    # WRONG !!!
echo ' completed '
############################################################
As a result, the user is billed for processors they are not actually using, and other parallel jobs must wait for free resources while the post-processing and tar operations complete. Hardly ideal.
A simple solution is to submit an NQS job to the single PE queue, from the first script, once the significant parallel computation is completed:
############################################################
#QSUB -q mpp -l mpp_t=14000 -l mpp_p=42
#!/bin/csh
echo ' started '
cd /the/working/directory
mpprun -n 42 ./a.out < input > output
echo ' completed '
qsub make_a_tarfile
############################################################
where make_a_tarfile contains:
############################################################
#QSUB -q mpp
#!/bin/csh
echo ' started making tarfile '
cd /the/working/directory
./post_process_data < output
tar cf results_my_run output
echo ' completed making tarfile '
############################################################
The single queue currently has a 4-hour run-time limit and allows many jobs to run at the same time, so users shouldn't see any great delay between the finish of the MPP job and the start of the single job.
As noted last week, to get NQS to route your job to the single queue, leave the PE and run-time limit requests out of your qsub script (as the above example shows). As always, the queues are under review and may change in response to our observations and your input.
ARSC Queue Policy
The policies for ethical queue behavior listed below are nearly identical to those that evolved over time on the T3D. They are complicated a bit by the single queue.
What follows is news queue_policy on yukon:
The T3E is a popular, limited resource. ARSC has found that T3E users are willing to work together to provide fair access to the queues. Please:

1) Submit/execute no more than one job per 8 or 4 hour queue at a time (with the exception of the "single" queue).
2) Do not submit/execute jobs in more than two different 8 or 4 hour queues at the same time (with the exception of the "single" queue).
3) Submit/execute no more than two jobs per 5 or 10 minute queue at a time.
4) Do not submit/execute more than five jobs at a time in the "single" queue.

As an example, if user "goodman" submitted a job to the m_32pe_4h queue, he/she would not submit another to this queue until the first had run to completion. Meanwhile, "goodman" could also have one job in the m_16pe_8h queue, two in the m_16pe_10m queue and even a couple in the single queue.

In general, please try to use as few processors as are necessary and be flexible in the number of processors with which your codes can run. This tends to increase the overall throughput, scheduling efficiency, and number of people able to use the system at a given time.

Please contact User Services (consult@arsc.edu or 907-450-8602) if you have any questions concerning this policy. Also, please contact us if you feel that the queues are being misused, and we will work to resolve the situation.

ARSC may hold or delete jobs that are submitted in violation of this policy.
As always, your input on the queues and queue policy is appreciated. Please let us know if you find yukon a reasonable machine on which to work.
Supercomputing '97
Supercomputing '97 was in San Jose, CA on November 15th-21st. Quite a number of people from ARSC were in attendance, and below we present a few highlights from the perspectives of Guy Robinson and Roger Edberg:
Hardware
There was much hardware on display on the various vendor stands. Cray/SGI presented the T3E-1200 system and a 128-node Origin 2000 working as a single system image. Tera was active, presenting benchmarks from a system during the show. Digital announced a new collaboration with Quadrics to build a cluster-based system using Meiko networking technology, and explained how this fits with their plans for a 100 Tflop system in the early part of the next century. All other vendors made a good display of systems being used by many customers with a wide range of software packages.
In the presentations, many speakers concentrated on the lack of improvement in the bandwidth of the processor-memory interface, the relative complexity of the numerous cache structures available, and the cost of programming to suit each. Solutions proposed range from a return to SIMD-like architectures, with large numbers of processors each with relatively small memories, to large-scale hierarchical memories with enormous switching networks controlling access.
Software
Software was also evident with a multitude of new applications being presented by vendors and independent research centres often through impressive visualisations. The simulations are becoming more interdisciplinary and the control and management of data is often a major part of a successful project.
Also, after speaking with researchers from several centres, it is clear that the need to justify and demonstrate that these improvements are worth the expense is becoming a major issue. Software, both systems software to support operational aspects and the development of next-generation applications, is of interest, and there is more activity in this area than in developing hardware and applications software at present. An exception to this is the OpenMP effort, which standardises shared-memory programming.
Speakers
A number of speakers discussed the socio-economic impacts of HPC systems. Several talks addressed the dramatic changes which might occur as the computing power is fully unleashed and scientists and engineers, along with other, less traditional compute-intensive disciplines, start to make use of HPC.
The keynote speaker, Paul Saffo, pointed out that technology changes never have the expected impact on society and that the results are unpredictable.
David Brin, a Sci-Fi author and technology observer, presented a talk which warned of the dangers of an information age where everyone can know everything about others. Instead of immense freedoms, it can actually provide terrible controls over how people behave, and a sociological byproduct might actually be less individual independence as more people conform.
Some speakers considered teraflops to be complete and were looking forwards to the advent of petaflop systems.
Booths
There were many new stands on the block. Windows NT seems to be coming of age, and clusters of these systems are attracting attention. Several stands had examples, and there was a tutorial presentation on the Beowulf project which detailed the important aspects to consider when building a cluster. The project has produced a super-cluster which performs at a peak of 10 Gflops. PGI presented some impressive performance figures for multiprocessor PC systems using HPF to express the parallelism with minimal code development/modification effort.
Visualization
A panel session on Thursday, November 20, was devoted exclusively to visualization. Panelists were Pat Hanrahan (Stanford), Paul Woodward (Minnesota), Chris Johnson (Utah), Tom DeFanti (U. Illinois at Chicago), and Philip Heerman (Sandia Labs). Talks and discussion focused on several topics:
- problems with processing large (terabytes) data sets for visualization,
- methods and problems with 'computational steering', using visualization to guide large-scale supercomputer calculations,
- methods and ideas for data filtering and feature identification, and
- integration of simulation and visualization codes.
The panel commented that visualization is rapidly converging with simulation--expect to see tight integration of visualization and application codes in the future. Several panelists presented animations from recent work. Woodward presented some particularly striking animations from gas dynamics simulations of red giant stars.
Many booths at the Supercomputing '97 exhibition had demonstrations of visualization software and hardware.
NCSA presented a number of 3D VR demonstrations during the SC'97 exhibition. Most notable were: (1) Cave5D, an implementation of the Vis5D visualization package using ImmersaDesk VR technology, and (2) The Virtual Temporal Bone, a VR world used to teach surgical students the anatomy of the temporal bone and associated structures in the human ear. The temporal bone demonstration included live interaction with a surgical instructor located in Chicago.
The scientific visualization team at NCAR presented a 20 minute videotape of CFD/climate research highlights using a 3D rear-screen projector with Crystal Eyes glasses. The program featured animation from the Vis5D package and NCAR's 'volsh' volume rendering software. The NCAR visualization team have added 3D capabilities to Vis5D; these 3D enhancements are scheduled for inclusion in the next official Vis5D release.
NASA showcased a number of visualization projects, including 3D CFD visualization (NAS) using an ImmersaDesk.
The ARSC booth presented a videotape of visualization highlights from the past year, as well as interactive demonstrations of a tsunami visualization and a polar ionosphere model.
Thanks To All
Again, Supercomputing '97 was a great experience, and our thanks go to all those involved in making this such a wonderful conference each year.
ARSC Thanksgiving Reading List
We polled ARSC staff for this optional holiday article, asking what they read on the plane back from SC97, or would have, had they been to SC97. So here are some books to consider, in no special order:
The Sparrow
Recently finished _The Sparrow_ by Mary Doria Russell.
Could be classified as Science Fiction by virtue of
being set in the future, but it would be very limiting
to classify it as just Sci-Fi. A very good read, _very_
thought provoking.
Reviving Ophelia
Currently reading _Reviving Ophelia_ by Mary Pipher. A
sociological review of what teenage girls face in our
culture. I'd recommend it for folks who have teenagers
or soon-to-be teenagers. May not be exactly what their
daughters face, but undoubtedly does reflect the
experiences of some of their peers and some of the
pressures of this culture.
The Bhagavad Gita
I just started the Bhagavad Gita. Were you going to
have a "philosophy" section?
Opera for Dummies
(comes with CD of your favorite hits)
The Structure of Scientific Revolutions by Thomas Kuhn
This is a classic which I already plan to reread.
Chaos: Making a New Science
by James Gleick. A "must read" for science/computer
types.
The Perfect Storm
by Sebastian Junger. Riveting story of a huge storm
over the N. Atlantic in 1991 which sank the
sword-fishing boat, the "Andrea Gail." Includes a lot of
technical information on meteorology, ship design, wave
formation, death by drowning, rescue at sea, etc...
as well as history of the New England fishing industry,
and the personal stories of the people and towns
involved. My father-in-law, an ex-commercial fisherman,
loved it, but says it was too critical of the fishermen.
The VRML 2.0 Sourcebook
by Ames, Nadeau and Moreland -- technical
Scientific Visualization
by Nielson, Hagen and Muller -- technical
Programming Python
by Mark Lutz -- technical
Forces in Motion
by Graham Lock -- biography of composer/saxophonist
Anthony Braxton
The Music of What Happens
by John Straley -- mystery by Sitka AK writer
The Atlas
by William T. Vollmann -- fiction
Kid Books! Tons of "Arthur", "Berenstain Bears", and Richard Scarry w/
the kids as well.
Vegetable Heaven
Mollie Katzen's Vegetable Heaven : Over 200 Recipes for
Uncommon Soups, Tasty Bites, Side-By-Side Dishes, and
Too Many Desserts (new cookbook by the author of The
Enchanted Broccoli Forest and The Moosewood Cookbook) -
I'm going to buy it for myself for Christmas.
The Radio version of the Hitchhiker's Guide to The Galaxy.
Peace on Earth by Stanislaw Lem and "R" is for Rocket by Ray Bradbury,
...some very different views of science fiction
Lunatic Cafe by L. Hamilton
...an adventure in between genres
Hal's Legacy
...in a technical frame of mind, Hal's Legacy by David
Stork shows the future in fact.
Xanth
I plan to read all 15+ Xanth series novels by Piers
Anthony. The kingdom of Xanth is a fantasy land full of
puns and magical beings. For example you pick light
bulbs from a light plant and your shoes from a shoe
tree. The last time I read the series there were only
13 books. I've lost count of the ones I haven't read.
The Long Walk
Captured by the Soviets in 1939, and falsely accused of
spying, Polish Lieutenant Slavomir Rawicz was
eventually sentenced to serve ten years in a labor camp
located near Yakutsk in Siberia. The title of this book
refers to the principal adventure in this affair,
Slavomir's escape from confinement along with five
comrades, and their year-long trek to freedom, by foot
through such unfriendly climes as Northern Siberia,
the Gobi desert, and the Himalayas. This book was an
interesting read.
Quick-Tip Q & A
A: {{ How do you invent and remember good passwords!? }}
# Here are reader responses and one from the editors. Thanks to all.
#
##############
#
# I come up with phrases, some that make no sense to anyone but me.
#
# Here are some examples of simple passwords (which I would consider
# unsafe, but which show the idea):
#
# Here.Ugo     -- Here you go
# tIme4achange -- Time for a change
# jUs.t4U      -- Just for you
#
# Typically longer phrases, more meaningless to others, and more 'hacked'
# up so there are fewer real characters (but something I can associate) are
# better.
#
# Also I combine things with the above like using key letters in phrases.
# A simple one would use the 1st letter in a phrase like:
# wAd.gth -- Movie title "When All Dogs Go to Heaven"
#
##############
#
# If I told you my method, would it not cease being a good method ??
#
##############
#
# Invent a standard replacement formula for some letter. For instance:
#
# k ===> *-k+
#
# Then when you need a new password think of a word which contains
# the letter:
#
# skier
# pokey
# York
# Kim&I
#
# This leads to the following passwords:
#
# s*-k+ier
# po*-k+ey
# Yor*-k+
# *-K+im&I
#
# Your fingers get really good at typing the formula part, "*-k+".
#
# If you ever want to change all your passwords at the same time you
# can leave the words the same and come up with new formulas, say:
#
# k ===> ^X+_ <or>
#
# k ===> -273
#
# (These, of course, are not my actual formulae... but you get the
# idea.)
#
##############
#
# I use the names of my officemates' ex-girlfriends, ex-boyfriends, and
# cars. :)
#
#
# OK, actually what I do is come up with phrases that have numbers in
# them. A dumb example would be "99 bottles of beer on the wall". A
# slightly better one would be "There must be 50 ways to leave your
# lover." Then I take the first letter of each word and the
# number as the password (99bobotw and tmb50wtlyl). These are easy
# to remember and are easy to say in your head as you're typing them
# in. But if you're like me and have to remember a different password
# for each platform you work on, and are constantly being forced to
# change them, you keep a secret list that maps host names to just the
# numerical portion of the password (99 and 50). It's your job to
# remember the phrase that goes with each number.
#
##############
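Two of the schemes described above (the replacement formula and the phrase with a number in it) are mechanical enough to sketch in code. The Python below is only an illustration of those two reader responses. The function names apply_formula and phrase_to_password are hypothetical, and the formula and phrases are the deliberately unsafe examples quoted above, not anything to use for a real password.

```python
def apply_formula(word, letter="k", template="*-{}+"):
    """Wrap every occurrence of `letter` (either case) in the formula.

    The default formula is the reader's example, k ===> *-k+.
    """
    pieces = []
    for ch in word:
        if ch.lower() == letter.lower():
            # Keep the case of the matched letter, so K becomes *-K+.
            pieces.append(template.format(ch))
        else:
            pieces.append(ch)
    return "".join(pieces)


def phrase_to_password(phrase):
    """Keep numbers whole and reduce every other word to its first letter."""
    pieces = []
    for word in phrase.split():
        token = word.strip(".,!?\"'")  # drop surrounding punctuation
        if not token:
            continue
        pieces.append(token if token.isdigit() else token[0].lower())
    return "".join(pieces)


if __name__ == "__main__":
    for word in ("skier", "pokey", "York", "Kim&I"):
        print(word, "->", apply_formula(word))
    print(phrase_to_password("99 bottles of beer on the wall"))
    print(phrase_to_password("There must be 50 ways to leave your lover."))
```

Changing every password at once, as the reader suggests, amounts to calling apply_formula with a different template while keeping the same words.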
Q: What is a good formula or algorithm for apportioning K equally sized,
independent tasks among M PEs on the T3E? For instance, if K==5 and
M==3, then the algorithm might return: N(0)=1, N(1)=1, N(2)=3, which
wouldn't be as good as if it returned: N(0)=2, N(1)=2, N(2)=1.
[ Answers, questions, and tips graciously accepted. ]
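One common way to approach the question above is to give every PE the whole-number quotient of K divided by M, then hand the leftover K mod M tasks, one each, to the lowest-numbered PEs. The Python sketch below illustrates that rule only and is not offered as the definitive answer. The function names tasks_for_pe and first_task_for_pe are hypothetical, and it assumes PEs are numbered from 0 to M-1 and that tasks are handed out in contiguous blocks.

```python
def tasks_for_pe(k, m, pe):
    """Number of the k tasks given to PE `pe` (0-based) out of m PEs."""
    if k < 0 or m <= 0 or not 0 <= pe < m:
        raise ValueError("need k >= 0, m > 0 and 0 <= pe < m")
    base, extra = divmod(k, m)
    # The first `extra` PEs each take one task beyond the base share.
    return base + (1 if pe < extra else 0)


def first_task_for_pe(k, m, pe):
    """Index of the first task owned by `pe` under contiguous blocks."""
    if k < 0 or m <= 0 or not 0 <= pe < m:
        raise ValueError("need k >= 0, m > 0 and 0 <= pe < m")
    base, extra = divmod(k, m)
    # Each lower-numbered PE holds `base` tasks, plus one extra task
    # for every PE numbered below `extra`.
    return pe * base + min(pe, extra)


if __name__ == "__main__":
    k, m = 5, 3
    for pe in range(m):
        print("N(%d)=%d, starting at task %d"
              % (pe, tasks_for_pe(k, m, pe), first_task_for_pe(k, m, pe)))
```

For K==5 and M==3 this gives N(0)=2, N(1)=2, N(2)=1. Because each PE needs only K, M and its own number, no communication is required to work out the split, and no two PEs differ by more than one task.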
Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
- Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
- Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
