ARSC T3E Users' Newsletter 145, June 26, 1998

CUG Notes

ARSC staff members Kurt Carlson and Barbara Horner-Miller, who attended the Stuttgart CUG, were kind enough to share their notes with the T3E Newsletter. The first notes were written by co-editor Guy Robinson:

Guy's Notes:

SV1. What is it?

The SV1 is the latest member of Cray's vector supercomputer product line. Key features of the new line are:

Peak performance of 4 gigaflops from a single processor unit, which actually consists of four 1-gigaflop processors. These processors exploit vector cache memory which, with a suitable compiler, will increase effective memory bandwidth.
A symmetric multiprocessing architecture which builds larger systems by combining many cabinets. Scalable from entry-level single-cabinet systems of 32 gigaflops to multi-cabinet teraflop systems, with a current upper limit of 1 terabyte of memory.
A powerful suite of clustering tools to ease the management of such complex systems will also be provided.

Existing J systems can actually be upgraded to use the new fast processors, giving higher levels of performance but without some of the advanced clustering abilities, and Cray is expecting many sites to take this path.

With entry-level systems priced at $500,000 and an aggressive trade-in policy, it is clear this system is intended to be the mainstream of the SGI/Cray scientific product line. More on the SV1 can be found at:

http://www.cray.com/products/systems/craysv1/intro.html.

Software tools

Several SGI/Cray staff described what would be happening in the current programming environment in the coming months/years. Along with a continual set of fixes and improved compatibility between the various SGI/Cray systems, key points are:

T3E news

SGI/Cray predicted continuing sales for the T3E systems in the coming year and reported it to be one of the most successful MPP platforms ever produced, with several 1000+ node systems either in place or being delivered shortly, and many systems with over 500 nodes in production.

A new internal network is now available that increases bandwidth from 350 MBytes/sec to 420 MBytes/sec, but other hardware changes are unlikely unless there is demand from users. (The current hardware could accept a 750 MHz processor.)

A tutorial was held on the problems of scheduling MPP systems and several available products were reviewed. All present agreed there was no perfect solution and that there was much to be said for sites determining a best practice rather than seeking a holy grail.

Barbara's Notes:

Cray User Group (CUG) Report
Stuttgart Germany
June 14-18, 1998

This CUG was the first once-a-year-CUG and the first CUG with the new SGI management. It was the last CUG for several Cray Research corporate faces; fond farewells were bid to Bo Ewald and Irene Qualters. There was an election for four offices and a reorganization of the Special Interest Group (SIG) structure in addition to the regular general and parallel sessions.

CHANGING OF THE GUARD

Bo Ewald, Executive Vice President of SGI, and Irene Qualters, President of Cray Research, resigned from SGI a few weeks before the Stuttgart CUG. They attended the conference to say their good-byes to, and be honored by, CUG, an organization that had supported them, and been supported by them, for many years. The honoring of Bo and Irene took place at the Cray Reception on Monday night when they received tokens of remembrance presented by Gary Jensen, President of CUG. Bo was given a remote controlled Mercedes and Irene received a necklace and earring set. Both spoke briefly Monday night but their official good-byes were said at the close of the executive general session the next day.

Rick Belluzzo, President of SGI, addressed the Tuesday general session where he presented his direction for the company. He touched on the company definition (visualization, data management and computation), the market focus (time to insight), the product roadmap, operation/execution (clear responsibility and accountability through annual plans and tracking of metrics) and the business model (key industries). One of the more interesting slides in this series showed that supercomputers migrate downward: the supercomputers of yesterday are the servers of today and the desktops of tomorrow. Prior to addressing the General Session, Rick and others had held a news conference to formally announce the SV1, the J90 follow-on.

Beau Vrolyk and Earl Joseph II wrapped up the executive General Session with information on the SV1 announcement, followed by Q & A. SGI has orders for more than 500 processors. Specific information on the SV1 can be found on the SGI web pages beginning with:

http://www.cray.com/products/systems/craysv1/intro.html.

CUG ELECTION

Sally Haerer ran unopposed for CUG President. Sam Milosevich, ELILLY, won a very close race with Nick Cardo, SSD-SS, for Vice President. The race for Secretary, a position held for 11 years by Gunther Giorgi, GRUMANN, became dynamic when Gunther withdrew and Margaret Simmons, SDSC, petitioned onto the ballot. This resulted in a race between Margaret and Eric Greenwade, INEL. It was won by Margaret. Bruno Loepfe, ETHZ, won the race for Director of Europe over Michael Brown, EPCC. ??? The remaining members of the new Board of Directors were not up for re-election: Barbara Horner-Miller, ARSC, Treasurer; Shigeki Miyaji, CHIBA, Director of Asia; Barry Sharp, BCS, Director of the Americas. Gary Jensen, UIUCNCSA, completes the Board as Past President.

SIG REORGANIZATION

Following a recommendation by the Future of CUG Committee, the Board of Directors reorganized the SIGs into a two-tiered structure. Under the new structure, there are five Group SIGs, each of which comprises several Focus SIGs. The Board appointed Chairs for each of the five Group SIGs and for many of the initial Focus SIGs. On Thursday afternoon, the SIGs met to discuss organizational issues and to confirm the focus areas within each SIG. The SIG organization that will carry forward to the Minneapolis CUG next May is:

Computer Center Management Group Chair - Mike Brown
User Services Chair - Leslie Southern
Operations Chair -

Communications & Data Management Group Chair - Hartmut Fichtel
Mass Storage Chair -
Networking Chair - Hans Mandt

Operating Systems Group Chair - Chuck Keagle
UNICOS Chair - Ingeborg Weidl
IRIX Chair -
Security Chair - Virginia Bedford

Programming Environments Group Chair - Jeff Kuehn
Compilers & Libraries Chair - Hans-Hermann Frese
Software Tools Chair - Guy Robinson

High Performance Solutions Group Chair - Eric Greenwade
Applications Chair - Larry Eversole
Visualization Chair -
Performance Chair -

TIDBITS FROM SESSIONS


SESSION NOTES

Hardware:

Software:

Service:

Training:

Documentation:

Kurt's Notes:

Scheduling and Configuration Tuning for the T3E
Jim Grindle, SGI/Cray, Mgr U/mk Engineering jsg@cray.com

Tutorial covering GRM (global resource manager) and psched (political scheduler).
psched consists of GS (Gang Scheduler), LB (load balancer), and MUSE (multi-layered user scheduling environment). GS & LB have seen lots of fixes with 2.0.3 of U/mk.
Note: on Origin, a floating-point divide by zero does not cause a program fault by default.
Note comparisons for tuning parallel code for Origin.
Note F90 option: -PHASE:flist on (for debugging, presently undocumented).
See information on OpenMP (recommended standard).
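
The divide-by-zero note above can be illustrated with a minimal C sketch. It assumes IEEE 754 arithmetic with floating-point traps disabled, which is the default behavior the note describes; nothing in it is specific to Origin, and the function names are hypothetical. The point is that the division quietly yields an infinity (or NaN) and the program carries on, so code that relied on a fault must check the result itself.

```c
#include <math.h>

/* Divide without any guard. The volatile copies keep the compiler
 * from folding the division away at compile time. */
double unchecked_divide(double numerator, double denominator)
{
    volatile double n = numerator;
    volatile double d = denominator;
    return n / d;
}

/* Return 1 if q is an ordinary finite number, 0 if it is inf or NaN.
 * Calling this after a division catches what the hardware let through. */
int quotient_is_finite(double q)
{
    return isfinite(q) ? 1 : 0;
}
```

With traps disabled, unchecked_divide(1.0, 0.0) returns an infinity rather than stopping the program, and quotient_is_finite() reports 0 for it.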

General Session

SGI/Cray Monitoring Tools
Randy Lambertus, SGI/Cray,

Proposed future integration of support tools (e.g., vaporware):
Proposed to be available next year for Irix; written for NT as well.
Intent is to provide for U/mk (unknown: "Efforts all directed to IRIX now").

Industry Directions in Storage
Mike Anderson, SGI/Cray

Seagate & IBM are primary players in high performance disks. Quantum bought out by Matsushita (sp).
Market dominated by desktop (70% of units), roughly 17% is high performance and 13% mobile.
CD-RW likely to take over CD market by 2000-2001.
Industry has not accepted IBM SSA disks.
Fibre Channel has industry acceptance.
Capacities growing (expect 40gb drives by end of 1998).
MTBF measurement varies... for some it's when 2/3 have failed, for some it's measured by returned failed drives (many of which are just thrown away so by measurement they're still ticking); useful life expectancy is 5 years, but economic life may be less than actual life.
LTO (Linear Tape Open) (www.lto-technology.com) is a new emerging media type/standard... near term expect 100gb capacity, expect 800gb futures.
Super-DLT (100gb/cart) also should be out by 1999, SGI will support when available.
STK Eagle will be released June 1998, SGI will need 4 months for validation testing once released.

General Session

SGI/Cray Research Corporate Vision
Rick Belluzzo, CEO, SGI

Data management; Visualization; Computation. Want to dominate "Time to insight" (modeling and simulation). Operational changes to improve efficiency. Change business model (profitability). Execution & results: clear responsibilities and accountability.
See convergence of vector and traditional. 6 key industries: Mfg., Gov., Ent./Media, Energy, Science, Com.

SGI/Cray Research Corporate Operations Report
Beau Vrolyk, SGI

Review of SV1 announcements...
Already sold a $20M system; orders for 500 processors already.
Entry price of $0.5M (cheaper), based on J90 technology.
J90->SV1->SV1e->SV2.
Targeted core markets.
5x performance of J90.
5x price performance improvements over J90.

OpenMP Programming Model
Ramesh Menon, SGI/Cray
See: http://www.sgi.com/Technology/OpenMP and http://www.openmp.org

Motivation: no portable standard for shared memory parallelism, each vendor had proprietary SMP.
SGI is leading, joined by HP, Intel, Sun, IBM, DEC, etc.
Presently a consortium, incorporating as a non-profit.
Fortran v1.0 spec due out 10/98ish.
C/C++ v1.0 spec due out 8/98ish.
Validation suite for Fortran planned.
Salient features: fine & coarse parallelism; incremental parallelization; provide access to strengths of shared memory (e.g., avoid message passing); exploit cache coherent scalable hardware.
Interoperability: can mix with MPI and PVM.
shmem & pthreads not supported in initial version.
Due to architecture, will NOT be supported ever on T3E.

OpenMP: a Multitasking and Autotasking Perspective
Neal Gaarder, SGI/Cray

Direction towards OpenMP: standard, preferred alternative.
All IRIX compilers (7.2).
PVP 10.0.0.3 and PE 3.1.
Not for YMP (10.0 required) or T3E.
Conditional compilation:
#ifdef _OPENMP
!$, c$, *$ directives.
Conversion to OpenMP:
gradual (intermixing directives).
Paper has more details.
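
As a rough illustration of the conditional compilation and gradual conversion points above, here is a minimal sketch in C. It uses the `_OPENMP` macro and `#pragma omp` syntax from the OpenMP C/C++ specification; the function names are hypothetical and this is not taken from the talk or the paper. The same source builds with or without OpenMP enabled, which is what lets directives be added to a code one loop at a time.

```c
#ifdef _OPENMP
#include <omp.h>   /* only pulled in when the compiler enables OpenMP */
#endif

/* Report how many threads a parallel region may use; 1 in a serial build. */
int max_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Sum an array. A compiler without OpenMP ignores the pragma, so the
 * loop stays valid serial code before and after parallelization. */
double sum_array(const double *a, int n)
{
    double total = 0.0;
    int i;
#pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Only the loop carrying the pragma runs in parallel; the rest of the program is untouched, so other loops can be converted later.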

Update of System Management Software for Large Origin Systems
Dan Higgins, SGI/Cray

Data center quality and HPC functionality into IRIX.
Share II (fairshare) in IRIX 6.5.
Miser in 6.5 (miser API a future).
Checkpoint/Restart: CPR 1.0 in Irix 6.4.
CPR 1.1 w/6.4 update (pthreads and fixes).
CPR 1.2 w/6.5 (shmem).
Resource limits (udb-like capabilities).
Accounting 1H99 Cray-style project ids & reporting (csa).
Enhancements to Array Services (Irix clusters).
NQE (Daryl Coulthart):
"NQE will be stabilized at current release (3.3)"
Actively pursuing a partner (need to define scope).
Cray did NQE because they had to.
"NQE is mature".
Alternatives now: Codine, PVS, LSF, ...

Performance Tips for GigaRing Disk IO
Kent Koeninger, SGI/Cray

IPN:
Limit daisy chaining to a depth of 2.
Striping and chaining reduces performance.
JBOD gets ~50% of peak (recommend use of RAID).
DA-302 35mb/s sustained.
FCN:
240 mb/s read 160mb/s write, RAID (set of 5).
Sustained 48ish read, 32ish write.

BOF: T3E (Jim Grindle, SGI)

Cray Networking Update
Michael Langer, SGI/Cray mlanger@cray.com

New features and futures:

General Session

CUG Elections

Keynote Address:
25 Years of Computer Aided Engineering at Daimler Benz in Stuttgart, Germany
Michael Heib, Manager HWW

Does IT for others: 50% split of public (University) vs. industry at HWW.
Objectives:
More power at same cost; less operating personnel; smaller infrastructure costs; ability to solve very large problems;
bi-directional knowledge transfer between industry and University.
Academic & Industry: 2 very different cultures, took time to get it together.

Cellular IRIX: Plans & Status
Gabriel Broner, SGI/Cray

Running in house now.
Will be the resultant operating system for all (move from dual expertise to less duplication with more applications available).
Support for large systems: 64 to 4000 CPUs: fault tolerance, reliability, high-end features like checkpoints and accounting.
Support for server systems: 4-64 CPUs: general purpose workload; fault containment; different requirements from large... constant availability.
Architecture: scalability and fault containment.

IRIX Accounting Limits and UDB Functionality
Jay McCauley, SGI/Cray

Cray style accounting (CSA) & udb limits in early 1999.
Requirements:
tools to manage large configurations; richer accounting facilities, udb for limits mechanism.
Architecture:
New database subset of udb; Initialization via PAM module; kernel enforcement.
Features:
Based on "job" container vs. individual process.
Partial list: cpu time, memory, vm, file size, open files, #threads, core file size, ...
Provide data capture with basic reduction and reporting.

SIG: High Performance Computing

8 participants, most with performance interest. Many viewed performance here as application or algorithms oriented vs. capacity planning and data center management which may be covered by Group 1 (not clear). Lots of TBDs.

Reminder: Use qsub's "-l mpp_t" Option

The "-l mpp_t" option allows you to request a specific amount of time for your MPP job to run. You should request, as closely as possible without going under, the actual time your job needs, and not simply request the maximum possible time for a given queue. Realistic requests improve job scheduling both by NQS and by the real people who manage the system. (Yes, we look at the time requests!)

Smaller mpp_t requests have a chance of running sooner. At ARSC, this is especially true if the request is under 30 minutes (which puts it into one of the "Quick" queues, which have the highest priority).

For help, see "man qsub" or:

http://www.arsc.edu/support/howtos/usingnqs.html

For more on ARSC's checkpoint procedures and queue policy, see "news chkpnt_sched" and "news queue_policy" on yukon.

TARGET Follow-up

[ Thanks to the reader who sent this response to last week's article on "TARGET." ]

Regarding the TARGET environment variable, another possibility would be to use the following TARGET setting:

setenv TARGET cray-t3e,memsize=256M

See "man target".

Although your solution (TARGET=target) works for your machine (since all PEs contain identical amounts of memory), users of other T3Es, or your users if your T3E is eventually upgraded with PEs of differing size, may prefer this more general solution to allow them to specify which size PE to compile for.

Quick-Tip Q & A

A: {{ 
   
   In C, you don't need to specify the size of arrays at compile time
   (i.e., pointers are basically arrays).  So you could have a code
   fragment:

       double* x;
       double* y;

       for(i=0;i<SIZE;i++) {
               y[i] = alpha*x[i] + y[i];
       }

    How can you view C arrays in totalview? }}

  #   In the GUI version, double-click on the array you want to display.
  #   Then in the resulting data_object_window, double-click on "type"
  #   (or choose "Edit -> Type" from the menu). Then specify the type as
  #   <double>[SIZE]. The value of SIZE will probably need to be
  #   explicitly typed. For example, enter "<double>[100]" if SIZE ==
  #   100.
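
To make the situation in the question concrete, here is a minimal, self-contained C sketch of the same loop. The helper names and the idea of allocating the vectors on the heap are assumptions for illustration, not part of the original question. Because x and y are plain double pointers, nothing in their type records how many elements they point to, which is why the debugger needs to be told the length as described in the answer above.

```c
#include <stdlib.h>

/* Allocate n doubles, each set to value. Returns NULL on failure;
 * the caller frees the result. */
double *make_vector(size_t n, double value)
{
    double *v = malloc(n * sizeof *v);
    size_t i;
    if (v == NULL) {
        return NULL;
    }
    for (i = 0; i < n; i++) {
        v[i] = value;
    }
    return v;
}

/* y = alpha*x + y over n elements: the loop from the question.
 * x and y carry no length of their own; n is passed separately. */
void axpy(size_t n, double alpha, const double *x, double *y)
{
    size_t i;
    for (i = 0; i < n; i++) {
        y[i] = alpha * x[i] + y[i];
    }
}
```

If the vectors are created with make_vector(100, ...), then 100 is the length to enter when retyping y as an array in the data window.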


Q: A shell alias allows Unix users to create custom mnemonics and
  short-hands for commands or command strings. Two common aliases:


     alias ll='ls -lF'                 <korn shell syntax>

     alias mroe more                   <csh syntax>


  What's your favorite alias?


  (Send it in with a brief explanation. If you can't choose only one,
  send two--they're small.) 

[ Answers, questions, and tips graciously accepted. ]

 


Current Editors:
Thomas J. Baring ARSC Web Specialist ph: 907-450-8619
Donald Bahls ARSC User Consultant ph: 907-450-8674
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Contact:
Send comments and questions to the current editors using this Contact Form.