ARSC T3E Users' Newsletter 145, June 26, 1998

CUG Notes

ARSC staff members Kurt Carlson and Barbara Horner-Miller, who attended the Stuttgart CUG, were kind enough to share their notes with the T3E Newsletter. The first notes were written by co-editor Guy Robinson:

Guy's Notes:

SV1. What is it?

The SV1 is the latest member of Cray's vector supercomputer product line. Key features of the new line are:

Peak performance of 4 gigaflops from a single processor unit, which actually consists of four 1 gigaflop processors. These processors exploit vector cache memory which, with a suitable compiler, will increase effective memory bandwidth.

A symmetric multiprocessing architecture which builds larger systems by combining many cabinets. It scales from entry-level single-cabinet systems of 32 gigaflops to multi-cabinet teraflop systems, with a current upper limit of 1 terabyte of memory.

A powerful suite of clustering tools to ease the management of such complex systems will also be provided.

Existing J systems can actually be upgraded to use the new fast processors, giving higher levels of performance but without some of the advanced clustering abilities, and Cray is expecting many sites to take this path.

With entry-level systems priced at $500,000 and an aggressive trade-in policy, it is clear this system is intended to be the mainstream of the SGI/Cray scientific product line. More on the SV1 can be found at:

http://www.cray.com/products/systems/craysv1/intro.html.

Software tools

Several SGI/Cray staff described what would be happening in the current programming environment in the coming months/years. Along with a continual set of fixes and improved compatibility between the various SGI/Cray systems, key points are:

  • improved totalview debugger which will allow the inspection of message queues etc.
  • a new set of scientific libraries with a clearer structure overall which will ease user confusion over which libraries contain which algorithm and which are parallelized for which architecture.
  • in the longer term a single workshop of tools with a similar general look and feel across all platforms.

T3E news

SGI/Cray predicted continuing sales for the T3E systems in the coming year and reported it to be one of the most successful MPP platforms ever produced with several 1000+ node systems either in place or being delivered shortly and many systems with over 500 nodes in production.

A new internal network is now available which increases bandwidth from 350 MBytes/sec to 420 MBytes/sec, but other hardware changes are unlikely to occur unless there is demand from users. (The current hardware could accept a 750 MHz processor.)

A tutorial was held on the problems of scheduling MPP systems and several available products were reviewed. All present agreed there was no perfect solution and that there was much to be said for sites determining a best practice rather than seeking a holy grail.

Barbara's Notes:

Cray User Group (CUG) Report Stuttgart Germany June 14-18, 1998

This CUG was the first once-a-year-CUG and the first CUG with the new SGI management. It was the last CUG for several Cray Research corporate faces; fond farewells were bid to Bo Ewald and Irene Qualters. There was an election for four offices and a reorganization of the Special Interest Group (SIG) structure in addition to the regular general and parallel sessions.

CHANGING OF THE GUARD

Bo Ewald, Executive Vice President of SGI, and Irene Qualters, President of Cray Research, resigned from SGI a few weeks before the Stuttgart CUG. They attended the conference to say their good-byes to, and be honored by, CUG, an organization that had supported them, and been supported by them, for many years. The honoring of Bo and Irene took place at the Cray Reception on Monday night when they received tokens of remembrance presented by Gary Jensen, President of CUG. Bo was given a remote controlled Mercedes and Irene received a necklace and earring set. Both spoke briefly Monday night but their official good-byes were said at the close of the executive general session the next day.

Rick Belluzzo, President of SGI, addressed the Tuesday general session where he presented his direction for the company. He touched on the company definition (visualization, data management and computation), the market focus (time to insight), the product roadmap, operation/execution (clear responsibility and accountability through annual plans and tracking of metrics) and the business model (key industries). One of the more interesting slides in this series showed that supercomputers migrate downward: the supercomputers of yesterday are the servers of today and the desktops of tomorrow. Prior to addressing the General Session, Rick and others had held a news conference to formally announce the SV1, the J90 follow-on.

Beau Vrolyk and Earl Joseph II wrapped up the executive General Session with information on the SV1 announcement followed by Q & A. SGI has orders for more than 500 processors. Specific information on the SV1 can be found on the SGI web pages beginning with:

http://www.cray.com/products/systems/craysv1/intro.html.

CUG ELECTION

Sally Haerer ran unopposed for CUG President. Sam Milosevich, ELILLY, won a very close race with Nick Cardo, SSD-SS, for Vice President. The race for Secretary, a position held for 11 years by Gunther Giorgi, GRUMANN, became dynamic when Gunther withdrew and Margaret Simmons, SDSC, petitioned onto the ballot. This resulted in a race between Margaret and Eric Greenwade, INEL. It was won by Margaret. Bruno Loepfe, ETHZ, won the race for Director of Europe over Michael Brown, EPCC. The remaining members of the new Board of Directors were not up for re-election: Barbara Horner-Miller, ARSC, Treasurer; Shigeki Miyaji, CHIBA, Director of Asia; Barry Sharp, BCS, Director of the Americas. Gary Jensen, UIUCNCSA, completes the Board as Past President.

SIG REORGANIZATION

Following a recommendation by the Future of CUG Committee, the Board of Directors reorganized the SIGs into a two-tiered structure. Under the new structure, there are five Group SIGs, each of which comprises several Focus SIGs. The Board appointed Chairs for each of the five SIGs and for many of the initial Focus Groups. On Thursday afternoon, the SIGs met to discuss organizational issues and to confirm the focus areas within the SIG. The SIG organization which will carry forward to the Minneapolis CUG next May will be:

  • Computer Center Management Group Chair - Mike Brown
    • User Services Chair - Leslie Southern
    • Operations Chair -
  • Communications & Data Management Group Chair - Hartmut Fichtel
    • Mass Storage Chair -
    • Networking Chair - Hans Mandt
  • Operating Systems Group Chair - Chuck Keagle
    • UNICOS Chair - Ingeborg Weidl
    • IRIX Chair -
    • Security Chair - Virginia Bedford
  • Programming Environments Group Chair - Jeff Kuehn
    • Compilers & Libraries Chair - Hans-Hermann Frese
    • Software Tools Chair - Guy Robinson
  • High Performance Solutions Group Chair - Eric Greenwade
    • Applications Chair - Larry Eversole
    • Visualization Chair -
    • Performance Chair -

TIDBITS FROM SESSIONS

  • CUG has 195 members, $193,869.21 in the bank and 271 attendees in Stuttgart
  • Future CUGs are slated for
    • Minneapolis MN, USA for May 24-28, 1999
    • Noordwijk, NL for May 22-26, 2000
  • An Origin 2000 meeting will be sponsored by CUG this fall
  • The SV1 is not IEEE-based but the SV2 will be
  • Walter Wehinger was named Chief Information Manager for CUG
  • SGI will take a global view for training and try to put together classes that won't have enough participants at the regional level

SESSION NOTES

Hardware:

  • The SGI hardware organization is split into two parts: Vector Supercomputing Development under Steve Oberlin will concentrate on the T90, T3E, J90, SV1, SV2 and GigaRing; the Advanced Systems Development under Rick Barr will concentrate on the SN1, SN2 and XIO products. The support organizations span the Mountain View and Chippewa Falls sites with Barr's organization located in both sites.
  • The SV1 has multi-streaming. The SV1 and SV2 are expected to move forward at Moore's law or super Moore's law rates. The SV1e will have faster processors, reduced interconnect latency, increased GigaRing interconnect speed and higher bandwidth.
  • The name Cray will continue to be applied to high-end, computational products, e.g., Cray SV1.
  • The SN1 employs 2nd generation Origin DSM architecture, a more scalable router and more ports. Each link is twice the speed of the Origin or the T3E router. It's scalable to a thousand processors.
  • The SN2 is the 3rd generation. It will use the Merced follow-on from Intel and will have a faster hub. It will employ flexible network architecture and has an adjustable balance of PEs and routers to fit the applications. It will be air-cooled.

Software:

  • UNICOS 10 is Y2K compliant; it will have updates released on a 3-6 month interval; there will not be a UNICOS 11.
  • The SV1 will be supported through UNICOS updates.
  • UNICOS/mk is at the 2.0 level and will be updated on a 6-9 month interval with weekly archives. UNICOS/mk will be active until 2000 and maintained until 2004. Cray believes the software MTTI of UNICOS/mk to be more than 3000 hours. psched was added in 2.0.2. 2.0.3 brought the prime job concept, improvement to swap and the implementation of express message queues. Future activity includes the migration and checkpointing of swapped jobs as well as DCE and DFS implementation.

Service:

  • There are more than 2000 employees in the SGI service area with a 2% turnover rate. While a few locations and classifications are difficult to recruit, in general they find recruitment easy. The role of the local Support Manager is customer satisfaction, account management, and the work environment and morale of their employees. SGI is striving for a common support environment and tools between IRIX and UNICOS.

Training:

  • Leslie Southern described how OSC takes a 2-day, instructor-led training course and makes it available on the Web for self-paced instruction. The instructor puts materials on the Web in his/her preferred format and they are converted to html. The instructor uses a wireless microphone and a projection system. Real audio and video are recorded using the Web Lecture System, WLS from NC State University. The resulting class can be viewed as a class (sound and video), a review (printable notes) or sound (lecture only) on the Web.

Documentation:

  • Lynda Lester gave pointers on how to use the Web effectively; she had lots of examples, of both good and bad usage, to emphasize her points. Among the suggestions: put the important stuff at the top, and watch out for platform-specific gotchas such as monitor resolution, color differences, tables and line spacing. Provide the viewer with alternative ways to navigate through the document: Table of Contents, Article Index and Search Engines are a few. Viewers would rather scroll than click, so make links predictive (links with more words are better than shorter, more cryptic links).

Kurt's Notes:

Scheduling and Configuration Tuning for the T3E Jim Grindle, SGI/Cray, Mgr U/mk Engineering jsg@cray.com

Tutorial covering GRM (global resource manager) and psched (political scheduler).

psched consists of GS (Gang Scheduler), LB (load balancer), and MUSE (multi-layered user scheduling environment). GS & LB have seen lots of fixes with 2.0.3 of U/mk.

  • Futures:
    1. looking at 'exit' for GRM making external requests;
    2. GRM queue: prioritization, starvation
    3. single PE applications
    4. optimization

Performance Optimization for the Origin 2000 Jeff Brooks, SGI/Cray Benchmarking Department, jpb@sgi.com

Note that on the Origin a floating-point divide by zero does not cause a program fault by default. Note the comparisons for tuning parallel code for the Origin. Note the F90 option: -PHASE:flist on (for debugging, presently undocumented). See the information on OpenMP (the recommended standard).

General Session

  • 272 Attendees, 80 SGI employees remainder split U.S. vs. Europe|Pacific.
  • 195 CUG members (increase of 1).
  • May 24-28, 1999 in Minneapolis.
  • May 22-26, 2000 in Noordwijk, Netherlands
  • See http://scv.bu.edu/SCV/Origin2000 .
  • Future of CUG: 12 SIGs to 5 Super-SIGs:
    1. Reduce overlap
    2. Facilitate technical program
    3. Involve more CUGgers
    4. Flexibility
    5. Group Chairs with subordinate Focus Chairs

SGI/Cray Monitoring Tools Randy Lambertus, SGI/Cray, rl@sgi.com

Proposed future integration of support tools (i.e., vaporware):

  • SSS: System Support Software, 3 command sets:
    1. System Support Manager: Command Module, Decision Support, Monitor & Notify, Event Handler, Support Database
    2. System Group Manager: Manage multiple systems... Group event tracking, Config mgt., Availability monitor, Notification based on group system events.
    3. System Support Console: gui and ascii interfaces; launch and configure; control notifiers and reports.

Proposed to be available next year for Irix; written for NT as well. Intent is to provide for U/mk (unknown: "Efforts all directed to IRIX now").

Industry Directions in Storage Mike Anderson, SGI/Cray

Seagate & IBM are primary players in high performance disks. Quantum bought out by Matsushita (sp). Market dominated by desktop (70% of units), roughly 17% is high performance and 13% mobile. CD-RW likely to take over CD market by 2000-2001. Industry has not accepted IBM SSA disks. Fibre Channel has industry acceptance. Capacities growing (expect 40gb drives by end of 1998). MTBF measurement varies... for some it's when 2/3 have failed, for some it's measured by returned failed drives (many of which are just thrown away so by measurement they're still ticking); useful life expectancy is 5 years, but economic life may be less than actual life. LTO (Linear Tape Open) (www.lto-technology.com) is a new emerging media type/standard... near term expect 100gb capacity, expect 800gb futures. Super-DLT (100gb/cart) also should be out by 1999, SGI will support when available. STK Eagle will be released June 1998, SGI will need 4 months for validation testing once released.

General Session

  • Bo Ewald (SGI COO) and Irene Qualters (Cray Research President) are leaving.

SGI/Cray Research Corporate Vision Rick Belluzzo, CEO, SGI

  • SGI Focus

Data management; Visualization; Computation. Want to dominate "Time to insight" (modeling and simulation). Operational changes to improve efficiency. Change business model (profitability). Execution & results: clear responsibilities and accountability.

  • HPC

See convergence of vector and traditional. 6 key industries: Mfg., Gov., Ent./Media, Energy, Science, Com. Execution & results: clear responsibilities and accountability.

  • CUG - SGI committed to CUG.

SGI/Cray Research Corporate Operations Report Beau Vrolyk, SGI

Review of SV1 announcements... Already sold a $20m system, orders for 500 processors already. Entry price of $.5m (cheaper), based on J90 technology. J90->SV1->SV1e->SV2. Targeted core markets. 5x performance of J90. 5x price performance improvements over J90.

OpenMP Programming Model Ramesh Menon, SGI/Cray menon@sgi.com See: http://www.sgi.com/Technology/OpenMP and http://www.openmp.org

Motivation: no portable standard for shared memory parallelism, each vendor had proprietary SMP. SGI is leading, joined by HP, Intel, Sun, IBM, DEC, etc. Presently a consortium, incorporating as a non-profit. Fortran v1.0 spec due out 10/98ish. C/C++ v1.0 spec due out 8/98ish. Validation suite for Fortran planned.

Salient features: fine & coarse parallelism; incremental parallelization; provide access to strengths of shared memory (e.g., avoid message passing); exploit cache coherent scalable hardware.

Interoperability: can mix with MPI and PVM. shmem & pthreads not supported in the initial version.

Due to architecture, will NOT be supported ever on T3E.

OpenMP: a Multitasking and Autotasking Perspective, Neal Gaarder, SGI/Cray

Direction towards OpenMP: standard, preferred alternative. All IRIX compilers (7.2). PVP 10.0.0.3 and PE 3.1. Not for YMP (10.0 required) or T3E.

Conditional compilation: #ifdef _OPENMP; !$, c$, *$ sentinels.

Conversion to OpenMP: gradual (intermixing directives).

Paper has more details.

Update of System Management Software for Large Origin Systems Dan Higgins, SGI/Cray

Data center quality and HPC functionality into IRIX. Share II (fairshare) in IRIX 6.5. Miser in 6.5 (miser API a future). Checkpoint/Restart: CPR 1.0 in Irix 6.4. CPR 1.1 w/6.4 update (pthreads and fixes). CPR 1.2 w/6.5 (shmem). Resource limits (udb-like capabilities). Accounting 1H99 Cray-style project ids & reporting (csa). Enhancements to Array Services (Irix clusters).

NQE: Daryl Coulthart dbc@cray.com : "NQE will be stabilized at current release (3.3)" Actively pursuing a partner (need to define scope). Cray did NQE because they had to. "NQE is mature". Alternatives now: Codine, PVS, LSF, ...

Performance Tips for GigaRing Disk IO Kent Koeninger, SGI/Cray

IPN: Limit daisy chaining to a depth of 2. Striping and chaining reduces performance. JBOD gets ~50% of peak (recommend use of RAID). DA-302 35mb/s sustained.

FCN: 240 mb/s read 160mb/s write, RAID (set of 5). Sustained 48ish read, 32ish write.

BOF: T3E (Jim Grindle, SGI)

  • Completed all the engineering development work (i.e., "Mature").
  • Last was support for 600 MHz: T3E/1200.
  • new router chip (increased bandwidth on interconnect).
  • shipped a 1024 PE system with 512mb/PE to US Govt.
  • 6 orders for T3E/1200.
  • surprised at success of T3E: $500m of orders vs. $150m on T3D.
  • planning on making T3E's another 18ish months.
  • 5 or 6 systems exist of 768 PE's or greater.
  • Don't plan on building another Alpha based machine.
  • Smaller T3E sites may be interested in SN1, larger in SN2.
  • If 750mhz comes out on Alpha, may implement.
  • Still working on (looking at) dynamic remapping of PE's.
  • Working on boot speed (parallelizing file checks is primary focus).
  • Looking at user exit to load balancer.

Wednesday

Cray Networking Update Michael Langer, SGI/Cray mlanger@cray.com

New features and futures:

  • host-to-host TCP/IP (e.g., on gigaring... avoids ION & Rings). J90->T3E 50mb/s (HIPPI 30ish mb/s). Need 2.0.3 (5/98), need 2.0.4 T3E->T3E (11/98), need SWS-ION 3.0 (6/98), need Unicos 10.0.0.2 (5/98).
  • bulk data services (BDS) ported from IRIX: J90SE->Origin 24mb/s. Improvement over NFS3. Supported in 2.0.3 and 9.3+.
  • socket server assistant
  • snmp v2
  • unified name service
  • NIS+
  • Futures: gb ethernet (in 6.4, soon 6.5); ATM OC-12 (~9/98); HIPPI 6500/ST (~6/99 and ~12/99 for Scheduled Transfer); IPv6 (~6/99); Network Node Manager.

General Session

CUG Elections

Keynote Address: 25 Years of Computer Aided Engineering at Daimler Benz in Stuttgart, Germany Michael Heib, Manager HWW

Does IT for others: 50% split of public (University) vs. industry at HWW. Objectives: More power at same cost; less operating personnel; smaller infrastructure costs; ability to solve very large problems; bi-directional knowledge transfer between industry and University. Academic & Industry: 2 very different cultures, took time to get it together.

Cellular IRIX: Plans & Status Gabriel Broner, SGI/Cray broner@cray.com

Running in house now. Will be the resultant operating system for all (move from dual expertise to less duplication with more applications available). Support for large systems: 64 to 4000 CPU's: fault tolerance, reliability, high-end features like checkpoints and accounting. Support for server systems: 4-64 CPUS: general purpose workload; fault containment; different requirements from large.... constant availability. Architecture: scalability and fault containment.

IRIX Accounting Limits and UDB Functionality Jay McCauley, SGI/Cray mccauley@cray.com

Cray style accounting (CSA) & udb limits in early 1999.

Requirements: tools to manage large configurations; richer accounting facilities, udb for limits mechanism.

Architecture: New database subset of udb; Initialization via PAM module; kernel enforcement.

Features: Based on "job" container vs. individual process. Partial list: cpu time, memory, vm, file size, open files, #threads, core file size, ...

Provide data capture with basic reduction and reporting.

SIG: High Performance Computing

8 participants, most with performance interest. Many viewed performance here as application or algorithms oriented vs. capacity planning and data center management which may be covered by Group 1 (not clear). Lots of TBD's.

Reminder: Use qsub's "-l mpp_t" Option

The "-l mpp_t" option allows you to request a specific amount of time for your MPP job to run. You should request, as closely as possible without going under, the actual time your job needs, and not simply request the maximum possible time for a given queue. Realistic requests improve job scheduling by both NQS and by the real people who manage the system. (Yes, we look at the time requests!)

Smaller mpp_t requests have a chance of running sooner. At ARSC, this is especially true if the request is under 30 minutes (which puts it into one of the "Quick" queues, which have the highest priority).

For help, see "man qsub" or:

http://www.arsc.edu/support/howtos/usingnqs.html

For more on ARSC's checkpoint procedures and queue policy, see "news chkpnt_sched" and "news queue_policy" on yukon.

TARGET Follow-up

[ Thanks to the reader who sent this response to last week's article on "TARGET." ]

Regarding the TARGET environment variable, another possibility would be to use the following TARGET setting:

setenv TARGET cray-t3e,memsize=256M

See "man target".

Although your solution (TARGET=target) works for your machine (since all PEs contain identical amounts of memory), users of other T3Es, or your users if your T3E is eventually upgraded with PEs of differing size, may prefer this more general solution to allow them to specify which size PE to compile for.

Quick-Tip Q & A


A: {{ 
   
   In C, you don't need to specify the size of arrays at compile time
   (ie.  pointers are basically arrays).  So you could have a code
   fragment:

       double* x;
       double* y;

       for(i=0;i<SIZE;i++) {
               y[i] = alpha*x[i] + y[i];
       }

    How can you view C arrays in totalview? }}

  #   In the GUI version, double-click on the array you want to display.
  #   Then in the resulting data_object_window, double-click on "type"
  #   (or choose "Edit -> Type" from the menu). Then specify the type as
  #   <double>[SIZE]. The value of SIZE will probably need to be
  #   explicitly typed. For example, enter "<double>[100]" if SIZE ==
  #   100.


Q: A shell alias allows Unix users to create custom mnemonics and
  short-hands for commands or command strings. Two common aliases:


     alias ll='ls -lF'                 <korn shell syntax>

     alias mroe more                   <csh syntax>


  What's your favorite alias?


  (Send it in with a brief explanation. If you can't choose only one,
  send two--they're small.) 

[ Answers, questions, and tips graciously accepted. ]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.