ARSC T3E Users' Newsletter 167, May 4, 1999

Robert Numrich Talk at ARSC on Co-Array Fortran, 2pm May 6

Bob Numrich of SGI will be visiting ARSC on Thursday, May 6, and will give a presentation on Co-Array Fortran at 2pm in 204 Butrovich Building.

Bob, together with John Reid of the Rutherford Appleton Laboratory, designed Co-Array Fortran. This simple syntactic extension to Fortran 95 converts it into a parallel language. It is implemented in the CF90 compiler for the T3E, and has been available to ARSC users since last September.

A full description of Co-Array syntax appeared in Numrich and Reid (1998), Co-Array Fortran for Parallel Programming, ACM Fortran Forum, volume 17, no. 2, pp. 1-31.

The paper is available in PostScript at:

ftp://matisa.cc.rl.ac.uk/pub/reports/nrRAL98060.ps.gz

Co-Array Fortran has appeared frequently in recent issues of this newsletter. More information is also available at:

http://www.co-array.org/

Review of 1999 PTools Annual Meeting

[ Written by Tom Baring. ]

The Parallel Tools Consortium (PTools) 1999 Annual Meeting took place at NCAR, April 19-21.

It brought together about 80 people from the vendor, user, and academic communities. The mix of perspectives sparked lively, informative discussion.

For instance, on the issue of open source, some lauded the power of the Internet community to generate code and rapidly fix bugs. Others observed that a gatekeeper is essential to verify and integrate changes and to prevent feature bloat. Still others stated that software engineers must get paid for their efforts--and observed that the pool of parallel tools users/developers may be too small for the open source model to work (as it has for Linux).

Another example: on the issue of common tools across multiple vendor platforms, most agreed that users are more likely to use tools if they're consistent across all platforms. Some observed that this model is already working (witness TotalView, VAMPIR, and various standards). Others noted that the survival of vendors, and the survival of software divisions within companies, is often tied to differentiating themselves--not to duplicating each other's products. (But, someone asked, does such differentiation apply to secondary products like parallel tools, or only to hardware?)

Such dialog helps improve understanding throughout the community. It helps PTools realize its mission: "to take a leadership role in defining, developing, and promoting parallel tools that meet the specific requirements of users who develop scalable applications on a variety of platforms."

The PTools web site is:

http://www.ptools.org/

What follows are some notes from the formal program. (They're not complete--just some points from most, but not all, of the talks. Feel free to send corrections/enhancements.)

European freeware tools
Maria Carla Calzarossa (University of Pavia)

The speaker introduced projects and organizations active in Europe. They included:

GUARD - A Portable Debugger for Portable Applications
David Abramson (Monash University)

GUARD is a tool for "relative debugging." This compares data between two executing programs. It was devised to aid the testing and debugging of programs that are either modified in some way, or are ported to other computer platforms.

See:

http://www.dgs.monash.edu.au/research/guard
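The idea of relative debugging can be illustrated with a minimal sketch. This is not GUARD's interface. The function name `compare_snapshots`, the snapshot dictionaries, and the tolerance values are all assumptions made up for illustration. The sketch compares named arrays captured at the same point in two versions of a program and reports where they diverge.

```python
import math


def compare_snapshots(reference, candidate, rel_tol=1e-9, abs_tol=0.0):
    """Return (name, index, ref_value, cand_value) for every differing value.

    A missing array or a length mismatch is reported with index None.
    """
    mismatches = []
    for name in sorted(reference):
        ref_values = reference[name]
        cand_values = candidate.get(name)
        if cand_values is None or len(cand_values) != len(ref_values):
            mismatches.append((name, None, ref_values, cand_values))
            continue
        for i, (r, c) in enumerate(zip(ref_values, cand_values)):
            if not math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol):
                mismatches.append((name, i, r, c))
    return mismatches


# Hypothetical snapshots from the original code and a ported version.
original = {"pressure": [1.0, 2.0, 3.0], "velocity": [0.5, 0.25]}
ported = {"pressure": [1.0, 2.0, 3.0000001], "velocity": [0.5, 0.25]}

for name, index, ref, cand in compare_snapshots(original, ported):
    print(f"{name}[{index}]: reference={ref} candidate={cand}")
```

Loosening `rel_tol` is the usual way to ignore differences that come only from floating point ordering on another platform.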

User Perspective: Portability at DOD
Dave Rhodes

The speaker described his experience using VAMPIR to successively (and successfully) improve a particular algorithm and implementation for task distribution.

He commented that "embarrassingly parallel" problems are not always good candidates for MPP platforms, as they might be solved more cheaply on workstation clusters. Medium-grained problems generally make better use of expensive HPC resources.

Tutorial: Performance Analysis and Compiler Optimizations
Phil Mucci (University of Tennessee and Sentient Research), Kevin London (University of Tennessee)

The speakers recommended compiler optimization flags for C/C++ and Fortran 90 on the SP2, Origin2000, and T3E, and discussed performance tools. For the T3E, this meant Apprentice and PAT.

Performance issues for Fortran 90 (on any platform):

  • F90 code generally runs slower than equivalent F77 code.
  • Don't use F90 features in compute kernels.
  • F90 array syntax hurts performance.
  • WHERE statement is bad (hides a conditional).
  • CSHIFT intrinsic is bad (hides a branch).
  • Derived types are good when used correctly (improve spatial locality of related data).

MPI issues:

  • Use contiguous data structures or MPI_TYPE_STRUCT.
  • Don't use PACK/UNPACK.
  • Post receives before sends.
  • Send BIG messages.
  • Use MPI_[I]rsend (ready send) if possible.

For more, see:

http://www.cs.utk.edu/~mucci/MPPopt.html

SLOGging Towards Teraflops: A Strategy and Library to enable Jumpshot and Other Tools to Cope with Gigabyte Event Log files
Ewing "Rusty" Lusk (Argonne National Laboratory)

"SLOG" is "Scalable LOGging". The structure of SLOG trace files enables the viewer, Jumpshot, to produce various overviews of the entire volume of data. Users will be presented with greater detail as they zoom in, rather than seeing an expanded view of the same level of detail.

Jumpshot is written in Java.
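The zoom behavior described above can be sketched in a few lines. This is neither the SLOG file format nor Jumpshot's algorithm. The `summarize` function and the sample events are assumptions for illustration only. The point is that the viewer always draws a fixed number of bins, so narrowing the time window yields finer detail rather than a magnified copy of the coarse view.

```python
from collections import Counter


def summarize(events, start, end, bins):
    """Count events per state in `bins` equal-width bins covering [start, end)."""
    width = (end - start) / bins
    summary = [Counter() for _ in range(bins)]
    for timestamp, state in events:
        if start <= timestamp < end:
            index = min(int((timestamp - start) / width), bins - 1)
            summary[index][state] += 1
    return summary


# Hypothetical trace records: (timestamp in seconds, state name).
events = [(0.1, "send"), (0.2, "recv"), (0.9, "send"), (1.5, "compute"),
          (2.2, "send"), (2.3, "send"), (3.7, "recv")]

overview = summarize(events, 0.0, 4.0, bins=2)  # whole run, 2-second bins
zoomed = summarize(events, 0.0, 1.0, bins=2)    # first second, 0.5-second bins

for label, view in (("overview", overview), ("zoomed", zoomed)):
    print(label, [dict(counter) for counter in view])
```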

Scalability Issues in Tracing and Visualizing Large MPI Application
Shirley Browne (University of Tennessee)

The current method for controlling the size of trace-based MPI tool output is to instrument the code. The user can turn tracing off/on for particular processes or code sections.

The ARSC tutorial shows VAMPIR users how to do this:

http://www.arsc.edu/support/howtos/usingvampir.html#adv_limit_data
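As a language-neutral illustration of the on/off approach, here is a minimal sketch. The `Tracer` class and its `on`, `off`, and `region` methods are hypothetical and are not the VAMPIR interface. The sketch records timings only for regions entered while tracing is enabled, which keeps a hot inner loop from flooding the trace.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Hypothetical tracer that can be switched off around noisy code sections."""

    def __init__(self):
        self.enabled = True
        self.records = []  # list of (region name, elapsed seconds)

    def off(self):
        self.enabled = False

    def on(self):
        self.enabled = True

    @contextmanager
    def region(self, name):
        active = self.enabled  # decide at entry whether to record this region
        start = time.perf_counter()
        try:
            yield
        finally:
            if active:
                self.records.append((name, time.perf_counter() - start))


tracer = Tracer()

with tracer.region("setup"):
    sum(range(1000))

tracer.off()  # the inner loop would otherwise add 10000 records
for step in range(10000):
    with tracer.region("inner_loop"):
        pass
tracer.on()

with tracer.region("final_reduce"):
    sum(range(1000))

print([name for name, _ in tracer.records])  # prints ['setup', 'final_reduce']
```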

Goals for the future:

  • Control tracing at runtime.
  • Automatically relate traced events to code segments without instrumenting the code (as VAMPIR does on a limited number of hosts, not including the T3E).
  • Visualize traced data in a scalable manner.

User Perspective: Robustness/Verification and Visualization
Mike Frese (NumerEx)

The speaker raised several issues:

  • Comparing model output with known analytic or experimental results is needed to verify the model, but, in real life, is often a difficult, frustrating experience.
  • Save all test results, codes, compiler version information, etc.
  • Good features of a debugger: checkpoint/restart within the debugger, macro execution capability, improved debugging of optimized code.

LLNL tool for managing model data
Celeste Matarazzo (Lawrence Livermore National Laboratory)

The speaker described an ASCI project to help deal with the crush of archived data. The project has two thrusts: 1) dealing with existing data, and 2) better managing future archives.

Old (existing) data:

  • A tool snoops through existing data archives.
  • "Meta-data", or data about the data, is extracted and organized.
  • Users may garner more from the new information than from the old filenames and notes (if they even exist).
  • The hope is that multiple terabytes of old data will become useful (or will become obviously safe to delete).
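The "snoop and extract" step above might look something like the following sketch. It is not the ASCI tool, which is written in Java. The `catalog` function, the chosen fields, and the demo file name are assumptions for illustration. It walks a directory tree and records basic meta-data for each file.

```python
import os
import tempfile
import time


def catalog(root):
    """Return one meta-data record per file found under `root`, sorted by path."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            info = os.stat(path)
            entries.append({
                "path": os.path.relpath(path, root),
                "bytes": info.st_size,
                "modified": time.strftime("%Y-%m-%d", time.gmtime(info.st_mtime)),
                "extension": os.path.splitext(filename)[1].lower(),
            })
    return sorted(entries, key=lambda entry: entry["path"])


# Demonstrate on a throwaway directory holding one hypothetical output file.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "run01.DAT"), "w") as handle:
        handle.write("x" * 42)
    demo = catalog(root)

print(demo[0]["path"], demo[0]["bytes"], demo[0]["extension"])
```

A real tool would also look inside the files (headers, variable names, grid sizes), which is where most of the useful meta-data lives.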

New data:

  • A tool helps users create meta-data for files when they're initially archived. This may include thumbnail images of graphical data.
  • Given this useful, possibly visual, database of information about the archives, they are more likely to be used later.

The tools are written in Java so users need no additional software. See:

http://www.ca.sandia.gov/asci-sdm

Overview of PTools Projects

  • PAPI -- Phil Mucci (University of Tennessee and Sentient Research)
  • MUTT -- Jack Horner (SAIC)
  • HPDF -- Kevin London (University of Tennessee)
  • DPCL -- Judy Ingles (IBM)

See:

http://www.ptools.org/projects.html

SGI NT Workstation Debut at UAF

On May 6, SGI will demo its new line of Windows NT workstations in 109 Butrovich. You may register to attend at:

http://www.sginw.com/register.html

Announcements

1999 Arctic Science Conference. Call for Abstracts, see:

http://www.cgc.uaf.edu/

1999 CUG. Final program is available. Note new SV1 tutorial. See:

http://www.cug.org/

IEEE CS Task Force on Cluster Computing

New IEEE group. From the web site intro:


> The TFCC will act as an international forum to promote cluster
> computing research and education. It will also participate in helping
> to set up and promote technical standards in this area.  The Task Force
> will be concerned with issues related to the design, analysis,
> development and implementation of cluster-based systems. Of particular
> interest will be cluster hardware technologies, distributed
> environments, application tools and utilities, as well as the
> development and optimisation of cluster-based applications.
> 
> The TFCC will sponsor professional meetings, publish newsletters and
> other documents, set guidelines for educational programs, as well as
> help co-ordinate academic, funding agency, and industry activities in
> the above areas. The TFCC plans to organize an annual conference and
> hold a number of workshops that would span the range of activities
> sponsored by the Task Force. In addition, a bi-annual newsletter would
> be published to help IEEE/Computer Society members keep abreast of the
> events occurring within this field.
> 
>   
http://www.dcs.port.ac.uk/~mab/tfcc/

Quick-Tip Q & A


A:{{ The following is legal input to a Unix command:

      [la1+dsa*pla10>y]sy
      0sa1
      lyx

    What's the command, and what's the result from this input?  (Hint:
    the command is an easily mistyped anagram of another, extremely
    popular, command.) }}


    #  Tom Parker of NCAR got this one, and even noticed we'd swiped
    #  the example from the man page. The command is:
    # 
    #    dc
    # 
    #  "dc" is for "desk calculator." It's an arbitrary precision
    #  reverse Polish notation calculator.  Here's the example, which
    #  computes the first 10 factorials:
    #   
    #  $ dc
    #    [la1+dsa*pla10>y]sy
    #    0sa1
    #    lyx
    #  1
    #  2
    #  6
    #  24
    #  120
    #  720
    #  5040
    #  40320
    #  362880
    #  3628800
    #  
    #  
    #  The next example prints 1000 (base 10) in bases 2, 4, 8, 10,
    #  16, and (for grins) 19 and 32:
    #  
    #  $ dc
    #  1000 2op 4op 8op 10op 16op 19op 32op
    #  1111101000
    #  33220
    #  1750
    #  1000
    #  3E8
    #  2EC
    #  V8
    #  
    #  
    #  This example prints the result of 355/113 with 60 digits precision:
    #  
    #  $ dc
    #  60k
    #  355
    #  113
    #  /
    #  p
    #  3.141592920353982300884955752212389380530973451327433628318584
    #  
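For readers who want to cross-check the three dc results without a dc handy, here is a short Python sketch. The helper `to_base` and the digit alphabet (0-9, then A-V) are assumptions chosen to match the output shown above. Other dc versions may print digits above 15 differently.

```python
from decimal import Decimal, getcontext
from math import factorial

# Digit alphabet for bases up to 32, assumed to match the dc output above.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUV"


def to_base(n, base):
    """Render a non-negative integer in the given base (2 to 32)."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


# Example 1: the first 10 factorials.
factorials = [factorial(n) for n in range(1, 11)]

# Example 2: 1000 (base 10) in several bases.
bases = [to_base(1000, base) for base in (2, 4, 8, 10, 16, 19, 32)]

# Example 3: 355/113 to 60 decimal places. dc truncates, so compute extra
# digits and cut the string: "3" + "." + 60 digits = 62 characters.
getcontext().prec = 70
quotient = str(Decimal(355) / Decimal(113))[:62]

print(factorials)
print(bases)
print(quotient)
```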



Q: You're not sure if you compiled with Apprentice, PAT, or VAMPIR
   enabled in your current executable. How can you find out?

[ Answers, questions, and tips graciously accepted. ]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.