ARSC T3D Users' Newsletter 108, October 11, 1996

No Newsletter Next Week

[ But the following week: news from the Charlotte CUG. ]


Including a call to "shmem_set_cache_inv()" at the beginning of your program makes shmem_put() safe, at least as far as cache coherence is concerned. Given that shmem_put() is about three times faster than shmem_get() (see Newsletter #56), switching to puts could be a nice optimization if you've been nervous about using them.

We've discussed shmem_set_cache_inv() before in the Newsletter (see #59), but a reader suggested I cover it again, given its importance.

First off, what does it do? It keeps the data in cache consistent with the data in memory, even if a shmem_put() operation from another PE comes along and changes the data in memory. As an example, here's a test program given to me by Jason Wynne Parsons, a research assistant at ARSC:

*    Here is a really short example which shows the result of using
*    shmem_set_cache_inv().  You could also use shmem_udcflush() after
*    the second pvm_barrier() call to obtain the same results.  Try
*    running this with and w/o the cache control fxn and you'll see the
*    results immediately.  I tested the code with 8 PEs.

#include <stdlib.h>
#include <stdio.h>
#include <mpp/shmem.h>
#include <pvm3.h>

int main()
{ int num_PES = pvm_gsize(0);
  int my_PE = pvm_get_PE(pvm_mytid());
  long var;
  int pe;

  /* Include for shmem_put to work correctly */
  /*  shmem_set_cache_inv();  */

  pvm_barrier( NULL, num_PES );
  if( my_PE == 0 )
  { var = num_PES;
    /* PE 0 writes its value of var into var on every other PE */
    for( pe = 1; pe < num_PES; pe++ )
      shmem_put( (long *)&var, (long *)&var, 1, pe );
  }
  pvm_barrier( NULL, num_PES );
  printf( "PE_%d: var = %ld\n", my_PE, var );

  return( 0 );
}

Here is the output from two runs. The first run has the shmem_set_cache_inv() call commented out, as above; the second has it active:

 denali$ cache_inv.OFF
  PE_2: var = 8
  PE_1: var = 416611824824
  PE_4: var = 416611824824
  PE_5: var = 416611824824
  PE_6: var = 416611824824
  PE_7: var = 416611824824
  PE_0: var = 8
  PE_3: var = 416611824824
 denali$ cache_inv.ON
  PE_0: var = 8
  PE_1: var = 8
  PE_2: var = 8
  PE_4: var = 8
  PE_5: var = 8
  PE_6: var = 8
  PE_3: var = 8
  PE_7: var = 8

Documentation on shmem_set_cache_inv() is available via man and docview. Also, as mentioned in earlier issues of the Newsletter, CRI has written T3D optimization papers which are available from the ARSC anonymous FTP server. Here's one which covers shmem_put() and shmem_set_cache_inv():

    File Name:  shmem.ascii
    FTP site: 
    Directory:  pub/mpp/docs/

    An Excerpt:

  The Get-It-Right rules:

    (2) Cache coherency on CRAY T3D remote processors can only be
        guaranteed when the cache invalidate filter is set on the
        receiving PEs by calling the shmem_set_cache_inv() or
        shmem_set_cache_line_inv(...) routines prior to data transfer
        or when full or partial cache flushing is invoked by
        shmem_udcflush() or shmem_udcflush_line(...) before attempting
        to use any data received.

  The performance rules:

    (1) When possible, use puts rather than gets.  The put routines are
        asynchronous and allow continuation of work by the calling
        routine.  The get routines stall the calling PE until the
        requested data is in local memory.

Memory Requests in NQS

[ Jayashree Harikuma of ARSC provided this. ]

Here's a sample of output from "qstat -a":

 ------------- ------- ----- ----------------- ---- ---- ------ ------ ---
 5815.denali   R01d01D ZZZ   m_64pe_24h@denali       707   4096  10800 Qge

REQMEM shows the per-request memory limit for a request awaiting execution, the current memory usage (expressed in kilowords) for an executing request, or ** for a request with unlimited per-request memory.

The REQMEM column in the qstat output is the memory requested on the Y-MP. If you want more memory on the T3D, ask for more PEs. When NQS sees that you have asked for a large amount of memory, it assumes the memory is for the Y-MP and takes the load on the Y-MP into consideration. For example, a user who wants about 16 Mw of memory on the T3D should ask for 2 PEs. A user who simply specifies 16 Mw of memory will be treated as wanting it on the Y-MP, and NQS will prioritise the job accordingly.

Job Openings

ARSC is looking to fill these positions:

  • Director of User Services
  • HPC Vector Specialist
  • Visualization Specialist
  • HPC Systems Programmer/Analyst IV

For details, visit:

ARSC's jobs site

Also, one of our subscribers, George Delic of NESC, asked me to put in this announcement. I've shortened it substantially:

 ----------- Job Opening ------------------------------------------------
 Position Title:   Scientific Applications Consultant
 Location: Lockheed Martin Services Group, Bay City, Michigan
 Desired Qualifications:
        Education:  MS degree
        Hardware:   Cray (C90, T3D) and UNIX workstations.
        Languages:  FORTRAN and C.
        Software Applications:   Fluid Dynamics.
 Job Responsibilities:
    Serves as a user consultant to EPA and scientific research users of
    the National Environmental Supercomputing Center (NESC).  Acts as a
    primary contact person for the NESC scientific users.
For the complete job announcement, or if you have any other questions,
contact:

 George Delic, Ph.D.,
 Snr. Scientific Computing Consultant,
 NESC, Lockheed Martin,
 Technical Services, Inc.,
 135 Washington Avenue,
 Bay City, MI 48708-5845
    Tel : 517-894-7662
    Fax : 517-894-7676

Quick-Tip Q & A

A: # Thanks to Barbara Herron of LLNL who provided this tip:  

   {{ How do you figure out what version of a library your code has been
      loaded with, or what version of a library you are using? }}

   what a.out          # Prints SCCS version info for all 
                       # library routines linked. You can pipe 
                       # the output to grep, etc...  This could be
                       # really handy, especially if you found that
                       # there was a bug in some library routine, 
                       # and you wondered if you had used that version.

Q: What mode should you give a directory that you own, so that members
   of your unix permission group can create files in it, edit their own
   and other members' files, remove and rename their own files, but not
   remove or rename files owned by anyone else?

[ Answers, questions, and tips graciously accepted. ]

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.