ARSC HPC Users' Newsletter 243, April 12, 2002

Etnus Totalview 5.0 Installed on Icehawk

ARSC has just installed the Etnus Totalview 5.0 parallel debugger on its IBM SP Cluster, icehawk.

To use it, add this path:

    /usr/local/adm/pkg/flexlm/license.dat

to the settings of your LM_LICENSE_FILE environment variable. Recompile with the -g compiler option, and launch totalview against the resulting executable. E.g.,

    icehawk1$ totalview ./a.out

Totalview documentation is built in. Click on "Help".
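
As a minimal sketch of the license step for sh or ksh users, the helper below appends a path to LM_LICENSE_FILE only if it is not already listed. The function name add_license_file is made up for this illustration; the sketch relies on FlexLM reading the variable as a colon-separated list on Unix.

```shell
# Hypothetical helper (not part of the Totalview install): append a license
# file to LM_LICENSE_FILE, which is a colon-separated list on Unix.
add_license_file() {
    new_path=$1
    case ":${LM_LICENSE_FILE:-}:" in
        *":${new_path}:"*) ;;                       # already listed, nothing to do
        "::") LM_LICENSE_FILE=$new_path ;;          # variable was empty or unset
        *)    LM_LICENSE_FILE="${LM_LICENSE_FILE}:${new_path}" ;;
    esac
    export LM_LICENSE_FILE
}

# Path taken from the announcement above.
add_license_file /usr/local/adm/pkg/flexlm/license.dat
```

Placing the call in a login file such as .profile would make the setting persistent; csh users would need the equivalent setenv form instead.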

(When the bugs have been exterminated and you start production runs, be sure to recompile again WITHOUT the -g. -g disables optimizations and kills performance.)

 

Upcoming Visitors, Talks, and Events

Visitors:

ARSC is hosting several visitors in the next couple of weeks to participate in evaluation of large-scale storage systems. They include Robert Bell (Bureau of Meteorology / CSIRO), David McGee (NAVO MSRC), and Roy Campbell (ERDC MSRC).

The visitors will all be giving presentations open to the wider UAF community.

Talks:

Title: "Our NEC of the Woods: Oz, CSIRO, HPCCC and NEC"
Date: Thursday 25th April 2002
Speaker: Dr. Robert Bell

As soon as it's available, the full schedule of talks will be posted in the "Hot Topics" section of http://www.arsc.edu.

Events:
ARSC Faculty Camp
WOMPAT 2002 at ARSC/UAF, Aug 5-7
Student Employment Opportunities
ARSC Summer Tours

 

SV1 Craylib Problem Isolated

In issue #224, we noted this:
    > Unresolved issues in two SV1 user codes were cleared up recently when
    > the users switched back from the default craylibs version to craylibs
    > 3.3.0.2. It is suspected that this is an issue with the FFT routines,
    > but investigation is ongoing.
    >
    > If you feel the need to try this, you should use the command:
    >
    >     module switch craylib.3.3.0.2

Through a lot of difficult troubleshooting on one of the two user codes, Tom Logan of ARSC was able to narrow the problem down to CTRSM, a low-level BLAS routine for triangular solves. The code reaches CTRSM through a call to the LAPACK routine CHEGST. The problem showed up as a failure of the algorithm to converge when run on one CPU, while runs on multiple CPUs gave correct and repeatable results.

It turns out that Cray already had a problem report open on CTRSM; a fix is in testing and will be integrated in a future release of craylib. From the Cray SPR, a more precise statement of the problem: "For the argument N > 64 and odd, the libsci_sv1 version of CTRSM gives wrong answers."

For now, you can download the netlib Fortran source of ctrsm.f and add it to your compilation. ARSC users can contact us for this Fortran file. Linking your own ctrsm.o file will preempt the libsci routine of the same name and thus allow you to safely use everything else from the latest installation of craylib (which, on chilkoot, is release 3.5.0.1).
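
To judge whether a given problem size falls under the SPR's description, the condition "N > 64 and odd" can be written as a small shell test. This is only a sketch: the function name ctrsm_n_affected is invented here, and the commented compile line uses hypothetical file names and assumes the compiler is invoked as f90.

```shell
# Hypothetical helper: succeeds (exit status 0) when N matches the condition
# quoted from the Cray SPR, i.e. N > 64 and odd.
ctrsm_n_affected() {
    n=$1
    [ "$n" -gt 64 ] && [ $((n % 2)) -eq 1 ]
}

for n in 64 65 66 147; do
    if ctrsm_n_affected "$n"; then
        echo "N=$n: affected"
    else
        echo "N=$n: not affected"
    fi
done

# Workaround sketch (assumed compiler command, hypothetical file names):
# compiling ctrsm.f alongside your own source links your copy ahead of libsci.
#   f90 -o myprog myprog.f ctrsm.f
```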

As usual, please report problems and mysteries to us.

 

UAF Computational Physics Programs

The Sloan Foundation maintains a web page listing a number of Physics and Chemistry Masters programs:

http://www.sciencemasters.com/science.php

It now includes UAF's Computational Physics program.

 

Quick-Tip Q & A

Bonus! Two answers in one week!
 
A:[[ SV1 Totalview question from last issue... How to view entire
  [[ assumed-size arrays, like:
  [[      COMPLEX         Z( LDZ, * )

  # 
  # Thanks to Ed Anderson:
  # 

  To view the full array, dive on one of the variables, such as A.  The
  window shows "Type: COMPLEX(147,1)".  Click on COMPLEX(147,1) (or go
  to the Edit->Type menu), and change the type to COMPLEX(147,147).  The
  data object window should update automatically.  You might find it
  easier to view the array with the menu option Display->Array Browser.

  Unfortunately, the debugger doesn't remember this info when you close
  the data object window.

  # 
  # Editor's Note:
  # 
  # This works on the T3E, fails on the SV1.  It looks like a problem
  # with the SV1 totalview, but is under investigation.
  # 



A:[[ I've been connecting remotely to an SGI Octane2. I use the DISPLAY
  [[ environment variable to export the X Windows display back to my
  [[ personal workstation.
  [[
  [[ For some reason, when I sit down at this SGI and log onto the
  [[ console, the screen flashes, and it immediately logs me off. I'm
  [[ definitely not over-quota, my account is active, and everything works
  [[ perfectly when I connect remotely again.
  [[
  [[ Any ideas what's up?

  # 
  # Thanks to Richard Griswold:
  # 

  I had a similar problem when my home directory was shared between Linux
  and AIX.  Something about the .Xauthority file was different between the
  two OSes, so when I logged in on the console of one system, I had to
  delete the file before I could log in on the console of the other
  system and run X apps.
  
  You could check for this file and try deleting it.
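
  A cautious way to try this suggestion is to move the file aside instead of deleting it, so it can be put back if it turns out not to be the culprit. The sketch below is for sh or ksh; the function name set_aside_xauthority and the .bak suffix are choices made for this illustration.

```shell
# Hypothetical helper: rename ~/.Xauthority (or the one in the directory
# given as $1) to .Xauthority.bak, if it exists.
set_aside_xauthority() {
    home_dir=${1:-$HOME}
    auth_file="$home_dir/.Xauthority"
    if [ -f "$auth_file" ]; then
        mv "$auth_file" "$auth_file.bak" && echo "moved $auth_file to $auth_file.bak"
    else
        echo "no .Xauthority found in $home_dir"
    fi
}
```

  Run set_aside_xauthority with no arguments from a remote login, then try the console login again.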


  # 
  # Editor's answer:
  # 

  The specific incident hit an ARSC user because he had put this:

    setenv DISPLAY <HIS_PERSONAL_WORKSTATION>

  in his SGI .cshrc file. When logging directly onto the SGI at the
  console, the IRIX window manager objected to its display being sent
  elsewhere, and bailed.
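
  One way to avoid this kind of clash is to set DISPLAY in a startup file only when the session has not already been given one. The sketch below uses sh/ksh syntax rather than the csh of the incident, so a .cshrc would need the equivalent csh test. It assumes the console login has already set DISPLAY by the time the startup file is read, which the incident above does not confirm. The function name and the host workstation.example are placeholders.

```shell
# Hypothetical helper: leave an existing DISPLAY alone, otherwise set it to
# the value given as $1.
set_display_if_unset() {
    if [ -z "${DISPLAY:-}" ]; then
        DISPLAY=$1
        export DISPLAY
    fi
}

# Example call for a startup file (placeholder host name):
#   set_display_if_unset workstation.example:0
```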



Q: Am I going nuts? 

    $ ls -d DATA
      DATA
    $ ls DATA
      D2001  index.txt
    $ cd DATA
      sh-56 ksh: DATA: not found.
  
   Why is it doing this to me?

[[ Answers, Questions, and Tips Graciously Accepted ]]

 

 


Current Editors:
Donald Bahls ARSC User Consultant ph: 907-450-8674
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Contact:
Send comments and questions to the current editors using this Contact Form.

 

 

Arctic Region Supercomputing Center
PO Box 756020, Fairbanks, AK 99775 | voice: 907-450-8600