ARSC HPC Users' Newsletter 354, January 12, 2007

Fun with NCO

[[ Thanks to Kate Hedstrom of ARSC for this article! ]]

NCO is the NetCDF Operators package from http://nco.sourceforge.net/ . It is a collection of executables for manipulating NetCDF files from the command line or from shell scripts. I've mentioned it before, but I had an opportunity to learn more about it today and thought I would share it with you.

The situation is that I was creating a file from one model run to be used as the boundary condition in a nested domain. We have the Matlab scripts to do this, but I was running out of CPU time when running Matlab on all 500 input files. I thought I would get clever by running it on 100 at a time, then concatenating the results somehow when I was done. Imagine my surprise on realizing that the boundary condition file contained five different time variables, none of them the record variable. If all the fields had used the same time variable, and if that time had been a record variable, it would have been trivial to use ncrcat (the record concatenator).

I found a way to do the concatenation anyway, still using NCO rather than having to write a program to do the job in NCL or (horrors) Fortran. The job requires several steps:

  1. Extract all the fields using one of the time variables (say salt_time) from each of the five files.
  2. Append a record dimension to the new files.
  3. Switch dimensions so that salt_time becomes the new record dimension.
  4. Concatenate along the record dimension salt_time.
  5. Swap the dimensions back so that salt_time is no longer the record dimension.
  6. Get rid of the useless record dimension.
  7. Do the above for each of the five time variables, then glue everything together into one file.

The first three steps need to be done for each of the five files I started with, so they can be put into a loop. The field we're dealing with can also be made into a variable so that the script can be run once per field. Here is the script I came up with:


#!/bin/sh
# NCO script for concatenating boundary files along the various
# (non-record) time dimensions.

field=salt

for idx in 01 02 03 04 05; do
# extract relevant fields
    ncks -v ${field}_time,${field}_west,${field}_north,${field}_south RCCS_bry_${idx}.nc ${field}_${idx}.nc

# create record dimension
    ncecat -O ${field}_${idx}.nc rz_${idx}.nc

# switch ${field}_time to record dim
    ncpdq -O -a ${field}_time,record rz_${idx}.nc rz_${idx}.nc

# clean up
    rm ${field}_${idx}.nc
done

# concatenate the time records
ncrcat rz_??.nc ${field}.nc

# clean up
rm rz_??.nc

# swap the dimensions back
ncpdq -O -a record,${field}_time ${field}.nc ${field}.nc

# get rid of the unused record dimension
ncwa -O -a record ${field}.nc ${field}.nc

This script creates the file salt.nc when $field=salt, doing the first six steps above. It is then run again for the other values of $field (the velocities have a similar but different script). Once you have the five files containing the entire timeseries, paste them together with ncks:


mv salt.nc boundary.nc            # rename the file
ncks -A temp.nc boundary.nc       # append the temperature fields
ncks -A v2d.nc boundary.nc        # append the 2-d velocities
ncks -A v3d.nc boundary.nc        # append the 3-d velocities
ncks -A zeta.nc boundary.nc       # append the surface elevations
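
#
# Editor's Note: The append steps can also be wrapped in a loop that
# stops at the first failure, so a half-built boundary.nc does not go
# unnoticed. This is only a sketch: the function name append_all and
# the NCKS override variable are hypothetical conveniences, not part
# of NCO. Only "ncks -A" itself comes from the commands shown here.
#

```shell
# append_all OUT FILE...: append each FILE to OUT with "ncks -A",
# stopping at the first failure.
# NCKS is a hypothetical override; set NCKS="echo ncks" to print the
# commands instead of running them.
append_all() {
    out=$1
    shift
    for f in "$@"; do
        ${NCKS:-ncks} -A "$f" "$out" || {
            echo "append of $f to $out failed" >&2
            return 1
        }
    done
}

# Usage, after "mv salt.nc boundary.nc":
#   append_all boundary.nc temp.nc v2d.nc v3d.nc zeta.nc
```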

This just gives you a taste of what's possible with NCO. There is also an attribute editor (ncatted), tools for averaging across records or other dimensions (ncra, ncea, ncwa), and a renamer (ncrename). If you need to modify NetCDF files, it pays to be aware of what NCO can offer you.

P.S. Getting NCO to compile on a Mac requires undefining POSIX_SOURCE .


#
# Editor's Note: NCO is available on iceberg, iceflyer and nelchina.
#

Nelchina Operating System and Software Upgrades

On Wednesday, January 17th, nelchina will be upgraded to the latest version of the Cray Linux Operating System, version 1.4.1. At the same time, Lustre filesystem and queuing system upgrades will be performed. Here's a list of things to keep in mind for the upgrade:

  • PBS will be upgraded to version 7.1.4c. Simple PBS scripts should continue to work without change. If you experience problems with existing batch scripts, please contact the ARSC Help Desk.
  • The $WORKDIR and $DATADIR filesystems will undergo firmware upgrades. While this, ideally, should not affect existing data in either of these directories, please recall that neither of these directories is backed up. If you have important data in either of these directories, we recommend that it be copied to long-term storage (i.e. $ARCHIVE) prior to the upgrade.
  • If you have any binaries which were statically linked, Cray recommends that the programs be recompiled with the new libraries. In most cases your program won't be statically linked unless you explicitly compiled it that way.

In preparation for this upgrade, the default version of the PGI compilers has been updated to version 6.1.1. Version 6.0 will no longer be supported following the upgrade.

Information Assurance Training: Spear Phishing

[[ Thanks to Derek Bastille of ARSC for this article. ]]

In their never-ending quest to separate you from your hard-earned cash, criminals are turning to a new form of email social engineering. The technique is known as 'phishing' and involves trying to get you to log in to a web site that appears to belong to a legitimate business and then enter personal or financial information.

In addition to just harvesting your information, bogus phishing web sites will also frequently be infected with spyware, adware and a variety of other malicious programs. Thus, even if you do not actually enter any information into the site, you may still wind up with harmful software downloaded onto your system.

Phishers are becoming increasingly talented at faking both legitimate email and web sites -- frequently including bits and pieces of the real site to muddy things up. Thus, emails can appear quite genuine since they can include the actual company logos and links to the real site.

'Spear Phishing' is a new, even sneakier, technique that involves the sending of targeted email with information that is specific to the person targeted. For example, a phisher may come across a list of subscribers to Verizon and then tailor emails to these subscribers purporting to be from Verizon.

There are a few things that you can do to protect yourself from either Phishing or Spear Phishing. The first is to remember that legitimate businesses will never ask you for personal or account information in an email. They do not need this information to 'verify' your account.

Secondly, don't be fooled by the false sense of urgency that is often implied by a phishing email. Contact the company yourself using either the telephone or a known good email address to check your account status.

Never just click on a link in an email that asks you for information. Always either use a bookmark or type the address into your browser yourself. Some newer browsers will check to be sure that the site you are going to is not a known phishing host, but the only way to really be sure is to enter the URL yourself.

Lastly, when in doubt about the contents of an email, don't be shy about viewing the message as raw text. You can usually spot a phishing attempt because the URL in a link's 'href' attribute does not match the address shown in the visible link text.
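
#
# Editor's Note: As a rough aid for that check, the sketch below lists
# every link target in a message saved as HTML, so the targets can be
# compared by eye with the text shown in the email. The function name
# list_hrefs and the example addresses are hypothetical. It assumes a
# grep that supports the common (non-POSIX) -o option and only handles
# double-quoted href values. It is a quick aid, not a phishing detector.
#

```shell
# list_hrefs: read HTML on stdin, print each double-quoted href target
list_hrefs() {
    grep -io 'href="[^"]*"' | sed -e 's/^[^"]*"//' -e 's/"$//'
}

# Example: the visible text names one site, the href points at another.
list_hrefs <<'EOF'
<a href="http://phish.example/login">http://www.mybank.example/</a>
EOF
# prints: http://phish.example/login
```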

According to several US government sources, identity theft is one of the fastest growing types of crime and phishing is a common way of getting your identity information. Thus, it is important to be aware of the places that you visit on the internet and to use common sense when viewing and dealing with your email. Remember, if it seems too good to be true -- it probably is.

Quick-Tip Q & A



A:[[When writing "for" loops in ksh, I really hate typing a sequence
  [[of numbers in.
  [[
  [[ e.g.
  [[ for num in 1 2 3 4 5 6 7 8 9 10 11 12; do
  [[     # do something with $num   
  [[ done
  [[
  [[Is there a way to specify a range of values to iterate through
  [[in ksh?  
  [[
  [[In C it's so easy.
  [[
  [[ e.g.
  [[ for(ii=1;ii<=12;++ii) 
  [[ {
  [[ /* do something */
  [[ }
  [[
  [[ There's got to be a better way to do this in ksh!
  [[



#
# Thanks to Derek Bastille for this solution using "while"
#

num=1; while (( num < 13)); do
  # do something with num
  let num=num+1  # note, no spaces!
done



#
# Thanks to Rich Griswald for a solution using the "seq" command
# 

In zsh, you can use {n..m}, like this:

  for i in {1..10}; do echo $i; done

You can also use the seq command:

  for i in `seq 1 10`; do echo $i; done

You should be able to use something similar with ksh or other shells.

#
# Editor's Note: "seq" is available on ARSC Linux Systems, but 
# is not currently available on iceberg and iceflyer.
#
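
#
# Editor's Note: Where "seq" is missing, a small shell function can
# stand in for it. The sketch below assumes a shell with POSIX
# arithmetic expansion, $(( )). The name "range" is made up for this
# example and it covers only the two-argument form, "seq FIRST LAST".
#

```shell
# range FIRST LAST: print the integers FIRST..LAST, one per line
# (a hypothetical stand-in for "seq FIRST LAST")
range() {
    _i=$1
    while [ "$_i" -le "$2" ]; do
        echo "$_i"
        _i=$(($_i + 1))
    done
}

for i in `range 1 10`; do echo $i; done
```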


#
# Thanks to Kurt Carlson for this solution.
#

There are many ways to accomplish things in any shell; which is
better depends on both what you want to do and your style of coding.
To use ksh, knowing about 'integer' declarations (and typedef) will
help with many things:

#  integer I=0; while [ 3 -gt $I ]; do I=$I+1; echo I=$I; done
I=1
I=2
I=3

Note that on implementations which errantly link ksh to bash, 'integer'
will not work: bash (and the classic Bourne sh) do not understand it.
Note also that in pdksh (found on some Linux systems), unlike ATT-based
ksh, 'while' loops are implemented with a fork, so the value of $I will
remain 0 after the loop above completes. That is one of many subtle bugs
in pdksh; stick with an ATT-based or vendor-supported implementation
(avoid pdksh).  I suspect bash must have a means to declare an integer
variable, but having used ksh since before bash was bourned I have never
needed to look that up.
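
#
# Editor's Note: bash does have one. "typeset -i" (also spelled
# "declare -i" in bash) gives a variable the integer attribute, and
# ksh accepts "typeset -i" as well. Below is a minimal sketch of the
# same loop written that way, assuming bash or ksh. A strict POSIX sh
# such as dash has no typeset.
#

```shell
# Same loop as above, using typeset -i instead of ksh's 'integer'.
# With the integer attribute set, the assignment I=$I+1 is evaluated
# as arithmetic rather than stored as the string "0+1".
typeset -i I=0
while [ 3 -gt $I ]; do
    I=$I+1
    echo I=$I
done
```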


#
# Editor's solution using ksh 93 (or bash) 
# 

If you have a version of ksh which supports the 93 standard, you can
use this C-like syntax.  

for (( ii=1; ii <= 12; ii=ii+1 )); do 
    echo $ii; 
done

On iceflyer and iceberg, the Korn Shell 93 version is cleverly called
"ksh93" while "ksh" follows the Korn Shell 88 standard.



Q: Too many times, I've wanted to clean out a few files with a command
like this:

    % rm 200612*.dat old *.txt  junk*

but was typing too fast and entered something like this,
    % rm 200612*.dat old * txt  junk*
    
thus deleting everything.

I absolutely do not have the patience to make a habit of "rm -i". I'd
just like to disable "rm *".  Is there a way to do this?  Any other
great ideas to keep me from deleting everything with the flub of
a finger?

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.