ARSC HPC Users' Newsletter 374, November 16, 2007

Reminder: No Holiday For Purged File Systems

All files over 30 days old in the $SCRATCH and $WORKDIR file systems at ARSC are purged nightly. So, before you take that month-long holiday, be sure you archive all files you'd like to keep.

For the purposes of purging (at ARSC), the age of a file is based on the time it was last *accessed*. To see the last access time for a file, use the command "ls -lu" ("ls -l" shows when the file was last modified). E.g.:


  % ls -l README
    -rw-------  1 baring staff 13866 2007-07-30 16:21 README
  % ls -lu README 
    -rw-------  1 baring staff 13866 2007-09-21 11:13 README
  % cat README > /dev/null
  % ls -l README          
    -rw-------  1 baring staff 13866 2007-07-30 16:21 README
  % ls -lu README         
    -rw-------  1 baring staff 13866 2007-11-14 14:47 README

ARSC provides a local tool, "getPurgable" [sic], which reports which of your files are eligible for purging and is easier to use than "ls." Run "getPurgable -h" to see how to use the tool.
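
If getPurgable is not at hand, a rough approximation can be had with the standard "find" command. This is only a sketch: list_old_access is a hypothetical helper name, and find's whole-day rounding may not match the purger's exact rules.

```shell
#!/bin/bash
# Sketch: list regular files under a directory by last-access age.
# list_old_access is a hypothetical helper, not ARSC's getPurgable.
list_old_access() {
    local dir="$1"
    local days="${2:-30}"     # default threshold: 30 days
    # -atime +N matches files last accessed more than N 24-hour periods
    # ago; find discards fractions, so +30 means at least 31 days.
    find "$dir" -type f -atime +"$days" -print
}
```

For example, "list_old_access $SCRATCH 30" prints candidate files one per line. Running find does not itself update the access times of the files it reports.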

$SCRATCH, $WORKDIR, and /datadir Are Not Backed Up:

Regardless of the age of your files, please remember that $SCRATCH and $WORKDIR are not only purged, but never backed up. Thus, you should regularly archive all important files on these file systems.

Read "news storage" for more information, and, if needed, contact User Support,

http://www.arsc.edu/support/support.html

for help.

Modern LaTeX

[ By: Kate Hedstrom ]

Some time ago, I wrote a user's guide in LaTeX, using it because it is free software and because it makes beautiful equations. Since that time, most of my friends have moved on to other software, claiming ease of use, but the equations in those other packages still don't measure up. Besides, those of us who program in Fortran for high performance aren't afraid of a compilation phase.

Anyway, the time has come to update that user's guide. Last time, the funding agency wanted 100 paper copies of the document, so I was free to use anything. This time they want a pdf file. There was a time when creating a pdf file that was viewable on a PC was quite tricky from LaTeX. Also, I'd like to be able to include graphics other than encapsulated postscript (eps) images in the thing. Is it time to modernize with some whole new software or can I get away with using LaTeX again? Another reason to prefer LaTeX is that we're creating a documentation wiki for the ocean model and the wiki can display LaTeX equations directly, generating a png image for each. Equations I write in the manual can be placed on the wiki with a simple cut and paste operation and still look great.

A little Google searching turned up the pdflatex command. Rather than running latex to convert your source to a dvi file, then viewing that with xdvi or converting it to postscript with dvips, you run pdflatex on your source to create a pdf file directly. As an added bonus, it can directly handle images in png, jpeg, or pdf. For those old eps images lying around, there is an epstopdf command.

So, how about a little example? I've got an eps image that I've converted to pdf with:


   % epstopdf mask.eps

so that I now have mask.pdf. The size of it is 4.5 by 4 inches. An example LaTeX file that can be used to display it is:

\documentclass[11pt]{article}
\usepackage{graphicx}
\begin{document}

\begin{figure}[p]
  \setlength{\unitlength}{.25in}%
  \begin{picture}(18,16)(-3,0)%
    \includegraphics{mask.pdf}%
  \end{picture}%
  \caption{Small grid with masked regions}
  \label{fmask}
\end{figure}

\end{document}

Note that I'm using the graphicx package, which knows about pdf, png, and jpeg images. It can also handle eps images if you will be using the traditional mode of generating a dvi file.

Also note that I've got a picture environment nested inside the figure environment. The first argument to the picture environment tells latex how big the picture is, here in units of quarter inches as set by \unitlength: 18 by 16 units is 4.5 by 4 inches, matching the image. The second, optional argument tells latex where the origin is, which can be used to center the thing (I use trial and error, plus a ruler for that).

To get a pdf file from this, simply type:


   % pdflatex test.tex

which creates a test.pdf file. One way to view it on the screen is with evince.
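
The two steps above (epstopdf for the figures, then pdflatex for the document) can be wrapped in a small script. The following is a minimal sketch, assuming bash and that both commands are on your path; build_doc is a hypothetical name, not a standard tool.

```shell
#!/bin/bash
# Sketch: convert eps figures in the current directory to pdf as needed,
# then run pdflatex. build_doc is a hypothetical helper name.
build_doc() {
    local tex="$1"            # e.g. test.tex
    local eps pdf
    for eps in *.eps; do
        [ -e "$eps" ] || continue          # no eps files at all
        pdf="${eps%.eps}.pdf"
        # Convert only when the pdf is missing or older than the eps
        if [ ! -e "$pdf" ] || [ "$eps" -nt "$pdf" ]; then
            epstopdf "$eps" || return 1
        fi
    done
    pdflatex "$tex"
}
```

Typing "build_doc test.tex" in the directory holding test.tex and mask.eps would then regenerate mask.pdf only when mask.eps has changed.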

References:

http://amath.colorado.edu/documentation/LaTeX/reference/figures.html
http://www.artofproblemsolving.com/LaTeX/AoPS_L_PictMan.php

$SAMPLES_HOME - Sample Code Repository

ARSC has created Sample Code Repositories on midnight and iceberg. These are collections of frequently used procedures, routines, scripts, and codes intended to help users find the best practices for various situations. Users are encouraged to review the samples and are welcome to copy the sample code and scripts for their own use. On either system, you may type the command

  news samples_home

to see an index of the repository and learn more about it.

The $SAMPLES_HOME environment variable is set to the root of the sample directory. You can use that environment variable to get to the samples, e.g.:


  iceberg1 26% cd $SAMPLES_HOME
  iceberg1 27% ls
  INDEX.txt            applications         dataManagement       debugging
  jobSubmission        libraries            parallelEnvironment

You are invited to submit samples for consideration as new entries in the repository. While we would prefer complete samples, we will definitely consider less complete examples if that is what you have. The Sample Code Repository is very much a work in progress at this point in time, but with your suggestions we hope to make it a useful resource.

The Sample Code Repository was a recommendation of the HPCMP Baseline Configuration Project:

http://www.afrl.hpc.mil/consolidated/bc/index.php

Book Reviews, "bash Cookbook" and "Fonts & Encodings"

[ By: Lee Higbie ]

I would like to recommend two new books from O'Reilly. The common thread with these books is they are sufficiently well written that they are fun to browse, not just to use as references.

bash Cookbook: Solutions and Examples for bash Users
By: Carl Albing, JP Vossen, Cameron Newham
First Edition, May 2007
Pages: 622
$49.99 US (from oreilly.com)

  • The Cookbook uses some cutesy kitchen lingo near the beginning, but fortunately it was a small, tolerable dose. The descriptions of the Cookbook's recipes are clear and include explanations of how and why the more obscure parts of the scripts work as they do. The Cookbook covers many parts of bash and was a useful way to learn some of the features that I'd missed.

    Each of the hundreds of sections describes a "recipe" for solving a problem. Generally the book provides a solution, a discussion of the solution, and a few pertinent references such as man pages and other recipes.

    Typical of the sections is the one addressing the problem of searching compressed files. The book describes zgrep, zcat, and gzcat, which allow you to grep or display compressed files. In this case, ARSC doesn't support all of the utilities (gzcat is not on midnight or iceberg), but zcat will often display gzipped files. I hadn't known about these z* tools.

    I found only a few places where the authors missed solutions or gotchas.

    1. In the description of using the command history to save typing, they omit one technique I use frequently: open another window and use mouse copy/paste to copy commands or command string portions to the second window or from elsewhere in the current window.
    2. On midnight, a command typed with a leading blank is not retained in the history buffer. The authors do not mention this gotcha, although in many places they do note that different versions of bash behave differently.
    3. As with all technical books, I would like to see more non-standard terminology in the index. For example, someone coming from a non-unix background could benefit from index references for "type" and "print" redirecting them to "cat," "zcat," "more," and "less."
    4. The only major omission I noticed was a table of how various versions of bash differ, something along the lines of the Differing Features table in "Unix in a Nutshell."
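
The compressed-file recipe mentioned in the review can be tried with a few lines of bash. This is a sketch, not taken from the book: search_gz is a hypothetical helper name, and it falls back to "gzip -dc" piped into grep on systems where zgrep is not installed.

```shell
#!/bin/bash
# Sketch: search a gzipped file without unpacking it to disk.
# search_gz is a hypothetical helper name.
search_gz() {
    local pattern="$1"
    local file="$2"
    if command -v zgrep >/dev/null 2>&1; then
        zgrep "$pattern" "$file"
    else
        # gzip -dc writes the uncompressed data to standard output
        gzip -dc "$file" | grep -e "$pattern"
    fi
}
```

For example, "search_gz ERROR run1.out.gz" prints the matching lines of a hypothetical run1.out.gz and leaves the compressed file untouched.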

Fonts & Encodings
By: Yannis Haralambous
Translated by: P. Scott Horne
First Edition, September 2007
Pages: 1037
$59.99 US (from oreilly.com)

  • The continuing growth of Unicode, now up to over 100,000 glyphs, has fascinated me, so I enjoyed Haralambous's description of the history of Unicode, letters, and type. I also enjoyed the descriptions of rendering, where Haralambous describes the various forms of "hinting" that are used so that letters look nice.

    Typical of Haralambous's interesting style is his discussion of what constitutes a letter. The word "pectopah" is familiar to most Americans who have visited Russia. (Write it in all caps, then read it as though it is written in Cyrillic, which it actually is.) His point: the glyph H is identical in English, Greek, and Russian, even though the lower case letters differ, but the meaning of the character differs in each language.

    He has a nice example of the rendering problem. With a fixed pixel grid he shows how H would print as the size varies. The resulting letters are quite different even though the five examples only vary in size by about 15%. With the easiest rasterization, the crossbar and vertical strokes jump back and forth from one to two pixels.

    Two anomalies in the tome struck me. First is the use of original drawings, with annotations in French. I think many readers would like to see translations. Perhaps these could have been included in captions. Also, some of the old typefaces are so hard to read, I think it would help many of us to see transliterations (into a modern typeface) of the Latin, French and even English phrases, just as it would be interesting to transliterate Cyrillic and Greek in some places. Second is the division of the index into two parts. Most American readers, at least, are accustomed to a single inclusive index. The index is a weak part of most computer books. This separation does not help.

    Trivia from Fonts & Encodings:

    1. Unicode currently has 90,989 "letters." There are also numerals, punctuation, symbols, math symbols, and many more to take it to the "more than 100,000" cited above.
    2. It uses 21 bits, though there are many language-specific foldings to reduce character representations to a single byte.
    3. In principle, Unicode can never eliminate a character. All ancient written languages' alphabets or ideographs should be included. So far as I could see, Egyptian hieroglyphics, Mayan writing and Greek Linear A are not yet present. Haralambous does not mention Mayan writing. He does mention that hieroglyphics are under consideration for addition to Unicode and Greek Linear A will, presumably, be considered for addition (once it is deciphered).

--

Editor's note:

As an aside, it looks like O'Reilly book covers can't keep up with the falling US dollar. Glancing at the "Fonts" book, for example, we see the list price as:

  • $71.99 CAN
  • $59.99 US
Implying an exchange rate of,
  • $0.83 US to $1.00 CAN
while the current exchange rate (from finance.yahoo.com, on November 15th at 14:25) is actually:
  • $1.02 US to $1.00 CAN

So if our friends east of the border in Whitehorse, Yukon Territory, wanted to visit us in Fairbanks, they could pick this book up for effectively:

  • $58.81 CAN

Saving themselves $13.18--enough for 4 gallons of gas and a pickle!
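
For anyone who wants to check the arithmetic, here is a small sketch using awk. The prices and exchange rate are the ones quoted in this note; price_math is a hypothetical name.

```shell
#!/bin/bash
# Sketch: redo the exchange-rate arithmetic from the editor's note.
# price_math is a hypothetical name; all figures come from the note.
price_math() {
    awk -v us=59.99 -v can=71.99 -v rate=1.02 'BEGIN {
        # Rate implied by the two cover prices (US dollars per CAN dollar)
        printf "implied rate: %.2f US per CAN\n", us / can
        # US cover price converted at the current rate
        eff = us / rate
        printf "effective price: %.2f CAN\n", eff
        printf "savings: %.2f CAN\n", can - eff
    }'
}
price_math
```

The three lines printed are the implied rate (0.83), the effective price (58.81), and the savings (13.18).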

Santa Letters, Postmarked "North Pole"

Uncles, Aunts, Parents, Teachers, and other friends of kids: once again, your editors would be honored to play Santa's helper.

The town of North Pole, Alaska is a mere 15 miles from here. If you'd like a letter, postmarked "North Pole," delivered to someone, seal your letter in a stamped, addressed envelope, and instead of mailing it from your local post office, enclose it in a larger envelope and send this to us. On about December 12th we'll mail them from North Pole.

Send to:

Tom Baring and Don Bahls
Arctic Region Supercomputing Center
University of Alaska Fairbanks
P.O. Box 756020
Fairbanks, AK 99775-6020

Plan extra time for mail up and back from Alaska... If you post these to us on or before Dec. 5th, there should be no problem.

Quick-Tip Q & A


A:[[ My application creates a number of postscript files that I need
  [[ to convert to the png image format so I can put them on my webpage.
  [[ Currently I open each file in an image editor and save the file
  [[ to the new name.  This seems like a waste of my time!  Is there
  [[ a way to automate this process?


#
# Thanks to Greg Newby for this suggestion.
#
ps2png is part of the tth package:
  
http://hutchinson.belmont.ma.us/tth/


It has some prerequisites, but I tried it on a non-ARSC system and got
it working quickly.  It seemed to work fairly well (about as well as
other methods for getting images from .ps files).


#
# Ed Bueler suggested using the following technique.
#
One can use ghostscript; here for converting foo.eps into foo.png:
  gs -dSAFER -dBATCH -dNOPAUSE -dEPSCrop -sDEVICE=png16m -r300 \
           -sOutputFile=foo.png foo.eps
I have a makefile that includes a bunch of these kinds of conversions in
order to create a User's Manual in PDF format.  This page has lots of
advice on using ghostscript, including a description of the
flags/options used above:
  
http://pages.cs.wisc.edu/~ghost/doc/cvs/Use.htm
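
To apply the same gs command to a whole directory of eps files, a loop
along the following lines may help.  This is a sketch and not part of
the response above: eps_dir_to_png is a hypothetical helper name, and
the gs flags are exactly the ones quoted above.

```shell
#!/bin/bash
# Sketch: run the ghostscript conversion on every .eps file in a directory.
# eps_dir_to_png is a hypothetical helper name.
eps_dir_to_png() {
    local dir="${1:-.}"       # default: current directory
    local f
    for f in "$dir"/*.eps; do
        [ -e "$f" ] || continue   # nothing matched the pattern
        gs -dSAFER -dBATCH -dNOPAUSE -dEPSCrop -sDEVICE=png16m -r300 \
           -sOutputFile="${f%.eps}.png" "$f" || return 1
    done
}
```

Each foo.eps in the directory gets a foo.png written next to it.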



#
# Thanks to Rich Griswold for this suggestion.
#
The Netpbm "pstopnm" utility will convert postscript to pnm.  One file
will be created for each page in the postscript file (for example
running pstopnm on file.ps with three pages creates file001.ppm,
file002.ppm, and file003.ppm).  From there, you can use pnmtopng to
create the png files.  Here's a simple zsh script to convert a file:

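  #!/bin/zsh
  # Convert each page of each postscript file named on the command line.
  for i in "$@"; do
    pstopnm $i              # writes one .ppm file per page
    for j in $i:r*.ppm; do  # $i:r is the file name minus its extension
      pnmtopng $j > $j:r.png
    done
  done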


#
# Jed Brown, Sean Ziegeler, Martin Luethi, and Chris Swingley suggested
# using ImageMagick.  Here's Chris Swingley's response:
#
Indeed there is!  The ImageMagick suite of programs will do this task
very quickly and easily.  Something like this (in bash):

    $ for i in *.ps; do
         convert -density 150 "$i" "${i%.ps}.png"
      done

I've found that using the '-density [dpi]' option to convert yields the
best looking images when converting a vector format (like PostScript)
to raster (like PNG).  You'll need to tweak this value to get the
final image size you want.  The strange looking '${i%.ps}.png' snippet
replaces the ps at the end of each file name with png, which tells the
convert command what format you're looking for as output.

One other hint with shell loops like this: put 'echo' before the
command you're running ('echo convert . . .' in this case) the first
time you try the loop.  This gives you a chance to see how the shell
has interpreted your code, before actually running it.  If the result
looks like a set of valid commands, just run the command again, but
without the 'echo'.
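
The 'echo' hint can also be built into a small function, so the same
loop serves for both the dry run and the real run.  This is a sketch
under assumptions: ps_to_png and the DRYRUN variable are hypothetical
names, and '${f%.ps}' strips only the trailing .ps from each name.

```shell
#!/bin/bash
# Sketch: convert every .ps file in the current directory to .png.
# With DRYRUN=1 the commands are printed instead of being run.
# ps_to_png and DRYRUN are hypothetical names.
ps_to_png() {
    local run=""
    local f
    if [ "${DRYRUN:-0}" = 1 ]; then
        run=echo              # prefix each command with echo
    fi
    for f in *.ps; do
        [ -e "$f" ] || continue   # no .ps files here
        $run convert -density 150 "$f" "${f%.ps}.png"
    done
}
```

Run 'DRYRUN=1 ps_to_png' first to inspect the commands, then
'ps_to_png' to do the conversions.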


#
# And here's a second suggestion from Jed Brown.
#
For some things, I find the gimp does a better job so I have written
some scripts that do batch resizing and application of filters.  If you
don't know scheme, the learning curve might be steep with the gimp.





Q: [ Thanks to Derek Bastille for sending this. We've simplified 
     and restated the problem to protect the innocent...]

  I have a file which contains a long list of file names, one per line.
  Every file name takes the format of a time-stamp followed by a name,
  like this: YYYYMMDD.HHMM.<name>, and every file name is prefixed by
  a full Unix path.  Here's a sampling of the file:

     /wrkdir/gizmo/STREAM_BENCHMARK/20071010.2322.stdout.txt
     /archive/u1/uaf/gizmo/STREAM_BENCHMARK/CRAY-T3D/19940911.1023.stdout.txt
     /archive/u1/uaf/gizmo/STREAM_BENCHMARK/CRAY-T3D/19940911.1023.stdout.txt
     /u1/uaf/cranky/progs/nas/20050606.0730.run1.out
     /u1/uaf/cranky/progs/nas/20050606.0838.run2.out
     /scratch/stuff/20071106.1803.run1.out

  I need to sort the lines in this file, using the time-stamp as the
  sort key.  
  
  The character position of the time-stamp varies from line to line,
  as does the number of forward slash delimiters.  Is there a way to
  do this with the Unix "sort" command, or am I stuck writing a script?

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.