Several environment variables are defined on each ARSC system to provide convenient access to the available storage directories. We recommend you use these environment variables in batch scripts and on the command line to improve portability of your work within the ARSC environment. Because the variables have identical names and refer to the same types of storage on every system, it is possible to move scripts among ARSC systems with minimal changes by using the storage environment variables rather than explicit paths.
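As a minimal sketch of the portability point above, a job script can derive its run directory from the storage variables rather than from an explicit path. The function name run_root, the JOB_NAME variable, and the fallback to /tmp are assumptions made for illustration; they are not part of the ARSC environment.

```shell
#!/bin/sh
# Sketch of a portable job script preamble. run_root and JOB_NAME are
# hypothetical names; only $WORKDIR and $SCRATCH come from the ARSC environment.

# Prefer $WORKDIR, fall back to $SCRATCH, then to a temporary directory.
# The last fallback is an assumption so the sketch also runs off-site.
run_root() {
    if [ -n "${WORKDIR:-}" ]; then
        printf '%s\n' "$WORKDIR"
    elif [ -n "${SCRATCH:-}" ]; then
        printf '%s\n' "$SCRATCH"
    else
        printf '%s\n' "${TMPDIR:-/tmp}"
    fi
}

JOB_NAME=${JOB_NAME:-example_run}
RUN_DIR="$(run_root)/$JOB_NAME"

# Create the run directory and work from there.
mkdir -p "$RUN_DIR"
cd "$RUN_DIR" || exit 1
echo "running in $RUN_DIR"
```

Because no path is hard-coded, the same preamble should behave the same way on any system that defines the storage variables.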
home directories: $HOME
Purpose: Files in this directory are regularly backed up and never purged.
archive directories: $ARCHIVE_HOME
Purpose: Long-term storage for user files (e.g. source code, input/output data, etc.).
archive host: $ARCHIVE_HOST
Purpose: Name of the server hosting the archive filesystem for the system you are logged into.
work directories: $WORKDIR
Purpose: Local workspace for data needed during a run, visible from all nodes of a platform.
scratch directories: $SCRATCH
Purpose: (single-node machines) Same as $WORKDIR. Files in this directory are not backed up and are subject to purging after 30 days.
Home directories are intended to contain only files used for customizing your basic working environment. Your home directory path is stored in the $HOME environment variable on every system. Home directories are unique to each platform and are routinely backed up. Small data files specific to a particular system belong here, such as the account configuration files .cshrc/.login or .profile. In many cases, if an executable compiled for a system is small enough, $HOME is a logical place to store it.
Home directories are limited in size. Should you require a larger $HOME quota, please contact User Support with your request.
Work directories are typically mounted locally or over a fast network connection. Following the "local cache" concept, doing your work in $WORKDIR instead of $ARCHIVE_HOME (where disk space is NFS mounted from a remote machine) allows for faster I/O and saves network bandwidth. $WORKDIR is available on all systems and always refers to the same style of storage space: a fast file system addressable by all nodes on that system.
Files in $WORKDIR or $SCRATCH that have not been accessed in over 30 days are subject to purging (i.e. removal). In rare circumstances, such as a filesystem becoming full, purging may occur in less than 30 days. Purging maintains adequate free space in the high-performance filesystem so that all users can share it effectively. The getPurgable utility can be used to identify files that are eligible for purging.
As work directories are not backed up, please copy data that needs to be backed up to archive storage as soon as possible.
Larger $HOME and $WORKDIR quota requests may be sent to User Support.
For more details on local storage configurations, please see 'news storage' on the system you are using.
Archive directories are intended for data to be stored long term. Files in this directory are automatically backed up to tape after several hours. This filesystem provides the slowest access overall; however, no quotas are set on it. See the Long Term Storage Introduction and Best Practices for more information.
The getPurgable utility is available on compute systems at ARSC. This tool can be used to identify files which are or will be eligible for purging in a given period of time.
Usage:
getPurgable
Examples:
List all files in $WORKDIR that will be purged the next time the purger runs. The file purger is typically run daily.
midnight % getPurgable
List files in $WORKDIR which will become eligible for purging within the next 10 days.
midnight % getPurgable -n 10
List files eligible for purging today in an alternate location. Note you must have access to all files and directories in the path specified to list a complete report.
midnight % getPurgable -p /wrkdir/usera
Nanook and Seawolf are Sun Fire 6800s providing long term, backed up data storage for ARSC resources. Nanook serves the /archive filesystem for the linux workstations. Seawolf serves the same function for iceberg and midnight. Both servers utilize SAM-QFS for managing user files stored in the corresponding archive file system. SAM-QFS consists of "online" disk storage, "offline" tape storage, and a set of daemons managing file status.
When a file is saved to the archive filesystem, the file is initially "online". Shortly thereafter, two copies of the file are automatically made on separate tapes. The file will remain "online" if sufficient disk space is available, otherwise the file will be taken "offline" and will be removed from disk storage, leaving two copies on tape. If the file is "online", it is immediately accessible to the user. If the file is "offline", then the user can request a copy of the file to be brought "online" by the SAM-QFS system. Please read more on the stage and batch_stage commands to request files to be brought back "online". The staging process is highly encouraged, especially when potentially working with thousands of "offline" files.
Users of ARSC resources have an account on either nanook, seawolf, or both
in some cases. Having access to a system served by one of the
file servers implies you will have access to the corresponding file server
in order to manage your files efficiently. Display the value of the $ARCHIVE_HOST environment
variable to tell which server is hosting the archive file system for the machine
you are logged into.
The SAM-QFS commands listed below can be used to manage data from either nanook or seawolf.
The following commands must be executed on the archive host (i.e. $ARCHIVE_HOST)
where your files are stored. The host will be nanook for linux workstations
and seawolf for HPC resources. For more information, please read the
man page for the particular command (e.g. man sls).
archive
The "archive" command issues a request for a file, or group of files matching wildcards (when * or ? are specified), to be copied to tape. Please note that the return of this command does not mean the copy to tape has finished. Check for file transfer completion with the sfind or sls commands. Also, disk space is not released unless the release command is executed.
SAM-QFS will archive files automatically. This normally occurs within approximately an hour of creation or modification.
Usage:
archive filename
release
After a file has been archived, three copies exist: two on tape and one on disk. The "release" command removes the "online" (disk) copy of the file if the "offline" (tape) copies have already been made. A directory listing will remain for the now "offline" file. To use the file after issuing the "release" command, the file must be copied from tape back to disk (e.g. manually with the stage command).
SAM-QFS may release files automatically over time depending on the level of activity on the file system.
Usage:
release filename
sfind
The "sfind" command finds files with the requested attributes. Searchable attributes include:
- -offline (file copied to tape and disk space released)
- -online (copy exists on disk)
- -archdone (all archive/stage steps completed)
- -copies 2 (both copies exist on tape)
In most cases, "sfind" uses the same keywords as sls -DK.
Usage:
sfind -offline
Examples:
Search for offline files within $ARCHIVE_HOME:
seawolf % sfind $ARCHIVE_HOME -offline
Search for files that have two copies on tape:
seawolf % sfind $ARCHIVE_HOME -copies 2
stage
The "stage" command initiates a request to copy an "offline" (tape copy only) file to be placed "online" (on disk). Only online files may be read, copied, etc. Giving advanced notice that the file will be needed is a way to ensure the file is "online" and ready to use when requested. Otherwise, an attempt to read the file will automatically stage the file, although it may take some time for the tape to mount and a new "online" copy of the file to be placed on disk.
Please note that the return of this command does not mean the copy to disk has finished. Check for "online" file completion with the sfind or sls commands.
When staging large numbers of files, the batch_stage command typically performs better. Consider using "batch_stage" instead of "stage" in this situation.
Usage:
stage filename
Examples:
The following sfind command finds all of your offline f90 files under the current directory, stages each one to disk, and echoes its filename to the terminal:
seawolf % sfind . -name "*.f90" -offline -exec stage {} \; -exec echo {} \;
batch_stage
The "batch_stage" command brings a set of files "online". Files are staged in the order they are written to tape. This minimizes tape seeks and typically reduces the amount of time it takes to bring multiple files back "online".
Usage:
batch_stage filenames
Examples:
To stage several files by name, simply list all the files that need to be staged following the "batch_stage" command:
nanook % batch_stage fileone filetwo filethree
Wildcards may also be used. The following command will stage all the files in the $ARCHIVE_HOME/data directory:
seawolf % batch_stage $ARCHIVE_HOME/data/*
A list of files to be staged can also be supplied through stdin, making "batch_stage" ideal to use in conjunction with the "find" command:
nanook % find $ARCHIVE_HOME/somedirectory/ -name \*.nc | batch_stage -i
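The find-to-stdin pattern above can be wrapped in a small function so that the file list can be inspected before anything is staged. This is only a sketch: stage_tree and the STAGER override are hypothetical names introduced here, and the default pipeline simply reuses the "batch_stage -i" form from the example above. It assumes file names contain no newlines.

```shell
#!/bin/sh
# Sketch: build a list of files under a directory and hand it to a stager on stdin.
# stage_tree and STAGER are hypothetical; STAGER defaults to "batch_stage -i",
# the stdin form shown in the example above.

stage_tree() {
    dir=$1        # directory to search, e.g. a subdirectory of $ARCHIVE_HOME
    pattern=$2    # file name pattern, e.g. '*.nc'
    # Sort so the generated list is deterministic and easy to review.
    # STAGER is left unquoted on purpose so "batch_stage -i" splits into words.
    find "$dir" -type f -name "$pattern" | sort | ${STAGER:-batch_stage -i}
}

# Dry run (prints the list instead of staging anything):
#   STAGER=cat stage_tree "$ARCHIVE_HOME/somedirectory" '*.nc'
# Real run on the archive host:
#   stage_tree "$ARCHIVE_HOME/somedirectory" '*.nc'
```

Setting STAGER=cat turns the call into a dry run, which is a cheap way to check the pattern before requesting tape mounts.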
sls
The "sls" command is an extended version of the "ls" command which lists files including their SAM-QFS attributes. The -D option will show most of the SAM-QFS attributes.
Usage:
sls filename
Examples:
Show detailed description of SAM information for all files in $ARCHIVE_HOME:
seawolf % sls -D $ARCHIVE_HOME
Show two lines of output with SAM information:
seawolf % sls -2 $ARCHIVE_HOME/myfile
sdu
The "sdu" command is a SAM version of "du". The command reports the sum of offline and online disk usage.
Usage:
sdu directory
Examples:
Show summary (-s) of usage for each subdirectory in $ARCHIVE_HOME in kilobytes (-k):
seawolf % sdu -sk $ARCHIVE_HOME/*
Show summary (-s) of usage for the $ARCHIVE_HOME directory in human readable form (-h):
seawolf % sdu -sh $ARCHIVE_HOME
The following is a list of practices which can improve the performance and usability of long term storage at ARSC.
- Storing large numbers of small files in $ARCHIVE_HOME can drastically degrade the performance of the long term storage daemons. When possible, save related files into a single tar file.
- When copying more than one terabyte of data into $ARCHIVE_HOME, limit the number of streams (e.g. cp's) to one. This allows the archiving daemon to keep up with the creation of tape copies while leaving tape drives available for other users.
- When transferring large numbers of files from $ARCHIVE_HOME to $WORKDIR or a remote location, staging the files with the batch_stage command prior to the transfer may drastically reduce the time required to complete the transfer.
- When creating a tar file from files in $ARCHIVE_HOME, it is most effective to do this directly on the $ARCHIVE_HOST (i.e. seawolf or nanook). The batch_stage command should be used to bring all files online prior to issuing the tar command.
- When saving files to $ARCHIVE_HOME as you work, it is best to copy the files to $ARCHIVE_HOME as they are created, rather than saving the work up and copying it all at once. The archive system works better with regular smaller volumes of new data than it does with occasional floods of new data. This practice will also minimize the impact of a catastrophic failure of the temporary filesystem(s) on the HPC system.
- Checksum utilities (e.g. 'sum -r', 'md5', etc.) can be used to check the integrity of the copy and source. It is best to use these utilities on the system where the file resides in order to avoid NFS-related issues (e.g. cache effects).
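Two of the practices above, bundling small files into a single tar file and checking a copy against its source, can be combined in one short sketch. The function name bundle_and_copy and its arguments are hypothetical, and cksum is used here only because it is a standard POSIX utility; 'sum -r' or 'md5' would serve the same purpose.

```shell
#!/bin/sh
# Sketch: bundle a directory of small files into one tar file, copy it to a
# destination directory, and compare checksums of the source and the copy.
# bundle_and_copy and its arguments are hypothetical names for illustration.

bundle_and_copy() {
    src_dir=$1      # directory of small files, e.g. under $WORKDIR
    dest_dir=$2     # target directory, e.g. under $ARCHIVE_HOME
    name=$(basename "$src_dir")
    parent=$(dirname "$src_dir")
    tarball="$parent/$name.tar"

    # One tar file instead of many small files.
    tar -cf "$tarball" -C "$parent" "$name" || return 1
    cp "$tarball" "$dest_dir/" || return 1

    # Reading from stdin makes cksum print only the checksum and byte count,
    # so the two results can be compared as plain strings.
    src_sum=$(cksum < "$tarball")
    dst_sum=$(cksum < "$dest_dir/$name.tar")
    [ "$src_sum" = "$dst_sum" ]
}

# Example call (paths are placeholders):
#   bundle_and_copy "$WORKDIR/run1" "$ARCHIVE_HOME/results" && echo "copy verified"
```

This sketch computes both checksums on the host where it runs. When the destination is NFS mounted, computing the second checksum on the $ARCHIVE_HOST itself follows the advice above about avoiding cache effects.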
Arctic Region Supercomputing Center
PO Box 756020, Fairbanks, AK 99775 | voice: 907-450-8600 | email: