Job Submission on Fish

Running Interactive Jobs

Users are encouraged to submit jobs to the Torque/Moab batch scheduler, but interactive jobs are also available.  To start an interactive job through the batch system, type the following commands:

qsub -q standard -l nodes=2:ppn=12 -I

aprun -n 4 ./hello.exe

The aprun command is necessary to execute the job on the compute nodes.

Standard output and standard error can be displayed in the terminal, redirected to a file, or piped to another command using standard Unix shell syntax.
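For example, both streams can be captured in a single log file. In the snippet below, echo stands in for an actual "aprun -n 4 ./hello.exe" launch so the example is portable; on Fish you would substitute the real aprun command:

```shell
# Capture both stdout and stderr of a launch command in one log file.
# 'echo' stands in for 'aprun -n 4 ./hello.exe' here.
echo "hello from rank 0" > run.log 2>&1
cat run.log
```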

Running Batch Jobs

Production work is run through the Torque/Moab batch scheduler.  Jobs submitted to the batch scheduler execute on the compute nodes. The scheduler assigns a unique job ID to each submission, saves stdout/stderr to files for each run, and allows jobs to run after the user logs off.

A batch job is a shell script prefaced by a statement of resource requirements and instructions which Torque/Moab will use to manage the job.

Torque/Moab scripts are submitted for processing with the qsub command.

The most basic batch processing involves five steps, outlined below:

1. Create a batch script:

Normally, users embed all Torque/Moab request options within a batch script.

In a batch script conforming to Torque/Moab syntax, all Torque/Moab options must precede all shell commands. Each line containing a Torque/Moab option must begin with the string "#PBS", followed by one or more spaces, followed by the option.

Torque/Moab scripts begin execution from your home directory, so the first executable shell command is usually a "cd" to the work or run directory. The environment variable PBS_O_WORKDIR is set to the directory from which the script was submitted.

Torque Script: MPI Example requesting 4 CPU-only nodes (all 12 processors on each node)

Script                                       Notes

#!/bin/bash                                  Select the shell.
#PBS -q standard                             The -q option requests that the job run in the standard queue.
#PBS -l walltime=8:00:00                     Requests that the job be allowed to run for a maximum of 8 hours.
#PBS -l nodes=4:ppn=12                       Requests 4 nodes and all 12 processors on each node.
#PBS -j oe                                   The -j option joins output and error messages into one file.

cd $PBS_O_WORKDIR                            Change to the initial working directory ($PBS_O_WORKDIR).
NP=$(( $PBS_NUM_NODES * $PBS_NUM_PPN ))      Determine the total number of processors to use.
aprun -n $NP ./myprog                        Execute the program by calling aprun and passing the executable name.
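The NP arithmetic can be checked outside the batch system. Here, hypothetical values stand in for the PBS_NUM_NODES and PBS_NUM_PPN variables that Torque exports when the job starts:

```shell
# Hypothetical values Torque would export for '#PBS -l nodes=4:ppn=12'
PBS_NUM_NODES=4
PBS_NUM_PPN=12
NP=$(( PBS_NUM_NODES * PBS_NUM_PPN ))
echo $NP    # 48: one MPI rank per processor across all nodes
```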

Torque Script: OpenMP Example requesting 1 CPU-only node (all 12 processors on the node)

Script                                       Notes

#!/bin/bash                                  Select the shell.
#PBS -q standard                             The -q option requests that the job run in the standard queue.
#PBS -l walltime=8:00:00                     Requests that the job be allowed to run for a maximum of 8 hours.
#PBS -l nodes=1:ppn=12                       Requests 1 node and all 12 processors on the node.
#PBS -j oe                                   The -j option joins output and error messages into one file.

cd $PBS_O_WORKDIR                            Change to the initial working directory ($PBS_O_WORKDIR).
export OMP_NUM_THREADS=12                    Set the environment variable for the maximum number of threads.
aprun -n 1 -d 12 ./myprog                    Execute the program by calling aprun and passing the executable name.

Torque Script: MPI and OpenMP Example using 4 GPU nodes

Script                                       Notes

#!/bin/bash                                  Select the shell.
#PBS -q gpu                                  The -q option requests that the job run in the gpu queue.
#PBS -l walltime=8:00:00                     Requests that the job be allowed to run for a maximum of 8 hours.
#PBS -l nodes=4:ppn=16                       Requests 4 nodes and all 16 processors on each node.
#PBS -j oe                                   The -j option joins output and error messages into one file.

cd $PBS_O_WORKDIR                            Change to the initial working directory ($PBS_O_WORKDIR).
export OMP_NUM_THREADS=$PBS_NUM_PPN          Set the maximum number of OpenMP threads.
NP=$PBS_NUM_NODES                            Determine the total number of MPI tasks (one per node when accessing GPU accelerators).
aprun -n $NP -d ${OMP_NUM_THREADS} ./myprog  Execute the program by calling aprun and passing the executable name.
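The resulting task/thread layout can be checked with stand-in values; the assignments below mirror what Torque would export for this request:

```shell
# Hypothetical values Torque would export for '#PBS -l nodes=4:ppn=16'
PBS_NUM_NODES=4
PBS_NUM_PPN=16
export OMP_NUM_THREADS=$PBS_NUM_PPN   # one thread per processor on each node
NP=$PBS_NUM_NODES                     # one MPI task per node when driving a GPU
echo "$NP tasks x $OMP_NUM_THREADS threads"   # 4 tasks x 16 threads
```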

2. Submit the batch script to Torque/Moab, using qsub

The script file can be given any name, and should be submitted with the "qsub" command.  For example, if the script is named myprog.pbs, it would be submitted for processing with the following:

qsub myprog.pbs
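qsub prints the new job's identification number on stdout, so it can be captured in a variable for later qstat or qdel calls. In this sketch, a stand-in qsub function and a hypothetical job-ID format make the snippet portable; on Fish the real qsub supplies the actual ID:

```shell
# Stand-in 'qsub' so the snippet runs anywhere; the ID format is hypothetical.
qsub() { echo "12345.fish"; }
JOBID=$(qsub myprog.pbs)
echo "Submitted as $JOBID"   # this ID is what qdel expects later
```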

3. Monitor the job

To check the status of the submitted Torque/Moab job, execute this command:

qstat -a

4. Delete the job

Given its Torque/Moab identification number (returned when you "qsub" the job and shown in the "qstat -a" output), you can delete the job from the batch system with this command:

qdel <PBS-ID>

5. Examine Output

When the job completes, Torque/Moab will save the stdout and stderr from the job to files in the directory from which it was submitted. These files are named using the script name and the Torque/Moab identification number. For example,

myprog.pbs.o<PBS-ID>

Torque/Moab Queues

List all available queues with the command "qstat -Q". List details on any queue, for instance "standard", with the command "qstat -Qf standard". You may also read "news queues" for information on all queues, but note that the most current information is always available using the qstat commands.

Charging Allocation Hours to an Alternate Project

Users with membership in more than one project should select the project to which allocation hours are charged. The directive for selecting a project is the "-W group_list" Torque/Moab option. If the "-W group_list" option is not specified in a user's Torque/Moab script, charges default to the user's primary group (i.e., project).

The following is an example "-W group_list" statement.

#PBS -W group_list=proja

The "-W group_list" option can also be used on the command line, e.g.

fish1 % qsub -W group_list=proja script.bat

Each project has a corresponding UNIX group, so the "groups" command will show all projects (or groups) of which you are a member.

fish1 % groups
proja projb

Without the "-W group_list" Torque/Moab option, allocation hours would be charged to proja by default, but could be charged to projb by setting "-W group_list=projb" in the Torque/Moab script.
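Putting this together, a minimal job-script header charging projb might look like the following (the resource request and program name are illustrative; the script cannot run outside the batch system):

```shell
#!/bin/bash
#PBS -q standard
#PBS -l walltime=1:00:00
#PBS -l nodes=1:ppn=12
#PBS -j oe
#PBS -W group_list=projb    # charge this job's hours to projb, not the primary group

cd $PBS_O_WORKDIR
aprun -n 12 ./myprog
```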

Monitoring Project Allocation Usage

Active projects are allocated CPU time on the system.  The "show_usage" command can be used to monitor allocation use for projects.

fish1 % show_usage 

            ARSC - Subproject Usage Information (in CPU Hours)
                     As of 04:24:01 hours ADT 24 May 2012
          For Fiscal Year 2012 (01 October 2011 - 30 September 2012)
                  Percentage of Fiscal Year Remaining: 35.62% 

                          Hours      Hours      Hours      Percent  Background
System     Subproject     Allocated  Used       Remaining  Remaining Hours Used
========== ============== ========== ========== ========== ========= ==========
fish       proja            20000.00       0.00   20000.00   100.00%       0.00
fish       projb           300000.00  195887.32  104112.68    34.70%   11847.78