ARSC HPC Users' Newsletter Issue 435 2014-08-20

A publication of the Arctic Region Supercomputing Center.

1 Improving Job Priority On Pacman

It has been a busy summer for pacman.arsc.edu, our Penguin Computing cluster, and so far all signs point to continued steady demand through the Fall semester. Under these conditions, it can be helpful to understand a few things about job priority.

The pacman system runs a MOAB/Torque job scheduling system, which attaches a priority number to every job waiting in a queue. When compute resources become available, the job with the highest priority number is started first. In our local MOAB configuration, there are three dominant factors affecting a job's priority assignment: time elapsed in the queue, number of processors requested, and amount of walltime requested.

1.1 Queue Time

The priority number of a job increases for every minute the job waits in the queue. In other words, if your job has been in the queue for a long time, that time has not been wasted: the job has been accumulating a valuable priority resource. A job left in the queue long enough will eventually accumulate enough priority points to trump newly submitted jobs that would otherwise rank higher due to the other scheduling factors.
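
If you are curious how much priority your own job has accumulated, Moab's diagnostic commands can show you. A quick sketch (the job ID below is made up, and the exact output columns depend on the local configuration):

# show the priority breakdown for idle jobs
mdiag -p

# show detailed status, including priority, for one job
checkjob 123456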

It can be tempting to abort a job prematurely to try different queue or resource parameters. Sometimes this makes sense, but remember that resubmitting your job may cost you a lot of accumulated priority.
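
One alternative worth testing before you rely on it: Torque's qalter command can modify some attributes of a job that is still waiting in the queue, which may let you adjust the request without giving up the job's accumulated queue time. A sketch (the job ID is hypothetical, and behavior can vary with configuration, so verify on pacman first):

# shrink the walltime request of queued job 123456 in place
qalter -l walltime=4:00:00 123456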

1.2 Processor Weight

A large priority weight is given to the total processor (core) count requested by the job. Basically, the more processors you request, the more of a priority bonus you get "out of the gate".

This non-intuitive strategy is followed to achieve better long-term efficiency. The idea is that if we start the larger jobs first, we can run smaller jobs using the resources left over. By way of illustration: if you want to put marbles and sand into the same glass, you should put the marbles in first; the sand can then fill the spaces between the marbles.

Now, an increased processor count is a double-edged sword. Priority bonus or not, it will still be harder for the scheduler to find the nodes required to run the larger job. On the other hand, your job will have a priority advantage once such resources become available.
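
For reference, the processor count the scheduler sees comes from your job's resource request. A hypothetical request for 64 cores as four 16-core nodes might look like this (the node and core counts are invented for illustration; check pacman's actual node sizes before copying):

#PBS -l nodes=4:ppn=16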

1.3 Wall Time

The current pacman configuration also applies a large weight to the Expansion Factor. The expansion factor is basically the same as a wait time priority bonus, except that whatever advantage you get is divided by your requested walltime. To put it more simply: your priority number will grow faster if you request a smaller walltime.
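
As a rough illustration (the exact formula is part of the local MOAB configuration, so treat these numbers as indicative only), consider two jobs that have each waited 60 minutes in the queue:

job A: 60 minutes waited / 240 minutes requested = 0.250
job B: 60 minutes waited / 480 minutes requested = 0.125

Job A's expansion-factor bonus is growing twice as fast as job B's, simply because it asked for half the walltime.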

In view of the priority bonuses from processor weight and expansion factor: If your program can make efficient use of the extra compute resources, it may be beneficial to increase the number of cores requested by your job so that you can reduce the requested walltime proportionally.
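
For example, a job that scales well might trade a long, narrow request for a short, wide one. A hypothetical before-and-after (the node and core counts are invented for illustration; check your program's actual scaling before making this trade):

# before: 16 cores for 8 hours
#PBS -l nodes=1:ppn=16,walltime=8:00:00

# after: 32 cores for 4 hours
#PBS -l nodes=2:ppn=16,walltime=4:00:00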

2 Job E-mail Notification

2.1 Qsub Flags

When submitting a job through qsub, you can opt to receive various e-mail notifications regarding job status. The simplest way is to pass the -m flag, followed by a space and then one or more of the letters below, concatenated together, representing the notifications you wish to receive.

letter   e-mail is sent when the job…
------   ----------------------------
a        is aborted
b        begins execution
e        terminates

So, the command qsub -m abe myscript.pbs will cause an e-mail to be sent when the job starts, when the job finishes, and also if the job dies abnormally.

2.2 PBS Directives

The same flag can be used inside the PBS script through a directive. For example:

#PBS -m abe
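
In context, a minimal job script using this directive might look like the following sketch (the queue name, resource request, and program name are placeholders rather than pacman-specific values):

#!/bin/bash
#PBS -q standard
#PBS -l nodes=1:ppn=16,walltime=1:00:00
#PBS -m abe

# run from the directory the job was submitted from
cd $PBS_O_WORKDIR
./my_program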

2.3 E-mail Delivery

To ensure e-mails are sent to the correct address, you can either put the appropriate address into your $HOME/.forward file or use qsub's -M flag. For example,

qsub -M me@example.com -m abe my_script.pbs

or, inside the script, use

#PBS -m abe
#PBS -M me@example.com
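
If you prefer the .forward approach, a single line in the file is enough; mail addressed to your account on the system should then be forwarded (substitute your real address, of course):

echo 'me@example.com' > $HOME/.forward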

3 SC14 Registration And Schedule

Attendance registration for SC14 is now open, along with the SC14 hotel reservation system. The conference takes place in New Orleans, LA, November 16-21. According to the registration page, you can save up to $625 by registering by October 15.

Also, be sure to check out the SC14 schedule page, which has an hour-by-hour event listing for the whole week. Google calendar links are available next to each event in case you want to start marking out the interesting ones.

4 Summary: Julia Lecture

For those who could not be there, we want to give an overview of Alan Edelman's lecture, which primarily advocated for and discussed Julia, an ambitious new programming language, as an alluring option for technical and parallel programming.

Edelman rejects the commonly held notion that programming languages cannot be both high-level and high-performance. He believes that Julia's advanced type system, multiple dispatch capabilities, and thorough code abstraction make both possible.

Using an IJulia notebook, Edelman gave live code demonstrations of plotting, matrix manipulation, parallel programming, and - his favorite - Roman numeral mathematics. In closing, he discussed a number of interesting projects being worked on using Julia and challenged us to "play around" with the language.

Edelman recommended the videos and tutorials from julialang.org. His favorite is Kaminski's Julia Express, which "introduces programmers to Julia programming by example."

5 Core Skills For Computational Science

Tom Logan's class "Physics F608: Core Skills For Computational Science" begins September 4 and meets Tuesdays and Thursdays from 9:15am to 11:15am in our WRRB 009 classroom. By special arrangement, any interested individual may attend the lectures (without credit) without registering or enrolling in the class.

From the PHYS 608 course description:

This course provides students of the computational sciences an introduction to the basic skills required to operate in the modern high performance computing (HPC) environment offered at the Arctic Region Supercomputing Center (ARSC). Topics include an introduction to HPC; basic Unix, batch, and scripting skills; performance programming; shared and distributed memory parallelism; code validation and debugging; data storage and management; and data visualization. Each topic will be presented in lecture form, and to provide additional applied knowledge, each will be supported by a thorough case study from a guest speaker and/or a hands-on lab session.

Access to our Mac Pro lab computers requires UA affiliation.

6 More Information

6.1 Editor

Christopher Howard mailto:cmhoward2@alaska.edu

6.2 Credits

Oralee Nudson, ARSC Lead User Consultant. Reviewer and insider source for ARSC news and tips.

Appreciation also goes to Liam Forbes and Bob Torgerson for feedback on draft content, and to Tim Slauson for sharing his notes from the Julia lecture.

6.3 Publication Schedule

The newsletter is usually released on the third Wednesday of each month.

6.4 Questions, Comments, And Submissions

mailto:owner-hpc_users@arsc.edu

Do you want to find out what our readers know about a particular subject? Submit a question about HPC or ARSC software, and we will feature it in a Q&A section in the newsletter.
