ARSC T3E Users' Newsletter 168, May 14, 1999
1-PE Jobs
Each T3E processor is configured to be one of these types:
- CMD: command PEs, for user shells and single-PE processes
- APP: application PEs, for parallel applications
- OS: operating system PEs, for the operating system
Under UNICOS/mk 2.0.3 (the current OS on yukon) and prior versions, all 1-PE jobs run on CMD PEs, not on APP PEs.
If you request 1 PE from NQS, however, NQS treats the job like a parallel application (even though the OS runs the job on a CMD PE). Thus, NQS believes that one less APP PE is available, which may lead to another user's job being unnecessarily blocked.
At ARSC, you should use the "single" queue, and not the "small" queue, to run 1-PE jobs. This is a special queue to get around the 1-PE problem.
Using "single" prevents NQS from mistakenly crediting an APP PE to your job. It enables you to run two or more 1-PE jobs simultaneously, and it keeps your jobs out of contention with jobs in the "small" queue.
From "news queues" on yukon, here's how to use "single":
> To route a job to the "single" queue, do not specify a PE or
> runtime limit. (If either limit is specified, the job will be
> routed to another queue.) For example, given this specification:
>
>   #QSUB -q mpp
>
> NQS would route the request to the "single" queue.

Why all the confusion? NQS, the OS, and the applications themselves play different roles, and don't always cooperate.
NQS keeps its own tally of how many PEs it has assigned and how many are available. For instance, on a system with the NQS "global PE limit" set to 128, NQS might "release" a 128-PE request. It would then calculate that 0 PEs remain available. Once "released," the action of the request is handled by the OS, and not NQS.
In one scenario, the job (qsub script) might sit and compile Fortran code for 30 minutes, thus idling 128 PEs, but NQS wouldn't know about it, and wouldn't be able to assign other waiting jobs to those PEs. NQS must assume that all requested PEs are in use.
On the other hand, if an interactive user launched a 2-PE job while the 128-PE NQS request was compiling, the OS would indeed notice the available processors and start the 2-PE job. Unfortunately, once the 128-PE request finished compiling, it would be blocked by the 2-PE job, and NQS wouldn't be able to run anything else because, to its knowledge, 128 PEs would be in use.
In another scenario, a 1-PE request might appear in the "small" queue. When NQS releases a 1-PE "small" request, it subtracts 1 from its pool of available APP PEs, even though the job runs on a CMD PE, and thus under-counts the number of available APP PEs. This can again lead to jobs being blocked.
This problem with 1-PE jobs occurs from time to time on yukon, which has a global MPP PE limit of 256. Here's an example, as shown by the "qstat -m" command:
yukon$ qstat -m
----------------------------------
NQS 3.3.0.5 BATCH QUEUE MPP LIMITS
----------------------------------
QUEUE NAME               RUN      QUEUE-PE'S  R-PE'S  R-TIME  P-TIME
                         LIM/CNT  LIM/CNT     LIMIT   LIMIT   LIMIT
-----------------------  -------  ----------  ------  ------  ------
System                   6/0      --/0        --      --      --
gcp_grand                1/0      256/0       256     57600   57600
gcp_xxlarge              3/0      200/0       160     28800   28800
Qxxlarge                 1/0      160/0       160     1800    1800
xxlarge                  1/0      160/0       160     14400   14400
grand                    1/0      256/0       256     14400   14400
single                   10/1     0/0         1       --      --
Qxlarge                  1/0      100/0       100     1800    1800
Qlarge                   3/0      120/0       50      1800    1800
Qmedium                  4/0      60/0        20      1800    1800
Qsmall                   4/0      30/0        10      1800    1800
Qgrand                   1/0      256/0       256     1800    1800
medium                   4/1      60/18       20      28800   28800
large                    4/1      200/50      50      28800   28800
small                    4/1      30/1        10      28800   28800
xlarge                   2/1      132/60      100     14400   14400
-----------------------  -------  ----------  ------  ------  ------
yukon                    100/5    256/129
-----------------------  -------  ----------  ------  ------  ------
In this table, the column:
  QUEUE-PE'S CNT
shows NQS's count of the APP PEs in use, listing them by queue and for the entire machine. There were three actual parallel applications running, with sizes 18, 50, and 60, for a total of 128 PEs.
Examine the row:
  small 4/1 30/1 10 28800 28800
which shows how NQS mistakenly counted 1 PE used in the "small" queue, and thus obtained a total of 129 PEs in use. At this point, NQS refused to launch a waiting 128-PE request that could have run. From NQS's point of view, this would have consumed a total of 257 PEs, exceeding the global limit. This is apparent in the row for the overall totals:
  yukon 100/5 256/129
(The situation is resolved fairly easily by a sysadmin, but only when someone is on duty to notice it.)
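The arithmetic behind this blocked request can be replayed in a few lines of shell. This is only a sketch of the bookkeeping described here, not actual NQS code; the function names pe_total and fits are made up for the illustration, and the numbers are taken from the "qstat -m" listing.

```shell
#!/bin/sh
# Sketch of the PE bookkeeping described above (not actual NQS code).
GLOBAL_PE_LIMIT=256          # yukon's global MPP PE limit

# Sum a list of per-queue PE counts (hypothetical helper).
pe_total() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

# Succeed if a request for $1 PEs fits when $2 PEs are already charged.
fits() {
    [ $(( $1 + $2 )) -le "$GLOBAL_PE_LIMIT" ]
}

actual=$(pe_total 18 50 60)      # the three real parallel applications
charged=$(pe_total 18 50 60 1)   # plus the 1-PE "small" job NQS also counted

for in_use in "$actual" "$charged"; do
    if fits 128 "$in_use"; then
        echo "in use: $in_use -> 128-PE request fits"
    else
        echo "in use: $in_use -> 128-PE request exceeds $GLOBAL_PE_LIMIT"
    fi
done
```

With the true count of 128 the waiting request fits exactly; with the miscounted 129 it does not, which is the situation shown in the totals row.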
The column:
  RUN CNT
shows the count of jobs running, by queue. Note that there was one job in the "single" queue. Following the row:
  single 10/1 0/0 1 -- --
to the QUEUE-PE'S/CNT column, note that this request did not count against NQS's PE total. This is the correct behavior for the "single" queue.
Run 1-PE jobs in "single"!
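As a reminder of what such a request might look like, here is a minimal batch script sketch based on the routing rule quoted from "news queues" above. The executable name ./a.out is a placeholder, and the existence check is just a convenience added for the illustration.

```shell
#QSUB -q mpp
# No PE limit and no run-time limit are requested here; per the
# "news queues" text, specifying either one would route the job
# to a queue other than "single".

prog=./a.out    # placeholder name for a 1-PE executable (assumption)

if [ -x "$prog" ]; then
    "$prog"
else
    echo "cannot find executable: $prog" >&2
fi
```

Once the request is running, "qstat -m" should show it under RUN CNT for "single" without adding to the QUEUE-PE'S count.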
Debugging Debugging With FLUSH?
The print statement is an ever-popular debugging tool. This week, a user was trying to diagnose a Fortran code. It launched okay, but immediately hung. He inserted this,
write(*,*)'BEGIN'
as the first executable statement in the program, recompiled, and ran, and again, it hung. It never even printed "BEGIN".
The solution was to debug the debugging statement by changing it to this:
write(*,*)'BEGIN'
CALL FLUSH (101)
The write statement had executed successfully the first time, but the write buffer had not filled up before the problem which caused the "hang" was reached, and thus, the "BEGIN" had never been "flushed" to the user's console. "FLUSH" forces the contents of a write buffer out to the specified unit number (101 is used for standard output), even if the buffer is not yet full.
Quick-Tip Q & A
A:{{ You're not sure if you compiled with Apprentice, PAT, or VAMPIR
enabled in your current executable. How can you find out? }}
Two answers. The first didn't work for VAMPIR and in either case,
you may have to scrutinize the output for hints:
what a.out | egrep -i "apprentice|vampir|pat"
strings a.out | egrep -i "apprentice|vampir|pat"
Three examples:
yukon$ what a.out.1 | egrep -i "apprentice|vampir|pat"
apprentice/Lib/apprif.c 30.0 11/20/97 14:50:55
apprentice/Lib/cal.s 20.3 05/22/97 12:27:01
apprentice/Lib/comm.c 30.0 11/20/97 14:50:55
yukon$ strings a.out.1 | egrep -i "apprentice|vampir|pat" | head -5
@(#)apprentice/Lib/apprif.c
WARNING: The Apprentice Runtime Information File (RIF) is being written
WARNING FROM APPRENTICE INSTRUMENTATION
barrier that not all PE's entered. The Apprentice
PROGRAM ERROR DETECTED BY APPRENTICE INSTRUMENTATION
yukon$ strings a.out.2 | egrep -i "apprentice|vampir|pat" | head -5
VAMPIRtrace
VAMPIRtrace
VAMPIRtrace
VAMPIRtrace
VAMPIRtrace
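When the filtered output is long, a per-tool tally can make it easier to scan. The following is a rough sketch only: count_tool_hints is a made-up helper, it reads text on standard input (for example the output of "strings a.out"), and it counts lines, so it is no substitute for reading the matches. Note that "pat" also matches unrelated words such as "path".

```shell
#!/bin/sh
# Hypothetical helper: count input lines mentioning each tool name,
# case-insensitively. A nonzero count is a hint, not proof.
count_tool_hints() {
    awk '
        { line = tolower($0) }
        line ~ /apprentice/ { a++ }
        line ~ /vampir/     { v++ }
        line ~ /pat/        { p++ }
        END { printf "apprentice=%d vampir=%d pat=%d\n", a, v, p }
    '
}

# Intended use (not run here):
#   strings a.out | count_tool_hints
#   what a.out    | count_tool_hints
```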
Q: "I can't login! I keep on trying... The Kerberos (so-called) server
accepts my 'kerberos password,' asks for my 'card-code,' which
I enter, but then it says:
Enter Next Token:
I enter my SecurID PIN into my SecurID card (AGAIN), type the 'next
token' which appears on the card, but it doesn't work!"
(What should this person do?)
[ Answers, questions, and tips graciously accepted. ]
Current Editors:
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Kate Hedstrom, ARSC Oceanographic Specialist, ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020

E-mail Subscriptions:
- Subscribe to (or unsubscribe from) the e-mail edition of the ARSC HPC Users' Newsletter.
- Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
