To guarantee PEs for interactive work, we occasionally reduce the number available for batch jobs. To determine how many PEs are in the batch pool, use "qstat -m":
YUKON$ qstat -m
----------------------------------
NQS 3.3.0.5 BATCH QUEUE MPP LIMITS
----------------------------------
QUEUE NAME RUN QUEUE-PE'S R-PE'S R-TIME P-TIME
LIM/CNT LIM/CNT LIMIT LIMIT LIMIT
------------------ --- --- ------ ------ ------ ------ ------
[ ...snip... ]
------------------ --- --- ------ ------ ------ ------ ------
yukon 100/6 260/254
------------------ --- --- ------ ------ ------ ------ ------
The last line shows that, while 260 PEs are available to batch jobs,
254 are currently being used by batch jobs. The 260 batch PE limit is
normal.
When we switch to interactive mode, the batch PE limit will go as low as 252. For instance:
------------------ --- --- ------ ------ ------ ------ ------
yukon 100/6 252/248
------------------ --- --- ------ ------ ------ ------ ------
In this second example, "qstat -m" shows that, of 252 PEs permitted for batch work, 248 are being used by batch work.
However, this display doesn't tell us if the remaining 4 "batch" PEs have been claimed by interactive work. Use "grmap" or "grmview" to determine how many PEs are actually unused. If there are at least 4, then a 4 PE batch job could acquire them.
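The arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name batch_pes_free is hypothetical, and it assumes the QUEUE-PE'S field is the last LIM/CNT pair on the queue line, as in the two examples above. It reports headroom under the batch limit only, so "grmap" or "grmview" is still needed to see whether interactive work has claimed those PEs.

```python
def batch_pes_free(queue_line):
    """Return (limit, in_use, headroom) from one 'qstat -m' queue line.

    Assumption: the QUEUE-PE'S column is the last LIM/CNT pair on the
    line, as in 'yukon 100/6 260/254'.
    """
    fields = queue_line.split()
    limit_text, count_text = fields[-1].split("/")
    limit, in_use = int(limit_text), int(count_text)
    return limit, in_use, limit - in_use


# The two displays discussed above.
print(batch_pes_free("yukon 100/6 260/254"))
print(batch_pes_free("yukon 100/6 252/248"))
```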
[ This is the second in a two-part series contributed by Brad Chamberlain of the University of Washington. ]
                   column          column          column
                 j-1  j  j+1     j-1  j  j+1     j-1  j  j+1
     i-1:         w3  w2  w3      w2  w1  w2      w3  w2  w3
row  i  :  res =  w2  w1  w2   +  w1  w0  w1   +  w2  w1  w2
     i+1:         w3  w2  w3      w2  w1  w2      w3  w2  w3
plane:      k        k-1              k              k+1
The result (res) is assigned the sum of w0 times the element aligned
with it, w1 times those that differ by 1 in one dimension, w2 times
those that differ by 1 in two dimensions, etc. In a C-like language,
this would be written in the following manner:
x[i,j,k] = w0 * y[i, j, k] +
           w1 * (y[i-1, j, k] + y[i, j-1, k] + y[i, j, k-1] +
                 y[i+1, j, k] + y[i, j+1, k] + y[i, j, k+1]) +
           w2 * (y[i-1, j-1, k] + y[i-1, j, k-1] + y[i, j-1, k-1] +
                 y[i-1, j+1, k] + y[i-1, j, k+1] + y[i, j-1, k+1] +
                 y[i+1, j-1, k] + y[i+1, j, k-1] + y[i, j+1, k-1] +
                 y[i+1, j+1, k] + y[i+1, j, k+1] + y[i, j+1, k+1]) +
           w3 * (y[i-1, j-1, k-1] + y[i+1, j-1, k-1] +
                 y[i-1, j-1, k+1] + y[i+1, j-1, k+1] +
                 y[i-1, j+1, k-1] + y[i+1, j+1, k-1] +
                 y[i-1, j+1, k+1] + y[i+1, j+1, k+1]);
Multigrid computations are those that solve a coarse approximation of a
problem in order to arrive at a solution for the original problem more
quickly. Typically this involves a hierarchical array divided into
"levels", each of which has half the elements per dimension as the
previous level. Thus if the original problem had a scale of 8x8x8, the
coarser levels might be 4x4x4, 2x2x2, and 1x1x1. In a dense multigrid
problem, every point at each level is utilized. Sparse multigrid
problems are those that refine only certain interesting regions of the
problem space, and will not be discussed here.
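As a small illustration of the hierarchy just described, the Python sketch below lists the per-dimension size of each level of a dense multigrid problem. The helper name level_sizes is hypothetical, and the sketch assumes the finest level is a power of two in each dimension.

```python
def level_sizes(n):
    """Per-dimension sizes of a dense hierarchy, halving down to 1.

    Assumption: n is a power of two, so every level halves evenly.
    """
    sizes = [n]
    while sizes[-1] > 1:
        sizes.append(sizes[-1] // 2)
    return sizes


for size in level_sizes(8):
    print(f"{size}x{size}x{size}: {size ** 3} points")
```

For the 8x8x8 example, the coarser levels together hold far fewer points than the finest one, which is part of why dividing every level evenly among processors matters.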
For NAS MG, 27-point stencils within a level might look like the above, whereas those that move between levels might have to scale the indexing expressions in order to refer to elements that are in the corresponding locations in the finer or coarser grid.
Challenges to efficiently parallelizing multigrid computations can be grouped into two broad categories: load balancing and managing the gory details. In order for the problem to be load balanced, it is important that every level of the hierarchy be divided as evenly as possible between the processors. While this is conceptually simple, it leads to some interesting questions about how to index the arrays at each level. For example, do you declare your arrays to have an upper bound that's half as big for each level, or so that each level is strided by twice as much?
[1:8, 1:8, 1:8], [1:4, 1:4, 1:4], [1:2, 1:2, 1:2], etc.
OR
[1:8, 1:8, 1:8], [1:8:2, 1:8:2, 1:8:2], [1:8:4, 1:8:4, 1:8:4], etc.
This choice probably depends on the characteristics of the language
you're using and to a lesser degree on personal preference. For
example, ZPL's performance model specifies that the latter will result
in better performance, though the former is still an option. In
contrast, F90+MPI favors the former due to the fact that it is based on
a local view.
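The two indexing schemes name the same elements. A minimal Python sketch, using hypothetical helper names and 1-based indices to match the bounds shown above, makes the correspondence explicit: compact index i at a given level lines up with strided index (i-1)*2^level + 1.

```python
def compact_indices(n, level):
    """First scheme: indices 1..n/2^level with unit stride."""
    return list(range(1, n // 2 ** level + 1))


def strided_indices(n, level):
    """Second scheme: indices 1..n visited with stride 2^level."""
    return list(range(1, n + 1, 2 ** level))


def compact_to_strided(i, level):
    """Map a compact index to the strided index of the same element."""
    return (i - 1) * 2 ** level + 1


for level in range(3):
    print(level, compact_indices(8, level), strided_indices(8, level))
```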
"Managing the details" includes exchanging boundary values with neighboring processors so that the stencil computations can run completely in parallel. Keep in mind that as the problem gets coarser and coarser, a desired value may be located on a processor other than those that are adjacent to you in the virtual processor grid. In addition, the details of controlling loop bounds for each processor at each level of the hierarchy (especially if the processors don't evenly divide the problem size) can become an issue.
With local-view approaches like F90+MPI and CAF, you must manage these details explicitly. In global-view languages like HPF and ZPL, the compiler will manage them for you, but you will probably want to pay attention to what it's doing to ensure that it's not incurring unnecessary overheads on your behalf. A language like ZPL eases this task by providing a syntax-level performance model for the user, whereas HPF tends to require post-execution profiling tools.
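To give a feel for the loop-bound bookkeeping a local-view code has to do, here is a Python sketch of one common block decomposition. It is an assumption for illustration, not the scheme used by any of the benchmark codes: the function name block_bounds is hypothetical, and the first n mod nprocs processors simply receive one extra element.

```python
def block_bounds(n, nprocs, rank):
    """Return the 1-based inclusive (lo, hi) range owned by `rank`.

    Assumption: a plain block distribution in which the first
    n % nprocs ranks get one extra element. An empty range is
    returned as lo > hi.
    """
    base, extra = divmod(n, nprocs)
    lo = rank * base + min(rank, extra) + 1
    hi = lo + base - 1 + (1 if rank < extra else 0)
    return lo, hi


# 8 elements over 3 processors, then a coarse level with only 2 elements.
for n in (8, 2):
    print(n, [block_bounds(n, 3, rank) for rank in range(3)])
```

On the coarse level the last processor owns nothing, which is the kind of case each level's loop bounds and communication have to handle.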
We'll start with ZPL since it is the most succinct:
ZPL:
procedure resid(var R,V,U: [,,] double);
begin
  R := V - a[0] * U
         - a[2]*(U@dir110{} + U@dir1N0{} + U@dirN10{} + U@dirNN0{} +
                 U@dir101{} + U@dir10N{} + U@dirN01{} + U@dirN0N{} +
                 U@dir011{} + U@dir01N{} + U@dir0N1{} + U@dir0NN{})
         - a[3]*(U@dir111{} + U@dir11N{} + U@dir1N1{} + U@dir1NN{} +
                 U@dirN11{} + U@dirN1N{} + U@dirNN1{} + U@dirNNN{});
  wrap_boundary(R);
end;
This procedure takes in the 3 argument arrays R, V, and U, specified as
being 3D, but with unspecified size. The first thing to note is that
there is no loop or explicit indexing associated with this statement.
In ZPL, the indices over which an array statement should execute are
specified in a scoped manner using a "region specifier". In this case,
no region specifier appears within resid, so it is dynamically
inherited from the callsite. This encourages code reuse and makes
resid independent of the multigrid level.
The array statement expresses the stencil using the "@" operator which modifies an array reference by an offset vector called a "direction". These directions are declared globally by the user. For example:
direction dirN00{0..num_levels} = [-1, 0, 0] scaledby 2^{};
This declaration specifies a group of num_levels+1 directions, each of
which is scaled by twice the amount of the previous:
[-1, 0, 0], [-2, 0, 0], [-4, 0, 0], etc.
The result is an offset per level in the hierarchy that can be used to refer to an element in the previous row but the same column and plane. In an array reference like:
U@dirN00{}
the "{}" inherits the direction's scale from U (it can also be
specified explicitly or relative to U's scale), and thus refers to the
element whose row index is just less than that referred to by R, V,
and U.
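The scaling of a direction by level can be mimicked outside ZPL. The following Python sketch is only an analogy under stated assumptions: scaled_directions and apply_direction are hypothetical helpers, and indices follow the strided scheme in which level L visits every 2^L-th element.

```python
def scaled_directions(base, num_levels):
    """One offset vector per level, each twice the previous one."""
    return [tuple(c * 2 ** level for c in base)
            for level in range(num_levels + 1)]


def apply_direction(index, direction):
    """Shift an index tuple by a direction (offset) vector."""
    return tuple(i + d for i, d in zip(index, direction))


dirN00 = scaled_directions((-1, 0, 0), 3)
print(dirN00)

# At level 2 the rows in use are 1 and 5, so the "previous row" of
# element (5, 1, 1) is (1, 1, 1).
print(apply_direction((5, 1, 1), dirN00[2]))
```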
The ZPL compiler automatically generates vectorized communication for each @ reference, combining communications for vectors that overlap, such as dir100, dir110 and dir1N0, all of which require a plane of data from the "south". In addition, these @ operators provide a visual cue for ZPL users that point-to-point communication will most likely be required to implement the statement as specified by its performance model.
The procedure ends with a call to "wrap_boundary()" which uses ZPL's wrap statement to update the global boundary conditions. wrap_boundary is given in the next section.
F90+MPI / CAF:
      subroutine resid( u,v,r,n1,n2,n3,a,k )
c -- some declarations omitted for brevity
      integer n1,n2,n3,k
      double precision u(n1,n2,n3),v(n1,n2,n3),r(n1,n2,n3),a(0:3)
      double precision u1(m), u2(m)
      do i3=2,n3-1
         do i2=2,n2-1
            do i1=1,n1
               u1(i1) = u(i1,i2-1,i3) + u(i1,i2+1,i3)
     >                + u(i1,i2,i3-1) + u(i1,i2,i3+1)
               u2(i1) = u(i1,i2-1,i3-1) + u(i1,i2+1,i3-1)
     >                + u(i1,i2-1,i3+1) + u(i1,i2+1,i3+1)
            enddo
            do i1=2,n1-1
               r(i1,i2,i3) = v(i1,i2,i3)
     >                     - a(0) * u(i1,i2,i3)
     >                     - a(2) * ( u2(i1) + u1(i1-1) + u1(i1+1) )
     >                     - a(3) * ( u2(i1-1) + u2(i1+1) )
            enddo
         enddo
      enddo
      call comm3(r,n1,n2,n3,k)
      return
      end
The F90 and CAF versions of the benchmark are identical (save for one
CAF line omitted here) due to the fact that all of the interprocessor
communication is abstracted into a subroutine called comm3 (included
below). The resid subroutine itself performs the 27-point stencil using an
optimization in which partial sums are calculated and stored in vectors
u1 and u2 to avoid redundant FLOPs. As mentioned in last week's
article, this is an important optimization and results in great benefit
for these implementations at the cost of obfuscating the code's
intent. Ideally, Fortran compilers would recognize this optimization
opportunity automatically, allowing the code to be written in a more
intuitive form (that would be very similar to the stencil code given in
the introduction).
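To see that the partial-sum form computes the same thing as the intuitive form, here is a Python sketch that evaluates both on a small grid and compares them. Everything in it is an assumption made for illustration: the helper names, the 0-based indexing, the integer test data, and the weights, which are arbitrary integers chosen so the comparison is exact. The weight a[1] is set to zero because the excerpted resid has no a(1) term.

```python
from itertools import product


def make_grid(n, seed):
    """Deterministic integer test data (arbitrary, for illustration)."""
    return [[[(seed * i + 3 * j + 5 * k) % 11 for k in range(n)]
             for j in range(n)] for i in range(n)]


def resid_direct(u, v, a, n):
    """Interior residual written as an explicit 27-point sum."""
    r = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i1, i2, i3 in product(range(1, n - 1), repeat=3):
        total = 0
        for d1, d2, d3 in product((-1, 0, 1), repeat=3):
            # The weight depends on how many offsets are nonzero.
            weight = a[abs(d1) + abs(d2) + abs(d3)]
            total += weight * u[i1 + d1][i2 + d2][i3 + d3]
        r[i1][i2][i3] = v[i1][i2][i3] - total
    return r


def resid_partial(u, v, a, n):
    """Same residual using per-row partial sums u1 and u2, as in resid."""
    r = [[[0] * n for _ in range(n)] for _ in range(n)]
    for i3 in range(1, n - 1):
        for i2 in range(1, n - 1):
            # u1: the 4 neighbors offset in exactly one of i2, i3.
            u1 = [u[i1][i2 - 1][i3] + u[i1][i2 + 1][i3]
                  + u[i1][i2][i3 - 1] + u[i1][i2][i3 + 1]
                  for i1 in range(n)]
            # u2: the 4 neighbors offset in both i2 and i3.
            u2 = [u[i1][i2 - 1][i3 - 1] + u[i1][i2 + 1][i3 - 1]
                  + u[i1][i2 - 1][i3 + 1] + u[i1][i2 + 1][i3 + 1]
                  for i1 in range(n)]
            for i1 in range(1, n - 1):
                r[i1][i2][i3] = (v[i1][i2][i3]
                                 - a[0] * u[i1][i2][i3]
                                 - a[2] * (u2[i1] + u1[i1 - 1] + u1[i1 + 1])
                                 - a[3] * (u2[i1 - 1] + u2[i1 + 1]))
    return r


n = 5
a = [-8, 0, 2, 1]  # arbitrary integer weights; a[1] = 0 as in the excerpt
u = make_grid(n, 7)
v = make_grid(n, 2)
print(resid_direct(u, v, a, n) == resid_partial(u, v, a, n))
```

The saving comes from reuse: each u1 and u2 entry is computed once per row and then shared by neighboring points along the first dimension.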
Other than this optimization, there are no real surprises here. n1, n2, and n3 are the bounds of each processor's local block of data. The catch to this code is that comm3() is hiding all of the gory details, including the global and local boundary value updates (in this implementation the local communication is done after each computation, as opposed to the demand-driven communication used by ZPL and HPF).
HPF:
      extrinsic (HPF) subroutine resid( u,v,r,n1,n2,n3,a,k )
c -- some declarations omitted for brevity
!hpf$ distribute(*,block) :: grid
      double precision, intent (in) :: u(:,:,:),v(:,:,:),a(0:3)
      double precision, intent (out) :: r(:,:,:)
!hpf$ align(*,:,:) with grid :: u,v,r
      double precision u1(size(u,1)), u2(size(u,1))
!hpf$ independent, new(u1,u2), onhome(u(i1,i2,i3))
      do i3=2,n3-1
         do i2=2,n2-1
            do i1=1,n1
               u1(i1) = u(i1,i2-1,i3) + u(i1,i2+1,i3)
     >                + u(i1,i2,i3-1) + u(i1,i2,i3+1)
               u2(i1) = u(i1,i2-1,i3-1) + u(i1,i2+1,i3-1)
     >                + u(i1,i2-1,i3+1) + u(i1,i2+1,i3+1)
            enddo
            do i1=2,n1-1
               r(i1,i2,i3) = v(i1,i2,i3)
     >                     - a(0) * u(i1,i2,i3)
     >                     - a(2) * ( u2(i1) + u1(i1-1) + u1(i1+1) )
     >                     - a(3) * ( u2(i1-1) + u2(i1+1) )
            enddo
         enddo
      enddo
      r(n1,:,:) = r(2,:,:)
      r(1,:,:) = r(n1-1,:,:)
      r(:,n2,:) = r(:,2,:)
      r(:,1,:) = r(:,n2-1,:)
      r(:,:,n3) = r(:,:,2)
      r(:,:,1) = r(:,:,n3-1)
      return
      end
On the surface, the HPF code is very similar to that of the F90+MPI
code. The biggest conceptual difference is that n1, n2, and n3 no
longer refer to a processor's local bounds, but rather to the global
bounds of the current level. HPF directives are specified to ensure
that the arrays are distributed and aligned as necessary to minimize
communication, though HPF makes no guarantees about how these
directives will be implemented or even that they will be followed at
all. The advantage over F90+MPI and CAF is that no communication code
is required. The disadvantage compared to ZPL is that HPF has no
performance model, and thus no communication style or quantity is
guaranteed, forcing the programmer to tune according to their compiler.
The last six F90 statements update the global boundary conditions as in ZPL's call to wrap_boundary().
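The effect of those six assignments can be reproduced in a short Python sketch. It is an analogy rather than a translation: the names are hypothetical, the indices are 0-based, and the grid is a nested list whose outermost layer in each dimension holds the boundary copies. Doing the three axes in sequence, as the Fortran statements do, is what fills the edge and corner boundary cells.

```python
def make_interior_grid(n):
    """n x n x n grid with labeled interior cells and zeroed boundary."""
    def label(i, j, k):
        inside = all(0 < x < n - 1 for x in (i, j, k))
        return 100 * i + 10 * j + k if inside else 0
    return [[[label(i, j, k) for k in range(n)]
             for j in range(n)] for i in range(n)]


def wrap_periodic(r):
    """Copy interior faces into the opposite boundary layers, axis by axis."""
    n1, n2, n3 = len(r), len(r[0]), len(r[0][0])
    for j in range(n2):          # first axis
        for k in range(n3):
            r[n1 - 1][j][k] = r[1][j][k]
            r[0][j][k] = r[n1 - 2][j][k]
    for i in range(n1):          # second axis, sees the updated first axis
        for k in range(n3):
            r[i][n2 - 1][k] = r[i][1][k]
            r[i][0][k] = r[i][n2 - 2][k]
    for i in range(n1):          # third axis, sees both earlier updates
        for j in range(n2):
            r[i][j][n3 - 1] = r[i][j][1]
            r[i][j][0] = r[i][j][n3 - 2]
    return r


grid = wrap_periodic(make_interior_grid(4))
print(grid[0][1][1], grid[3][2][1], grid[0][0][0])
```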
ZPL
procedure wrap_boundary(var X:[,,] double);
begin
  [dir100{} of "] wrap X;
  [dirN00{} of "] wrap X;
  -- similar statements here for the other 23 directions, omitted for
  -- brevity
  [dirNNN{} of "] wrap X;
end;
The ZPL implementation of resid() uses a call to wrap_boundary (as do
all of the other stencil operations) in order to update the global
boundary conditions. This routine opens a region specifier for each
statement using ZPL's "of" region operator. This operator uses a
direction vector to create a new region adjacent to the base region in
the direction specified and is useful for specifying a problem's
boundary conditions. In this case, the base region is `"', indicating
it should be dynamically inherited from the callsite. The wrap
statement assigns values to the array X within the region such that
they are periodic with respect to the base region.
One of ZPL's primary goals is to reduce tedious, error-prone programming. While the @ operator and wrap statement have had this benefit in 2D problems, the use of 27 directions in NAS MG has demonstrated that there is still some room for making the programmer's job even easier, even though ZPL is still more concise than sequential Fortran or C (and significantly more so than HPF, CAF, or F90+MPI).
F90+MPI
The update of each processor's local boundary values in the F90+MPI resid is implemented using 4 main routines: comm3, ready, give3, and take3. comm3 is the top-level routine which calls the others, ready posts non-blocking MPI receives, give3 marshals outgoing data and posts MPI sends, and take3 waits for the receives to complete and unmarshals the data. These routines involve 250+ lines of code, so they are liberally condensed here.
c -- COMM3
      subroutine comm3(u,n1,n2,n3,kk)
c -- declarations omitted
      if( .not. dead(kk) )then
         do axis = 1, 3
            if( nprocs .ne. 1) then
               call ready( axis, -1 )
               call ready( axis, +1 )
               call give3( axis, +1, u, n1, n2, n3, kk )
               call give3( axis, -1, u, n1, n2, n3, kk )
               call take3( axis, -1, u, n1, n2, n3 )
               call take3( axis, +1, u, n1, n2, n3 )
            else
               call comm1p( axis, u, n1, n2, n3, kk )
            endif
         enddo
      else
         call zero3(u,n1,n2,n3)
      endif
      return
      end
c -- READY
      subroutine ready( axis, dir )
c -- declarations omitted
      buff_id = 3 + dir
      buff_len = nm2
      do i=1,nm2
         buff(i,buff_id) = 0.0D0
      enddo
      msg_id(axis,dir,1) = msg_type(axis,dir) +1000*me
      call mpi_irecv( buff(1,buff_id), buff_len,
     >     dp_type, mpi_any_source, msg_type(axis,dir),
     >     mpi_comm_world,msg_id(axis,dir,1),ierr)
      return
      end
c -- GIVE3
      subroutine give3( axis, dir, u, n1, n2, n3, k )
c -- declarations omitted
      buff_id = 2 + dir
      buff_len = 0
c -- THE FOLLOWING MOTIF REPEATS 3 TIMES FOR THE 3 DIMENSIONS
      if( axis .eq. 1 )then
         if( dir .eq. -1 )then
            do i3=2,n3-1
               do i2=2,n2-1
                  buff_len = buff_len + 1
                  buff(buff_len,buff_id ) = u( 2, i2,i3)
               enddo
            enddo
            call mpi_send(
     >           buff(1, buff_id ), buff_len,dp_type,
     >           nbr( axis, dir, k ), msg_type(axis,dir),
     >           mpi_comm_world, ierr)
         else if( dir .eq. +1 ) then
            do i3=2,n3-1
               do i2=2,n2-1
                  buff_len = buff_len + 1
                  buff(buff_len, buff_id ) = u( n1-1, i2,i3)
               enddo
            enddo
            call mpi_send(
     >           buff(1, buff_id ), buff_len,dp_type,
     >           nbr( axis, dir, k ), msg_type(axis,dir),
     >           mpi_comm_world, ierr)
         endif
      endif
      return
      end
c -- TAKE3
      subroutine take3( axis, dir, u, n1, n2, n3 )
c -- declarations omitted
      call mpi_wait( msg_id( axis, dir, 1 ),status,ierr)
      buff_id = 3 + dir
      indx = 0
c -- THE FOLLOWING MOTIF REPEATS 3 TIMES FOR THE 3 DIMENSIONS
      if( axis .eq. 1 )then
         if( dir .eq. -1 )then
            do i3=2,n3-1
               do i2=2,n2-1
                  indx = indx + 1
                  u(n1,i2,i3) = buff(indx, buff_id )
               enddo
            enddo
         else if( dir .eq. +1 ) then
            do i3=2,n3-1
               do i2=2,n2-1
                  indx = indx + 1
                  u(1,i2,i3) = buff(indx, buff_id )
               enddo
            enddo
         endif
      endif
      return
      end
CAF
The CAF implementation of communication is very similar to that of
F90+MPI with two major exceptions: (1) there is no ready() subroutine
since all CAF communication is one-sided, and (2) CAF's synchronization
primitives are used to ensure that communication has completed before
proceeding:
      subroutine comm3(u,n1,n2,n3,kk)
c -- declarations omitted
      if( .not. dead(kk) )then
         do axis = 1, 3
            if( nprocs .ne. 1) then
               call sync_all()
               call give3( axis, +1, u, n1, n2, n3, kk )
               call give3( axis, -1, u, n1, n2, n3, kk )
               call sync_all()
               call take3( axis, -1, u, n1, n2, n3 )
               call take3( axis, +1, u, n1, n2, n3 )
            else
               call comm1p( axis, u, n1, n2, n3, kk )
            endif
         enddo
      else
         do axis = 1, 3
            call sync_all()
            call sync_all()
         enddo
         call zero3(u,n1,n2,n3)
      endif
      return
      end
      subroutine give3( axis, dir, u, n1, n2, n3, k )
c -- declarations omitted
      buff_id = 2 + dir
      buff_len = 0
c -- THE FOLLOWING MOTIF REPEATS 3 TIMES FOR THE 3 DIMENSIONS
      if( axis .eq. 1 )then
         if( dir .eq. -1 )then
            do i3=2,n3-1
               do i2=2,n2-1
                  buff_len = buff_len + 1
                  buff(buff_len,buff_id ) = u( 2, i2,i3)
               enddo
            enddo
            buff(1:buff_len,buff_id+1)[nbr(axis,dir,k)] =
     >           buff(1:buff_len,buff_id)
         else if( dir .eq. +1 ) then
            do i3=2,n3-1
               do i2=2,n2-1
                  buff_len = buff_len + 1
                  buff(buff_len, buff_id ) = u( n1-1, i2,i3)
               enddo
            enddo
            buff(1:buff_len,buff_id+1)[nbr(axis,dir,k)] =
     >           buff(1:buff_len,buff_id)
         endif
      endif
      return
      end
take3 in CAF:
      subroutine take3( axis, dir, u, n1, n2, n3 )
c -- declarations omitted
      buff_id = 3 + dir
      indx = 0
c -- THE FOLLOWING MOTIF REPEATS 3 TIMES FOR THE 3 DIMENSIONS
      if( axis .eq. 1 )then
         if( dir .eq. -1 )then
            do i3=2,n3-1
               do i2=2,n2-1
                  indx = indx + 1
                  u(n1,i2,i3) = buff(indx, buff_id )
               enddo
            enddo
         else if( dir .eq. +1 ) then
            do i3=2,n3-1
               do i2=2,n2-1
                  indx = indx + 1
                  u(1,i2,i3) = buff(indx, buff_id )
               enddo
            enddo
         endif
      endif
      return
      end
ZPL:
procedure rprj3(var S,R: [,,] double);
begin
  S := 0.5000 * R +
       0.2500 * (R@dir100{} + R@dir010{} + R@dir001{} +
                 R@dirN00{} + R@dir0N0{} + R@dir00N{}) +
       0.1250 * (R@dir110{} + R@dir1N0{} + R@dirN10{} + R@dirNN0{} +
                 R@dir101{} + R@dir10N{} + R@dirN01{} + R@dirN0N{} +
                 R@dir011{} + R@dir01N{} + R@dir0N1{} + R@dir0NN{}) +
       0.0625 * (R@dir111{} + R@dir11N{} + R@dir1N1{} + R@dir1NN{} +
                 R@dirN11{} + R@dirN1N{} + R@dirNN1{} + R@dirNNN{});
  wrap_boundary(S);
end;
In F90+MPI/CAF, the code is almost identical as well, except that the
indexing expressions on r are now multiplied by 2 in order to achieve
the difference in scale.
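The idea of doubling the index on the fine array can be sketched in Python using the weights from the ZPL rprj3 above. This is an illustration under assumptions, not the benchmark's code: the function names are hypothetical, indices are 0-based, coarse point (i, j, k) is assumed to sit on fine point (2i, 2j, 2k), and periodic boundaries are handled with a modulo instead of explicit boundary cells.

```python
from itertools import product

# Weights from rprj3, indexed by how many of the three offsets are nonzero.
WEIGHTS = (0.5, 0.25, 0.125, 0.0625)


def constant_grid(n, value):
    """n x n x n nested list filled with one value."""
    return [[[value] * n for _ in range(n)] for _ in range(n)]


def restrict(r, n):
    """Project a fine n^3 grid onto a coarse (n/2)^3 grid.

    Assumptions: n is even, coarse (i, j, k) aligns with fine
    (2i, 2j, 2k), and the fine grid is periodic.
    """
    m = n // 2
    s = constant_grid(m, 0.0)
    for i, j, k in product(range(m), repeat=3):
        total = 0.0
        for di, dj, dk in product((-1, 0, 1), repeat=3):
            weight = WEIGHTS[abs(di) + abs(dj) + abs(dk)]
            # Fine-grid indices are the coarse indices multiplied by 2.
            total += weight * r[(2 * i + di) % n][(2 * j + dj) % n][(2 * k + dk) % n]
        s[i][j][k] = total
    return s


coarse = restrict(constant_grid(4, 1.0), 4)
print(len(coarse), coarse[0][0][0])
```

For a constant field of 1.0 each coarse value is just the sum of the 27 weights, 0.5 + 6(0.25) + 12(0.125) + 8(0.0625) = 4.0.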
In HPF, however, the code becomes significantly more complex due to the effort required to properly align the different levels such that the load remains balanced and communication is minimized, requiring ~100 additional lines of code.
language   lines   decls       comp        comm
--------   -----   ---------   ---------   ---------
F90+MPI      992   168 (16%)   237 (23%)   587 (59%)
CAF         1150   243 (21%)   238 (20%)   669 (58%)
HPF          433   129 (29%)   304 (70%)     0 ( 0%)
ZPL          192    90 (46%)   102 (53%)     0 ( 0%)
Having spent some time looking at the code, the difference between
expressiveness in the local view and global view has made itself
apparent. In particular, we have seen that the communication code
required by F90+MPI and CAF (only part of which was shown here, and
even that condensed by 1/3) is not only long, but intricate. Anyone
with experience debugging parallel programs knows that getting a set of
processors working together correctly in a single-level code, let alone
a hierarchical multigrid code, can be frustrating and time-consuming.
This motivates the design of higher-level languages like HPF and ZPL,
which take care of those details for you. The question is whether the
language provides the desired expressiveness and if the compiler
generates adequate parallel performance. In HPF, you may need to spend
time profiling and getting compiler feedback, and even then may not
have a code that runs efficiently with a different platform or
compiler. In ZPL, the goal is to provide portable performance by
supplying a syntax-based performance model with which the user can
understand the parallel implementation of their code.
For more information on this work, contact:
For more on the languages, see:
MPI: http://www-unix.mcs.anl.gov/mpi/index.html
CAF: http://www.co-array.org/
HPF: http://www.crpc.rice.edu/HPFF/home.html
ZPL: http://www.cs.washington.edu/research/zpl/
The HPF code excerpted within was developed at NASA Ames and is described in their IPPS `99 paper:
"Implementation of NAS Parallel Benchmarks in High Performance Fortran"
Michael Frumkin, Haoqiang Jin, and Jerry Yan
IPPS `99
Thanks go to NASA Ames for allowing the code to be excerpted in this article.
The project was supported by the ACM's Fortran Forum, and conducted by Niki Reid of The Queen's University of Belfast. The survey is now closed--results and analysis have been published by the Fortran Forum and are available on-line, at:
http://www.cs.qub.ac.uk/~N.Reid
Here are excerpts:
An analysis of Fortran utilisation
N. Reid and J.P. Wray
School of Computer Science
The Queen's University of Belfast
Belfast BT7 1NN
email: niki.reid@acm.org or jp.wray@qub.ac.uk
[ ... ]
2. Languages in Use
A majority of respondents (61%) are still actively coding in Fortran 77, although only a minority (15%) are using it as their primary programming language. The vast majority of users (92%) have upgraded to the more recent dialects of Fortran 90 and Fortran 95 (80% and 51% respectively, where 42% of those respondents who have upgraded using both dialects). More interesting was the fact that a considerable number of respondents (61%) who are coding in Fortran 77 appear to be using the compilers for these later dialects for the compilation of their Fortran 77 code. This is certainly what the designers of the language were intending, by the retention of Fortran 77 as a strict subset.
[ ... ]
Given the general perception of Fortran users refusing to use any language other than Fortran, it came as a surprise to discover that over three-quarters of respondents were, in fact, using Fortran in conjunction with other languages. Principal among these were C(47%), C++(26%) and Visual Basic(19%) along with a conglomerate of other languages. Comments provided indicate that Fortran is being used to provide the 'number crunching' facilities behind programs written in these other languages.
[ ... ]
The Fortran Standard's committee has indicated its intention to remove features (listed in the Fortran Standard [ISO/IEC 1997] as 'Obsolescent Features') for which alternate methods of implementation have been provided. The survey has, however, shown that the users' requirement of conformance with Fortran 77, and their desire to use the compilers of post-Fortran 77 dialects for compilation of Fortran 77 code, will present an obstacle to such a 'cleaning' exercise.
[ ... ]
3.1 Parallelism
Just under a quarter of the respondents were using parallel architecture machines (23%). The breakdown of parallel machine users by the user's machine memory model is as follows:
Shared    Distributed    Both Shared & Distributed
59%       41%            24%
Of the supercomputing machines in use Cray had attracted a most significant slice of the market (94%). Of significant interest, among these groups, was that while the vast majority of distributed users were using MPI (92%), only 17% of parallel users were using HPF. Since the vast majority of HPF users were utilising both parallel architectural models no further characterisation of HPF can be carried out here.
[ ... ]
A: {{ I tried to authenticate using Kerberos/SecurID and got this message:
{{
{{ kerberos skew too great
{{
{{ What does this mean?
Thanks go to Kevin Kennedy for the answer:
The local clock time differs by more than N number of seconds from
the kerberos server. If you reset the clock/time on your local
machine this problem will go away.
Q: I love the T3E, but sometimes I relax with me olde Cray PVP.
How can I estimate my job's memory utilization so I can make an
accurate NQS request (and get my job to start sooner)?
[ Answers, questions, and tips graciously accepted. ]
Contact:
Donald Bahls, ARSC User Consultant, ph: 907-450-8674
Ed Kornkven, ARSC HPC Specialist, ph: 907-450-8669
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020