ARSC HPC Users' Newsletter 273, July 25, 2003
One Day X1 Seminar, Next Tuesday, July 29
John Levesque, a senior analyst at Cray Inc., will be on-site all next week. He'll be giving a one-day class on X1 technology, usage, and code optimization:
UAF Campus, Butrovich Building, Room #109
Tuesday, July 29th
9:00am - 4:00pm
ARSC, UAF, and HPCMP researchers are invited to attend.
Speeding up Your FTP Transfers: Part II
[ Thanks to Nathan Bills, ARSC Network Specialist, for contributing this two-part series. ]
Last time we looked at a simple way of increasing ftp throughput. We saw that when a Unix system's default transmit and receive buffers are small, the TCP transmit and receive windows are small too, so data crosses a wide-area network in unfortunately small chunks. This produces dead time between chunks as the sender waits for acknowledgments back from the receiver.
We used this simple formula,
<window size in bytes>/window * windows/sec = bytes/sec
to determine what throughput rate we could get over a wide-area network, where the windows/sec was calculated with,
1 window / <time it takes, in seconds, to send one window of data>
While this is useful for determining how much information can be transmitted based on how many windows can be sent per second, it does not, as one reader pointed out, take into account the bandwidth limitation of the connection. Thus, with our formula above, we could increase the window size to infinity and get basically:
<infinite window size in bytes>/window * windows/sec =
infinite bytes/sec.
which would be excellent--if we lived in a perfect world. However, we are usually limited by the bandwidth of the connection, such as 100 Mbits/sec for a 10/100Base-T connection, 1 Gbit/sec for Gigabit Ethernet, or 155 Mbits/sec for an OC3 ATM link.
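One way to fold the bandwidth limit into the formula is to cap the window-limited rate at the link rate. The following is only a sketch, assuming a POSIX shell with 64-bit integer arithmetic. The function name est_rate and its argument units (window in bytes, time per window in milliseconds, link speed in bits/sec) are illustrative choices, not anything provided by ftp:

```sh
# est_rate WINDOW_BYTES MS_PER_WINDOW LINK_BITS_PER_SEC
# Prints an estimated throughput in bytes/sec: one window per
# "time to send a window", but never more than the link can carry.
est_rate() {
    window_rate=$(( $1 * 1000 / $2 ))   # bytes/sec if only the window limits us
    link_rate=$(( $3 / 8 ))             # bytes/sec if only the link limits us
    if [ "$window_rate" -lt "$link_rate" ]; then
        echo "$window_rate"
    else
        echo "$link_rate"
    fi
}
```

For instance, "est_rate 65536 100 100000000" (a 64-Kbyte window, 100 milliseconds per window, a 100 Mbits/sec link) prints 655360, while a 100000000-byte window on the same link is capped at 12500000.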
How does one include this limitation? As always, our goal is to send as much data as we can as quickly as we can. If we have a 100 Mbits/sec connection, we want to be able to send at that rate, right? We could try to just shove data across the network at the full 100 Mbits/sec, but the link may not be reliable and we might lose some of it. Many applications, like ftp, www, and email, use TCP to reasonably assure delivery of the data. With TCP, after the sender has transmitted a window of data, it waits for the acknowledgment of the first part of that data before sliding its 'send window' forward to send more. As we noted last time, the time it takes for data to be sent and the acknowledgment to come back is the round-trip-time.
We would like to keep the pipe full all the time, but if our TCP window is too small and this round-trip-time is large, then there will be a gap in transmission while the sender waits for the acknowledgment to come back. If we are able to send at the highest data rate the whole time it takes for the initial data to get to the receiver and an acknowledgment to come back, we would get,
bandwidth * round-trip-time =
amount of data that can be sent in that round-trip-time
which would keep the pipe full because the sender gets the acknowledgment back from the receiver just at the time it has reached the end of its send window and moves the window forward to send more data.
For example, if it took five seconds for the acknowledgment to come back from the receiver to the sender, and the connection is a 100 Mbits/sec connection, we would be able to send,
100 Mbits/sec * 5 seconds = 500 Mbits or 62.5 Mbytes
of data during that time. Note that this is the 'window' of information that can be sent during the five seconds of delay between the first data sent and the first acknowledgment received, and it is what we would try to set for our window or buffer size on the system. This result is called the bandwidth*delay product (pronounced 'bandwidth-delay product' rather than 'bandwidth-times-delay product' :) ).
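The same arithmetic can be scripted. This is a minimal sketch, assuming a POSIX shell with 64-bit integer arithmetic; the name bdp_bytes and the choice of milliseconds for the round-trip-time are made up for illustration:

```sh
# bdp_bytes LINK_BITS_PER_SEC RTT_MS
# Prints the bandwidth*delay product in bytes:
#   bits/sec * (ms / 1000) / 8 bits-per-byte
bdp_bytes() {
    echo $(( $1 * $2 / 8000 ))
}
```

Running "bdp_bytes 100000000 5000" prints 62500000, the 62.5 Mbytes of the five-second example. The result is in decimal bytes; counting a Mbyte as 1,048,576 bytes instead gives the slightly larger 65,536,000.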
In kerberos ftp, this would mean we would set the buffer sizes to 62.5 Mbytes:
ftp> lbufsize 65536000
ftp> rbufsize 65536000
Five seconds is quite a large delay and it is more common to see a delay between 50-200 milliseconds on the Internet. If we have a 100 Mbit/sec connection and the round trip time is 100 milliseconds, or 0.1 seconds, our bandwidth*delay product would be:
100 Mbits/sec * 0.1 seconds = 10 Mbits or 1.25 Mbytes
and we would set our window size in ftp accordingly:
ftp> lbufsize 1310720
ftp> rbufsize 1310720
to try to send data at the full 100 Mbits/sec, or 12.5 Mbytes/sec. If we could keep that rate going we could transfer a 100-Gbyte file in,
100 Gbytes / (12.5 Mbytes/sec) = 8192 seconds or 2.28 hours.
Not bad, eh? Note that our selection of a 1-Mbyte window the last time was close to this size.
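The transfer-time estimate can be checked the same way. This is a rough sketch, assuming a POSIX shell with 64-bit integer arithmetic; the name xfer_seconds is hypothetical, and both arguments are plain byte counts so that the choice of decimal or binary units is left to the caller:

```sh
# xfer_seconds FILE_BYTES RATE_BYTES_PER_SEC
# Prints the transfer time in whole seconds, rounded up,
# assuming the rate is sustained for the entire transfer.
xfer_seconds() {
    echo $(( ($1 + $2 - 1) / $2 ))
}
```

With binary units (107374182400 bytes at 13107200 bytes/sec) this prints 8192, the figure above. With decimal units (100000000000 bytes at 12500000 bytes/sec) it prints 8000. Either way the answer is a bit over two hours.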
This covers the simple aspects of sending data at or near the full data rate. There are still a lot of other things that could affect your transfer rates, such as the communications links between you and the remote end, the effect of data transmission errors on your data rates, system resource issues at either end, the effect of other people's transfers on yours, etc. But this is a good start at speeding up those ftp transfers.
For further information about increasing performance of data transfers, check out these URLs:
http://sd.wareonearth.com/woe/Briefings/tcptune/tsld001.htm
http://www.networkcomputing.com/1013/1013ws1.html
http://dast.nlanr.net/Projects/FTP.html
http://dast.nlanr.net/Guides/GettingStarted/TCP_window_size.html
http://moat.nlanr.net/NATimes/NAT.1.2/phil.htm
http://www.psc.edu/networking/perf_tune.html
ARSC Advanced Display Environments Workshop
As staff and researchers gain experience with ARSC's new four-walled immersive environment, the Discovery Lab, we continue looking to the future of visualization as an aid for analysis and expression of computational results.
To this end, ARSC is sponsoring an "Advanced Display Environments Workshop," here at UAF. The schedule (still subject to minor changes) is posted below. Sessions are open to UAF, ARSC, and HPCMP researchers. Please contact Jon Genetti (ffjdg@uaf.edu) in advance if you are interested in attending.
--
Advanced Display Environments Workshop Aug 6-8, 2003
Wednesday, Aug 6
109 Butrovich
8:15 - 8:45 Coffee
8:45 - 9:00 Welcome
9:00 - 10:00 Chandrajit Bajaj, UT-Austin, Curved Powerwall
10:00 - 10:15 Coffee Break
10:15 - 11:15 John Clyne, NCAR, Stereo and Collaboration
11:15 - 12:15 John Moreland, SDSC, High-density Tiled Display
12:15 - 1:30 Catered Lunch
1:30 - 2:30 Randy Frank, LLNL, Tera-scale Data on Tiled Displays
2:30 - 3:30 Claudio Silva, OHSU, Massive Polygonal Rendering
3:30 - 3:45 Coffee Break
3:45 - 4:45 Sam Uselton, Consultant, The Future Office
5:30 - 7:30 Group dinner at The Pump House
Thursday, Aug 7
Discovery Lab, 375C Rasmusson Library
8:15 - 8:45 Coffee
8:45 - 9:30 Christoph Sensen, U Calgary, A Java 3D-Enabled CAVE
9:30 - 10:15 Greg Johnson, TACC, Depth Perception in Immersive Environments
10:15 - 10:30 Coffee Break
10:30 - 11:15 Eric Wernert, U Indiana, Display Needs for Diverse Applications
11:15 - 12:00 Craig Stewart, U Indiana, Bio Computation and Storage
12:00 - 1:30 Lunch at Pike's
GI Globe Room, Elvey
1:30 - 2:00 Panel - Projector/Display Technologies To Watch
(Genetti, Johnson, Moreland, Uselton)
2:00 - 2:30 Panel - How important is pixel density? How many pixels
are enough?
(Frank, Johnson, Moreland, Wernert)
2:30 - 3:00 Panel - How important is stereo/immersion?
(Clyne, Johnson, Sensen, Wernert)
3:00 - 3:15 Coffee Break
3:15 - 4:00 Panel - Flat vs. curved vs. cave vs. ???
(Bajaj, Clyne, Moreland, Uselton)
4:00 - 4:30 Panel - Image generators and data handling requirements
(Bajaj, Clyne, Frank, Silva)
4:30 - 5:00 Panel - Five years from today ...
(Bajaj, Frank, Silva, Uselton)
5:30 - 7:00 No host dinner at Alaska Salmon Bake
Friday, Aug 8
204 Butrovich
8:30 - 9:00 Coffee
9:00 - 12:00 Working group develops recommendations and white paper
12:00 - 1:00 Lunch / Presentation of recommendations
Room TBA
8:30 - 12:00 Biomedical Birds-of-feather meeting
--
As a reminder, ARSC's regular summer tours are held in the Discovery Lab every Wednesday at 1pm, through August. For more info on all summer tours at UAF, see:
http://www.uaf.edu/univrel/Tour/tours.html
For more on the Discovery Lab:
Quick-Tip Q & A
A:[[ There are commands I'd like to issue to ftp, whenever I use
[[ it. For instance, "idle 7200" and the lbufsize/rbufsize settings.
[[ Can I do this automatically, without typing them at the ftp prompt
[[ every single time?
#
# First, Rich Griswold contributes this Newsletter's first expect script:
#
You can use an expect script to do this:
#!/usr/local/bin/expect -f
spawn ftp $argv
expect {
    "Name" {
        expect_user -re "(.*)\n"
        send "$expect_out(1,string)\r"
        exp_continue
    } "Password:" {
        stty -echo
        expect_user -re "(.*)\n"
        send "$expect_out(1,string)\r"
        stty echo
        exp_continue
    } "ftp>" {
        send "idle 7200\r"
        # Add other commands here...
    }
}
interact
#
# Thanks to Jeff McAllister:
#
The kftp utility can accept a Unix input stream, e.g.,
"kftp < kftp.script"
Thus, in many cases, you can completely eliminate interactive FTP
sessions. The script file can contain any commands that you would
type in, separated by newlines ("\n"). You should include the FTP
command, "prompt," to toggle prompting off.
To transfer the files 'test.out.*' to $ARCHIVE_HOST you could create a
wrapper script to manage the entire process. In this example,
"batch_ftp.ksh" does the following:
1) creates a temporary file, "kftp.script," containing the commands
kftp will execute as it reads them from the input stream
2) executes kftp, taking commands from "kftp.script"
3) removes "kftp.script"
File: "batch_ftp.ksh":
----------------------
#!/bin/ksh
echo "open $ARCHIVE_HOST" > kftp.script
echo "" >> kftp.script
echo "lbufsize 1000000" >> kftp.script
echo "rbufsize 1000000" >> kftp.script
echo "binary" >> kftp.script
echo "prompt" >> kftp.script
echo "cd $ARCHIVE" >> kftp.script
echo "mput test.out.*" >> kftp.script
echo "quit" >> kftp.script
kftp < kftp.script > /dev/null 2>&1
rm kftp.script
----------------------
This example assumes that the environment variables exist.
$ARCHIVE_HOST is the name of the remote host. $ARCHIVE is the path to
the destination directory on the remote host.
($ARCHIVE and $ARCHIVE_HOST are part of a common set of environment
variables, now available on all ARSC systems, designed to standardize
our storage environment. We recommend using these environment
variables to "hide" the details of the specific storage setup on each
machine. Thus scripts can be moved between machines with less work.)
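If you would rather not create and remove a temporary file, the same commands can be written to standard output and piped into kftp. This is only a sketch: the function name kftp_commands is made up, and it assumes kftp reads commands from a pipe the same way it reads them from a redirected file.

```sh
# kftp_commands HOST DIR PATTERN
# Prints, one per line, the commands kftp should execute.
kftp_commands() {
    # The empty line after "open" mirrors the echo "" in batch_ftp.ksh.
    cat <<EOF
open $1

lbufsize 1000000
rbufsize 1000000
binary
prompt
cd $2
mput $3
quit
EOF
}
```

A possible invocation, with the pattern quoted so the local shell does not expand it:

    kftp_commands "$ARCHIVE_HOST" "$ARCHIVE" 'test.out.*' | kftp > /dev/null 2>&1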
#
# And finally, the editor's solution...
#
FTP's built-in auto-login process will do the trick.
In your $HOME/.netrc file, specify each machine you ftp with, your
login, but NO passwords (passwords should never be stored in files, on
sticky notes, etc., for obvious reasons). Then, for each machine
separately, use "macdef" to define an "init" macro--and other macros,
if you wish.
For example:
CHILKOOT$
CHILKOOT$ cat $HOME/.netrc
machine rimegate.arsc.edu login arscfrb
macdef init
idle 7200
rbufsize 1000000
lbufsize 1000000

macdef cdt
cd /scratch/arscfrb
pwd
ls

machine klondike.arsc.edu login fred
macdef init
idle 7200
rbufsize 1000000
lbufsize 1000000

macdef cdt
cd /tmp/fred
pwd
ls

CHILKOOT$
CHILKOOT$
# Here's a test. Note that "init" is executed automatically, and that
# other FTP macros are executed with the command "$":
CHILKOOT$ ftp klondike
Connected to klondike.arsc.edu.
220 klondike FTP server (Version 5.60) ready.
334 Using authentication type GSSAPI; ADAT must follow
GSSAPI accepted as authentication type
GSSAPI authentication succeeded
232 GSSAPI user fred@ARSC.EDU is authorized as fred
idle 7200
200 Maximum IDLE time set to 7200 seconds
rbufsize 1000000
200 TCP buffer size set to 1000000 bytes
lbufsize 1000000
Set local TCP buffer size to 1000000 bytes
Remote system type is UNIX.
Using binary mode to transfer files.
ftp>
ftp>
ftp> $
(macro name) cdt
cd /tmp/fred
250 CWD command successful.
pwd
257 "/tmp/fred" is current directory.
ls
200 PORT command successful.
150 Opening ASCII mode data connection for /bin/ls.
total 82512
drwx------ 16 staff 4096 Jul 21 10:50 Progs
drwx------ 2 staff 32 Jul 17 11:19 Scripts
226 Transfer complete.
ftp>
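If you keep entries for many machines, a small shell function can write each .netrc stanza for you. This is a sketch only, and the name netrc_entry and its arguments are hypothetical. It relies on the usual .netrc convention that a macro definition ends at the first blank line, so each stanza it prints finishes with one.

```sh
# netrc_entry HOST LOGIN BUFSIZE
# Prints a .netrc stanza whose "init" macro sets the idle timeout
# and both TCP buffer sizes. No password is written.
netrc_entry() {
    printf 'machine %s login %s\n' "$1" "$2"
    printf 'macdef init\n'
    printf 'idle 7200\n'
    printf 'rbufsize %s\n' "$3"
    printf 'lbufsize %s\n' "$3"
    printf '\n'    # blank line ends the macro definition
}
```

For example, "netrc_entry klondike.arsc.edu fred 1000000 >> $HOME/.netrc" would append an entry equivalent to the init macro shown above for klondike.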
Q: Any "vi" experts out there? I'm editing a text file, each line starts
with a version number followed by a space and then a word. E.g.,
...
33 jade
33.8.2 jasper
10 javelin
7.1 javelina
22 juniper
...
Can I move the version numbers to the ends of the lines? Like this:
...
jade 33
jasper 33.8.2
javelin 10
javelina 7.1
juniper 22
...
Thought I was getting good at vi regexp's, but this is a stumper! If
it's impossible in vi, maybe there's another way.
[[ Answers, Questions, and Tips Graciously Accepted ]]
Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail Subscriptions:
Subscribe to (or unsubscribe from) the e-mail edition of the
ARSC HPC Users' Newsletter.
Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
