ARSC HPC Users' Newsletter 369, September 07, 2007

Newsletter Index Quick-Tip Index Search Newsletters


Multicore and New Processing Technologies Review - Part II

[[ Editor's Note:
[[
[[ Part I of this series gave an overview of the ARSC Multicore and New
[[ Processing Technologies Symposium, which was held August 13th and 
[[ 14th.  In Part II, we present the relatively subjective observations 
[[ and recommendations of the attendees.
[[
[[ Part I can be found here:
[[  http://www.arsc.edu/support/news/HPCnews/HPCnews368.shtml#article3
[[


[ by Greg Newby ]

Symposium attendees agreed that sub-linear performance increases will only worsen as the number of cores continues to grow. Caching, memory access, and shared resources such as the system bus and interconnect will not scale as easily as the number of processor cores. This will have major implications for HPC users, who will often need a detailed understanding of the processor layout and other system characteristics in order to get maximum performance. Most HPC developers have experienced this before, with technologies such as vector processors and during transitional periods such as the move from 32-bit to 64-bit processors.

Today, most HPC applications use MPI to parallelize. Symposium attendees recognized the need for new types of processors to continue to support MPI. At the same time, they see better prospects for programming next-generation systems emerging. The NSF's recent PetaApps grant program, for example, specifies the need for new programming paradigms and languages for using tens or hundreds of thousands of processors for a single job. Several languages are under development, including Unified Parallel C (UPC), Chapel (from Cray), Co-array Fortran, and X10 (from IBM). These allow the programmer to consider a large-scale computational cluster as a more unified system, with shared memory.

Attendees made the following major recommendations, directed at the computing industry, HPC users, and agencies such as the Department of Defense, National Science Foundation, and Department of Energy:

Additional recommendations include:

 

Subversion Usage at ARSC

  [ by Anton Kulchitsky ]

This article will describe:

  1. Why Subversion should be used
  2. Working with Subversion
  3. Setting up a Subversion repository for yourself
  4. Setting up a Subversion repository for an ARSC project

1. Why use it?

Epigraph:
"[Use it] Always. Even if you are a single-person team on a one-week project. Even if it's a "throw-away" prototype. Even if the stuff you're working on isn't source code. Make sure that everything is under the source code control - documentation, phone number lists, memos to vendors, makefiles, build and release procedures, that little shell script that burns the CD master - everything. ... Even if we're not working on a project, our day-to-day work is secured in a repository." ("The Pragmatic Programmer" by Andrew Hunt and David Thomas, Addison-Wesley 1999)

There are many issues that Subversion intends to address.

Work kept under version control is more reliable, easier to back up, safer to use, and harder to destroy. Version control also improves self-organization and helps developers communicate and discuss issues. It can prevent conflicting access to the same document, and it can be integrated into an IDE or editor.

See HPC Users' Newsletter issue 278 Quick-Tip for more information: http://www.arsc.edu/support/news/HPCnews/HPCnews358.shtml#article3.

2. Working with Subversion

For details on using Subversion, please see the excellent documentation available online at:

   http://svnbook.red-bean.com/

To understand how to work with Subversion, it is most important to understand some basic principles.

Subversion cares about permissions, so you may run into problems if the permissions on the repository were not set up properly (see below).

Changes you make to your working copy are not visible in any other working copy (yours or anyone else's) until you successfully commit them to the repository. You and others can work simultaneously on the same code in different working copies, and may even change the same file. Usually such changes do not conflict, and Subversion merges them automatically. If your changes do conflict with those in the repository, Subversion will not let you commit until you resolve the conflict.

In some cases you may want to "lock" a file you are working on, so that Subversion will not allow anyone else to commit a modification to that file until you "unlock" it. Reserve locking for binary files, such as images, that you are changing, because conflicts in them can be very hard to resolve.

Here are some basic commands you need to know when using Subversion:
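As a minimal sketch of the everyday cycle, the script below runs the standard subcommands (checkout, add, status, commit, update, log) against a throw-away repository so that nothing real is touched. The temporary paths and the file name notes.txt are invented for illustration, and the demo is skipped if svn is not installed.

```shell
#!/bin/bash
# Sketch of the everyday Subversion cycle in a throw-away repository.
# The temporary paths and notes.txt are illustrative, not ARSC conventions.

# The function body is a subshell, so the cd below does not leak to the caller.
svn_demo() (
    tmp=$(mktemp -d) || exit 1
    svnadmin create "$tmp/repo" || exit 1                    # empty repository
    svn checkout -q "file://$tmp/repo" "$tmp/wc" || exit 1   # working copy
    cd "$tmp/wc" || exit 1
    echo "first draft" > notes.txt
    svn add -q notes.txt                 # schedule the new file for addition
    svn status                           # shows notes.txt flagged A (added)
    svn commit -q -m "Add notes.txt"     # publish the change to the repository
    svn update -q                        # bring the working copy up to date
    svn log -q                           # list the committed revisions
)

if command -v svn >/dev/null 2>&1 && command -v svnadmin >/dev/null 2>&1; then
    svn_demo
else
    echo "svn not found; skipping demo"
fi
```

In real use the checkout URL would point at your own repository rather than a temporary one, and "svn update" would be run before starting work to pick up other people's commits.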

The structure of your repository can be as complex as you like. However, there are some standard Subversion conventions you should follow. Every sub-project in your repository should have three top-level subdirectories: trunk, branches, and tags. Trunk is the main line of development. Branches are extra copies of your sub-project, usually holding big new features or big modifications to the main line. Tags usually mark major releases. Subversion does not store every copy in full; it stores and manages the changes between copies. Thus, creating a new branch or tag does not actually duplicate the code.

Suppose your project OUR_PROJECT has a website, a model, utilities, and scripts associated with it. In this case your repository structure can be the following:

 repository/website
 repository/website/trunk
 repository/website/branches
 repository/website/tags

 repository/utilities
 repository/utilities/utility1
 repository/utilities/utility1/trunk
 repository/utilities/utility1/branches
 .... 
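A layout like the one above could be prepared in a scratch directory before it is imported. The following is only a sketch: the helper name make_layout and the sub-project names passed to it are hypothetical, chosen to match the OUR_PROJECT example.

```shell
#!/bin/bash
# Build the conventional trunk/branches/tags skeleton for each sub-project.
# make_layout and the sub-project names are illustrative assumptions.

make_layout() {
    root=$1
    shift
    for sub in "$@"; do
        mkdir -p "$root/$sub/trunk" "$root/$sub/branches" "$root/$sub/tags" || return 1
    done
}

layout=$(mktemp -d)
make_layout "$layout" website model utilities/utility1 scripts

# Show the resulting directory tree.
find "$layout" -mindepth 1 -type d | sort
```

The skeleton could then be loaded into a repository in one step with "svn import", giving the scratch directory, the repository URL, and a log message via -m.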

It is very easy to start working with Subversion and get many important benefits right away. For even more benefits, it is worth reading the SVN Book and starting to use the more advanced features of Subversion. Consider branching and propset as the next topics once you are familiar with the basic commands.

3. Setting up Subversion for yourself

Epigraph: "I always work in a team of at least two - myself now, and myself later. This other guy is often the hardest to work with - I can't convince him that he doesn't know I'm talking about." (Jaime Metcher, 20/07/07 01:15, somewhere on the Internet)

The way to set up Subversion only for your personal work at ARSC is to use the $ARCHIVE directory. It is highly reliable storage with backups. Thus, it is probably the best place to keep your repository, isn't it? To set it up:

4. Setting up Subversion for an ARSC project

Suppose you need a repository for your project. In this case you should do the following.

  • At the request of the project PI, ARSC can create a project directory under /projects; for example, for the project "MY_PROJECT," the directory would be /projects/MY_PROJECT. Its owner will be the project PI, and its group will be the project group.

    Only members of the project group will be able to work with the repository.

    For a small project, you also may create a repository in your $ARCHIVE directory and follow the steps below.

    You need to run the following commands on a machine where the /projects/MY_PROJECT directory is mounted locally.
  • Create a repository

       svnadmin create /projects/MY_PROJECT/svnrepo --fs-type fsfs
  • Change the group of your repository to your common group (YOUR_COMMON_GROUP in the example below), which need not be your primary group.

       chgrp -R YOUR_COMMON_GROUP /projects/MY_PROJECT/svnrepo
  • Change the permissions of the repository so your group has the same permissions as you:
        # Copy the owner's permission bits to the group: 400 -> 440, 600 -> 660, 700 -> 770.
        for perm in 4 6 7
        do
            find /projects/MY_PROJECT/svnrepo -perm ${perm}00 \
                 -exec chmod ${perm}${perm}0 {} \;
        done
    
  • Set a special bit for all directories so all new subdirectories and files will automatically belong to the appropriate group:

       find /projects/MY_PROJECT/svnrepo -type d -exec chmod g+s {} \;

    ** WARNING: It's against ARSC's security policy to set the SGID bit
    ** on any file. You can set it on a directory as described here, but
    ** don't set it on anything else.

  • All developers from the group are ready to work with the repository now. Developers can now import files (see above).

    Unlike $ARCHIVE, the project directory might not be available from all systems. In this case, you need to use svn+ssh type URLs to access it.
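To see what the permission steps above do before running them on a real repository, they can be tried on a scratch directory. This is a sketch under assumptions: the function name share_with_group and the scratch file names are made up, and the scratch directory keeps whatever group mktemp gives it, so the chgrp step is left out.

```shell
#!/bin/bash
# Apply the group-permission steps to a scratch directory and inspect the result.
# share_with_group and the file names below are hypothetical.

share_with_group() {
    repo=$1
    # Give the group the same bits as the owner for owner-only modes 400, 600, 700.
    for perm in 4 6 7; do
        find "$repo" -perm "${perm}00" -exec chmod "${perm}${perm}0" {} \;
    done
    # Set the SGID bit on directories only, so new entries inherit the group.
    find "$repo" -type d -exec chmod g+s {} \;
}

scratch=$(mktemp -d)
mkdir "$scratch/db"
touch "$scratch/format" "$scratch/db/current"
chmod 700 "$scratch" "$scratch/db"
chmod 400 "$scratch/format"
chmod 600 "$scratch/db/current"

share_with_group "$scratch"
ls -lR "$scratch"
```

In the listing, files that were readable or writable only by the owner should now show matching group permissions, and directories should show an "s" in the group execute position.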

     

    Introduction to strace, Part II

      [ by Craig Stephenson ]
    

    In Part I of Introduction to strace, I used several examples to show how strace can help locate some of the more elusive problems you might encounter during program execution.

       http://www.arsc.edu/support/news/HPCnews/HPCnews361.shtml#article1

    All of the examples used in Part I dealt with diagnosing problems in the execution of a serial code. Chances are this tool would be just as useful, if not more so, for diagnosing problems in the execution of parallel code. This article illustrates how to use strace with MPI, as well as how to use strace to identify bottlenecks in a code.

    To many of us, it is a bit of a mystery what happens to our jobs once they are submitted to the internal nodes of a cluster. Suppose you submit a job to run an executable that has already worked several times in the past. The executable has not changed, but for one reason or another, your new job ends prematurely and produces no useful output. What can you do? You can start by trying to piece together the implications of the vague, often ambiguous, output for the job. Unfortunately, this may not be sufficient to diagnose the problem.

    You already know that the problem does not originate in your code because your code has already proved successful during previous runs. But don't despair... you have just stumbled upon an ideal use for strace!

    strace can be used to monitor MPI applications by means of an intermediary script, such as the following shell script:

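       #!/bin/bash
       # mpi_strace: run one MPI rank under strace, writing one trace file per rank.
    
       # Name the trace file after the node and the rank set by the MPI launcher.
       unset STRACE_SUFFIX
       STRACE_SUFFIX=$HOSTNAME.$MPIRUN_RANK
    
       # -tt timestamps each line; strace writes its trace to stderr, captured by 2>.
       exec strace -tt "$@" 2> "strace.${STRACE_SUFFIX}"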
    

    In this article, this script is referred to as mpi_strace. To use mpi_strace, include it on the mpirun line of your PBS script. For example:

       mpirun -np 4 ./mpi_strace ./a.out
    

    The script distinguishes each rank of an MPI job by referencing the $MPIRUN_RANK environment variable, set by the MPI environment. Depending on which MPI stack you are using there might be a different environment variable set (or none at all). MVAPICH (used on midnight) and MPICH (used on nelchina) both use the $MPIRUN_RANK environment variable for each task.
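For MPI stacks that export the rank under a different name, the suffix could be chosen from whichever variable is present. The sketch below is an assumption, not part of mpi_strace: OMPI_COMM_WORLD_RANK (Open MPI) and PMI_RANK (some MPICH-derived launchers) are examples only, so check what your own launcher sets. The function name rank_suffix and the process-ID fallback are likewise hypothetical.

```shell
#!/bin/bash
# Pick a per-rank file suffix from whichever rank variable the launcher exported.
# The list of variables is an assumption; verify it against your MPI stack.

rank_suffix() {
    rank=${MPIRUN_RANK:-${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-}}}
    if [ -n "$rank" ]; then
        echo "$(uname -n).$rank"
    else
        # No known rank variable: fall back to the process ID so names stay unique.
        echo "$(uname -n).pid$$"
    fi
}

echo "trace file would be: strace.$(rank_suffix)"
```

Inside mpi_strace, the STRACE_SUFFIX assignment could then call rank_suffix instead of reading $MPIRUN_RANK directly.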

    Each MPI task in the job has a unique rank identifier associated with it. The script uses these rank identifiers to produce a separate strace output file for each rank of the MPI job, named after the rank that produced it. The mpirun example above, which starts four tasks, might produce the following files (depending on the node it runs on):

      strace.mt003.0
      strace.mt003.1
      strace.mt003.2
      strace.mt003.3
    

    In the script, "$@" holds the original command passed to the rank, including all arguments. This allows the original command issued by mpirun to pass through mpi_strace unchanged. Thus, mpi_strace is essentially transparent to mpirun. strace output is written to stderr, so 2> attempts to redirect strace output to a file while leaving the original command's output, written to stdout, unchanged. One unfortunate consequence of this approach is that if the original command writes anything to stderr, it will get written to the strace output file, not to the PBS output file.
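One possible way around the stderr mixing described above is to let strace write the trace file itself with its -o option, which sends the trace to the named file instead of stderr. The variant below is a sketch rather than the script used in this article; the helper name trace_file and the "norank" placeholder are made up.

```shell
#!/bin/bash
# mpi_strace variant: strace writes the trace itself with -o, so the traced
# program's stderr is not redirected and still reaches the job's error output.

# Trace file name; "norank" is a placeholder for runs outside of MPI.
trace_file() {
    echo "strace.${HOSTNAME:-$(uname -n)}.${MPIRUN_RANK:-norank}"
}

# Only start tracing when a command was actually passed in.
if [ "$#" -gt 0 ]; then
    exec strace -tt -o "$(trace_file)" "$@"
fi
```

It would be used on the mpirun line exactly like the original script, for example "mpirun -np 4 ./mpi_strace ./a.out".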

    Note that the node host names are included in the strace output file names. If your job spans multiple nodes, such as two four-core nodes, your output files might be named:

      strace.mt003.0
      strace.mt003.1
      strace.mt003.2
      strace.mt003.3
      strace.mt004.4
      strace.mt004.5
      strace.mt004.6
      strace.mt004.7
    

    This is quite beneficial, as the MPI errors that accompany mysterious job behavior are often as cryptic as this:

      Some rank on 'mt003' exited without finalize.
    

    In this case, your investigation should most certainly start by examining the last few lines of each of the strace.mt003.* files. I recommend using the tail command to print the last 20 lines of each strace file for the node:

      tail -n 20 strace.mt003.*
    

    Also included in the above mpi_strace script is an option supplied to the strace command itself. The -tt option prefixes each line of strace output with the time of day, including microseconds. This can be useful to pinpoint oddities in your code's execution.

    Additionally, if you would like to know how much time is spent in each kernel interaction, you can use the -T strace option, which appends the total amount of time spent in the system call for each line. However, if you are concerned about system bottlenecks as they apply to your code, the -c strace option may be even more useful. The -c option generates a summary of system calls, with timing information, after the execution of the original command has completed.

    For example, the output of the -c strace option may look similar to this:

    % time     seconds  usecs/call     calls    errors syscall
    ------ ----------- ----------- --------- --------- ----------------
     61.40    0.536680        6793        79           write
     25.11    0.219472        1814       121           read
     12.38    0.108171         778       139       107 open
      0.50    0.004341          96        45           ioctl
      0.16    0.001407         469         3           mlock
      0.15    0.001282          85        15           munmap
    ...
    ...
    ------ ----------- ----------- --------- --------- ----------------
    100.00    0.874034                   700       126 total
    

    This summary output was generated by running the same dataProcessor program used for Part I of this article. In this particular example, I used dataProcessor to process a 617 MB file to and from $WRKDIR on midnight. Since strace tracks only the interactions between a process and the kernel, all time percentages are relative to the total time spent in system calls as tracked by strace, not accounting for CPU time. Another downside of the -c option is that it replaces all of the typical strace output with this short summary histogram.
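When a job produces many such summaries, one per rank for instance, it can help to pull out only the dominant system calls. The following sketch filters a -c summary with awk; the function name top_syscalls and the 10 percent default threshold are arbitrary choices, and the input is the sample table shown above.

```shell
#!/bin/bash
# List the system calls that account for at least a given percentage of the
# system-call time in an "strace -c" summary read from standard input.

top_syscalls() {
    awk -v min="${1:-10}" '
        # Data rows start with a percentage such as 61.40; this skips the
        # header, the dashed rules, the "..." lines, and the total row.
        $1 ~ /^[0-9]+\.[0-9]+$/ && $NF != "total" && $1 + 0 >= min {
            printf "%-10s %6.2f%%  %s s\n", $NF, $1, $2
        }'
}

top_syscalls 10 <<'EOF'
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 61.40    0.536680        6793        79           write
 25.11    0.219472        1814       121           read
 12.38    0.108171         778       139       107 open
  0.50    0.004341          96        45           ioctl
  0.16    0.001407         469         3           mlock
  0.15    0.001282          85        15           munmap
...
...
------ ----------- ----------- --------- --------- ----------------
100.00    0.874034                   700       126 total
EOF
```

For the sample table this keeps write, read, and open and drops the rest. The syscall name is taken from the last field because the errors column is empty for calls that never failed.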

    In all honesty, strace is my last resort tool. When I have run out of ideas, whether I think it will help or not, I run strace. This usually proves fruitful. If strace does not give me the outright solution to the problem, it helps me brainstorm. Suspicious strace output is also incredibly valuable information to include in help requests to User Support.

     

    Fall 2007: ARSC User Training

    This fall, ARSC will once again be providing training in conjunction with Physics 693, Core Skills for Computational Science. The Core Skills class is taught jointly by ARSC and the UAF Physics department and provides an introduction to the basic skills needed to work in a high performance computing environment. Individual lectures are open to ARSC users and will be held on Tuesdays and Thursdays in West Ridge Research Building room 009, except where noted.

    Lectures Include:

     September 11   Introduction to Unix
     September 13   Introduction to Linux Systems
     September 18   Data Management / Unix Scripting
     September 20   Viz 1: Integrated Data Viewer, Part 1
     September 25   Introduction to Fortran Part 1
     September 27   Introduction to Fortran Part 2
     October 2      IBM Introduction, Compilers and Makefiles
     October 4      IBM Loadleveler
     October 9      Viz 2: Integrated Data Viewer, Part 2
     October 11     Midnight SUN Introduction, Compilers & PBS (WRRB 010)
     October 16     Performance Programming, Part 1    
     October 18     Performance Programming, Part 2
     October 23     Viz 3: Importing Data and Graphics Formats
     October 25     Viz 4: Animation 101
     

    For a complete schedule see:

       http://people.arsc.edu/~cskills/schedule.shtml

     

    What Like a Pirate?

    September 19th is "International Talk Like a Pirate Day." The web site does have instructions to aid the land-locked:

       http://www.yarr.org.uk/

    Seems a reasonable excuse ter make a fool a yerself in public this month.

     

    Quick-Tip Q & A

    
    A:[[ Let's help out those university students heading to off-campus
      [[ housing.  What's your tip for impoverished students?
    
    # 
    # Jim Williams
    # 
    If you have roommates, don't put the phone in your name unless you
    turn off long distance calling....
    
    
    # 
    # Derek Bastille
    # 
    My big tip is:  milk crates.  Yep, milk crates are your best buddy
    as a student.  They can be used for shelves, storage, a TV stand,
    computer desk, bicycle basket and many other things (I've still got
    several from my old student days ;-)
    
    ... ahh milk crates ...
    
    
    # 
    # Greg Newby
    # 
    Resist the temptation to use a credit card unless you pay it off
    every month.  Since college is a time many people get loans to pay
    for their education, credit cards sometimes appear like another loan
    towards academic and career success.  However, the very high interest
    rate and deluge of credit card applications for students can leave you
    a slave to monthly payments, without any more money in your pocket.
    Perhaps even less money, since so much will go to making payments.
    
    Instead, avoid debt other than needed student loans.  Any sort of loan
    that you have to make payments on while still in school -- especially
    credit cards -- will necessitate additional out-of-pocket expenses
    while in school, to pay them off.
    
    
    # 
    # Rich Griswold
    # 
    I recently attended one of Dave Ramsey's Financial Peace University
    seminars, and I learned a lot about managing money that I wish I had
    known years ago when I was still an impoverished student.  Even if
    you aren't making much money, he has great tips for making the money
    you have work for you.  If you can't attend a seminar, listen to his
    program on the radio or check out one of his books, such as "Financial
    Peace Revisited".
    
    
    #
    # Suzanne Noll
    #
    A HUGE bag of rice!
    
    And another tip: do NOT boil hot dogs in Boone's Farm..... 
    
    
    #
    # Don Morton
    #
    Best bargain in any town - Macaroni & Cheese and Bisquick.  You can
    live for ages off that stuff at very low prices.
    
    Also, y'all probably have rules about printing this sort of thing
    in the ARSC Newsletter, but I should note that in ANY of this stuff,
    you can use beer rather than water for cooking, and it's very good,
    especially bisquick.
    
    
    # 
    # Nathan Bills
    # 
    Top Ramen is still 8 packages for a dollar :)
    
    
    # 
    # Patrick Webb
    # 
    Note that getting some frozen (or fresh) vegetables and throwing
    them into ramen turns an ultra-cheapo and nutritionally dubious meal
    into something much more satisfying and tasty with not much extra
    cost. If you're feeling really adventurous, cook up some chicken
    beforehand and throw that in. Ramen makes an excellent noodle soup base.
    
    
    # 
    # John Chappelow
    # 
    When I first moved to Fairbanks from UVermont (to do my PhD) I had
    the idea I could exist as a grad student without a car (like I could
    "back East"). Bad plan. Get a car. Or a Space-suit.
    
    --
    
    Seriously, the best advice I can give an incoming grad student (here
    or elsewhere) is to (1) take enough loan to survive a year; (2) once
    there, ASK FOR ADVICE from grad students that're already there; (3)
    learn how to raid buffet tables, early and often; (4) do some serious
    research before you rent an apartment or cabin--some places are so
    disgusting (or dangerous) they're just not worth it. You might contact
    current grad students in your department ahead of moving--some of
    them might be willing to "take you in" for a week or two while you
    look around.
    
    
    # 
    # Ed Kornkven
    #
    (1) If you elect to put up a dartboard, be advised that the accuracy
        of the players goes down the farther one has to throw the darts,
        and the number of doorways the dart has to pass through.
    (2) Your share of the damages at the end of the year will depend
        on how many people shared the apartment, not on what percentage
        of the darts you actually threw.
    (3) Most people aren't very good at darts.
    
    
    # 
    # Daniel Pringle
    #
    Add veggies to ramen. Ramen is cheap, but scurvy isn't. 
    
    --
    
    Student loans bite, but using cash rather than plastic helps. Take
    out your cash for the week, and when it's done, it's done.
    
    
    #
    # Editor:
    #
    Crack an egg into "ramen with veggies," and you have egg-drop soup.
    Double the recipe, and you have a meal.
    
    
    # 
    # Martin King
    #
    Some people solved this problem by lateral thinking---they stopped
    being students and later became some of the richest and most successful
    people on Earth.  Think Bill Gates, Steve Jobs, Tiger Woods, Richard
    Branson... In any case, you don't necessarily need a degree in order
    to have a good life.
    
    
    
    Q: i have a whole bunch of files with uppercase letters in the names.
       unfortunately, i do not care for uppercase letters.  could you please
       tell me how to rename the files in an automated fashion such that
       there are no uppercase letters in my filenames?
    
       here are some of the files i have to rename along with the name i 
       would like to use:
       
         EARTH001.out  is renamed to earth001.out
         Wind001.out  is renamed to wind001.out
         wATER001.out  is renamed to water001.out
         fiRE001.out  is renamed to fire001.out 
    
       i have no idea who came up with these filenames, but i do not like
       them!
    
    
    

    [[ Answers, Questions, and Tips Graciously Accepted ]]

     

     


    Current Editors:
    Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
    Craig Stephenson ARSC User Consultant ph: 907-450-8653
    Arctic Region Supercomputing Center
    University of Alaska Fairbanks
    PO Box 756020
    Fairbanks AK 99775-6020
    Contact:
    Send comments and questions to the current editors using this Contact Form.

     


     

    Arctic Region Supercomputing Center
    PO Box 756020, Fairbanks, AK 99775 | voice: 907-474-6935 | email:
