ARSC HPC Users' Newsletter 372, October 19, 2007

Fall Training... Parallel Programming Section, Just Ahead

As previously announced, ARSC is providing user training in conjunction with Physics 693--"Core Skills for Computational Science." Sessions are open to all interested users and prospective users.

The following important series of lectures will commence on October 30th:

30-Oct Parallel Shared Memory Programming, Part 1
 1-Nov Parallel Shared Memory Programming, Part 2
 6-Nov Parallel Shared Memory Programming, Part 3
 8-Nov Parallel Shared Memory Programming (Example)
15-Nov Parallel Distributed Memory Programming, Part 1
20-Nov Parallel Distributed Memory Programming, Part 2
27-Nov Parallel Distributed Memory Programming, Part 3
29-Nov Parallel Distributed Memory Programming, Part 4
 4-Dec Parallel Distributed Memory Programming (Example)

Permission Fix Script

[ by Anton Kulchitsky ]

Suppose you work in a team on the same project. You all need access to a common directory containing your files and data. The directory structure can be pretty complex, like the one we have for the Alaska Smoke Distribution project. You all need to make changes in the directory. Can you manage the files and configure the user accounts such that:

  1. Every user from the group can change and add files to the shared directory;
  2. The configuration doesn't affect files in any other directories the user has access to;
  3. Safe practice and all system security policies are followed.

Actually, working in a team can be a permissions nightmare. It is evident that everybody in a team should be a member of one group. Let us name it OUR_TEAM_GROUP.

We can't expect that this group is primary for every group member. Thus, new files or directories will belong to users' primary groups, not to OUR_TEAM_GROUP.

It seems that this particular problem can be easily fixed. Indeed, let us change all group ownership for every file and directory to OUR_TEAM_GROUP and set SGID for every subdirectory.

The SGID bit for a directory DIR can be set by this command:

chmod g+s DIR

If it is set, every new file or directory created in DIR will belong to the same group as DIR. (Be careful that you only set the SGID bit on directories, not files! Otherwise, it's a security violation.)
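
For readers who prefer to do this from Python, here is a minimal sketch of the same operation. The function names set_sgid and has_sgid are hypothetical helpers for illustration only; the sketch relies solely on the standard os and stat modules and refuses regular files, in line with the warning above.

```python
import os
import stat


def set_sgid(directory):
    """Set the SGID bit on a directory, leaving its other mode bits alone."""
    st = os.stat(directory)
    if not stat.S_ISDIR(st.st_mode):
        # SGID on regular files is to be avoided, so refuse anything else.
        raise ValueError(f"{directory!r} is not a directory")
    os.chmod(directory, stat.S_IMODE(st.st_mode) | stat.S_ISGID)


def has_sgid(path):
    """Return True if the SGID bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_ISGID)
```

Calling set_sgid("DIR") should have the same effect as "chmod g+s DIR" for a directory you own.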

Does this solve the problem? New files and directories created there will indeed belong to the group, OUR_TEAM_GROUP. However, if you move something to DIR it /will not/ belong to OUR_TEAM_GROUP in general because the Unix "mv" command preserves ownership. You will need to use cp/rm instead.
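
The cp/rm idea can be sketched in Python as follows. The helper name copy_then_remove is an assumption made for illustration, and the sketch handles a single regular file only. It relies on shutil.copyfile creating a brand-new file at the destination, which is what lets an SGID directory assign its own group.

```python
import os
import shutil


def copy_then_remove(src, shared_dir):
    """Emulate 'cp' followed by 'rm' for one regular file.

    The copy is a newly created file, so in a directory with SGID set
    it takes the directory's group. A rename (what 'mv' does within one
    filesystem) would keep the original group instead.
    """
    dest = os.path.join(shared_dir, os.path.basename(src))
    shutil.copyfile(src, dest)  # copies contents only, not ownership or mode
    os.remove(src)              # remove the original once the copy succeeded
    return dest
```

Note that the permissions of the new copy are still governed by your umask, which is the next problem discussed.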

Unfortunately, there are other problems, even if you consistently avoid "mv." Another problem is maintaining the desired group permissions. You need every new file to have the same permissions for the group as for yourself. However, by default, your umask is 077, which means newly created files are not readable, writable, or executable by either your group or others. There is a good reason for that. A umask change would affect every new file or directory you create, even those in your HOME or WORKDIR directories and not in the shared directory. Changing your umask to 007 would violate the second condition we mentioned at the beginning, and it may potentially be a security issue.
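
To see the effect of the two umask values concretely, here is a small Python sketch. The helper mode_created_under is hypothetical; it creates a scratch file in a temporary directory while a given umask is in force, reports the resulting permission bits, and restores the previous umask afterwards.

```python
import os
import stat
import tempfile


def mode_created_under(mask):
    """Create a scratch file under the given umask and return its permission bits."""
    old = os.umask(mask)
    try:
        with tempfile.TemporaryDirectory() as d:
            path = os.path.join(d, "probe")
            # Ask for rw for everyone (0o666); the umask strips bits from this.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)  # always restore the caller's umask
```

Under umask 077 the file comes out as 0o600 (owner only), while under 007 it comes out as 0o660 (owner and group). Because the umask applies to the whole process, it affects every file created, wherever it lives.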

Thus, you need to change permissions explicitly every time you create or copy new files or directories. Our experience shows that it is easier to forget some files than to remember all of them. Even the easy solution of running "chmod" recursively on all the files, whether they need it or not, is usually forgotten. As noted in the Quick-Tip of issue #147 (/arsc/support/news/t3enews/t3enews147/index.xml#qt), if you do perform such a "chmod" the best alternative is:

chmod -R g+rX *

(note the capital "X") which makes ALL files group readable, and makes directories and any files that already have an execute bit set (for example, user-executable files) group executable as well.
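
The capital "X" rule can also be expressed in Python. The following is a sketch under the assumption that symbolic links should be left alone; the function names are hypothetical and it is not taken from any ARSC script.

```python
import os
import stat

ANY_EXEC = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def add_group_rX(path):
    """Apply the effect of 'chmod g+rX' to a single path."""
    st = os.lstat(path)
    if stat.S_ISLNK(st.st_mode):
        return  # leave symbolic links alone
    mode = stat.S_IMODE(st.st_mode) | stat.S_IRGRP
    # Capital X: add execute only for directories or already-executable files.
    if stat.S_ISDIR(st.st_mode) or (st.st_mode & ANY_EXEC):
        mode |= stat.S_IXGRP
    os.chmod(path, mode)


def add_group_rX_recursive(top):
    """Walk top and apply g+rX to it and everything beneath it."""
    add_group_rX(top)
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            add_group_rX(os.path.join(dirpath, name))
```

A data file with mode 600 becomes 640, while a script with mode 700 and a directory with mode 700 both become 750.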

We're getting close to a solution, but suppose there is some process (your model as an example) that is run in batch mode when you are not logged in. This program will output some information in new files. (Otherwise, there is usually no reason to run it.) You will not be able to fix the permissions of those files unless you log in again.

You can see that this is a real nightmare. Some day you will find a lot of e-mails from your group mates, all complaining that you forgot to fix permissions. You may find yourself sending similar e-mails. Do you think this is good teamwork? Maybe yes. However, it is challenging.

*What to do then?*

I can see 3 possible approaches to this problem.

The first solution, a shared user account for the group members, is not really viable because it violates security policies at ARSC and, hopefully, at every other computer center.

A second approach is to have a copy of the common directories for each user. Each user would work only in their own working copy, much as in a centralized version control system, and would then synchronize with the common central directory. This is not easy to do, and it is hard to imagine that even a distributed version control system could help, because you need to synchronize not only programs but also binaries and possibly huge program outputs. It is possible, however, that such systems already exist somewhere.

The third solution is a bit too simple for the problem but can tremendously improve things. This is a script that automatically does the following:

  • Sets group permissions on files/directories to match the user permissions.
  • Sets the SGID bit for all directories (so any new files will belong to the same group). It also drops SUID from any directories that have it set, because this would be a security violation.
  • Drops the SGID/SUID bits from any files that have them, because this would also be a security violation.
  • Changes group ownership on files to the common group.
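
The four steps above can be sketched in Python as follows. This is an illustrative sketch only, not the code of our actual script: the names fixed_mode and fix_tree are hypothetical, symbolic links are skipped by assumption, and the caller must own every file in the tree and belong to the target group.

```python
import os
import stat


def fixed_mode(st_mode):
    """Return the permission bits a path should have after the fix."""
    mode = stat.S_IMODE(st_mode)
    user_bits = (mode >> 6) & 0o7
    mode = (mode & ~0o070) | (user_bits << 3)   # group permissions := user permissions
    if stat.S_ISDIR(st_mode):
        mode |= stat.S_ISGID                    # new entries inherit the group
        mode &= ~stat.S_ISUID                   # no SUID on directories
    else:
        mode &= ~(stat.S_ISUID | stat.S_ISGID)  # no SUID/SGID on files
    return mode


def fix_tree(top, gid):
    """Apply the four steps to top and everything under it (symlinks skipped)."""
    paths = [top]
    for dirpath, dirnames, filenames in os.walk(top):
        paths.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    for path in paths:
        st = os.lstat(path)
        if stat.S_ISLNK(st.st_mode):
            continue
        os.chown(path, -1, gid)                 # change the group, keep the owner
        os.chmod(path, fixed_mode(st.st_mode))
```

The numeric group ID for a group name can be looked up with grp.getgrnam(name).gr_gid from the standard library. The group is changed before the mode so that the final chmod has the last word on the SUID/SGID bits.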

This script should be called from scripts such as .bash_logout (for bash; for other shells, see their documentation) that are executed every time a user logs out. It should also be run after every program that generates files, automatically if necessary; for example, call it at the end of your PBS script. That way, permissions will be set properly almost all the time.

This is the solution we use, and our experience shows that this solves most permissions problems without much effort or disturbing our work in HOME, WORKDIR, and other groups.

--

Please feel free to download our script and modify it for your own use: http://www.arsc.edu/~kulchits/scripts/pfix.py. It does more than described above. If you run it, please use the -h option first to see the options available.

Also, I'd like to hear how other groups have solved these problems.

Quick-Tip Q & A


A:[[ I have a python script that I run on a few different machines.  
  [[ On one machine python is in /usr/local/bin, on a different machine
  [[ it's located in /usr/bin.  So I end up with two versions of the 
  [[ script.  One that starts with: 
  [[ 
  [[ #!/usr/local/bin/python
  [[ 
  [[ and another that starts with 
  [[ 
  [[ #!/usr/bin/python
  [[ 
  [[ Every time I switch between machines I mess up at least one run
  [[ because I forget to change the "#!" path.  Is there a way I can
  [[ avoid having to make this change?


# 
# Thanks to Chris Swingley, Lorin Hochstein, Martin Luthi, Kevin Thomas, 
# Rich Griswold, and Nathan Prewitt, all of whom replied with the 
# /usr/bin/env command.  Here are two of the replies:
# 

#
# Chris Swingley
#

I start all my Python scripts with:

  #! /usr/bin/env python

That way, the system looks for python and runs the script in the python
"environment", wherever it might be found.  'env' is likely to be in
/usr/bin on any Unix system.


#
# Sean Ziegeler
#

Use:

  #!/usr/bin/env python

but that isn't magic.  It can only run python if it is in the PATH
environment variable.  On most systems /usr/bin & /usr/local/bin are
in PATH, but if not, you can configure your .login/.profile/etc. files
to make sure they are.
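
To check which interpreter a PATH search would pick up on a given machine, a short Python sketch like the one below may help. The helper name find_on_path is hypothetical; it simply wraps the standard library's shutil.which, which searches a PATH-style list of directories for an executable, much as env does when it looks up the command.

```python
import shutil


def find_on_path(command="python", path=None):
    """Return the first executable named command on a PATH-style string, or None.

    With path=None the current PATH environment variable is searched.
    """
    return shutil.which(command, path=path)
```

If find_on_path("python") returns None in your login environment, "#!/usr/bin/env python" will most likely fail there too until PATH is fixed.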


#
# Ryan Czerwiec provides another alternative...
#

This is a bit brute force, but it's the type of solution I usually use
for such a problem.  Cut the #! line from the beginning of your script.
Then create an alias that looks something like this:

  alias runpythonscript 'echo \#\!`which python` > pythonscripttemp ; cat
  ~/pythonscript >>! pythonscripttemp ; chmod u+x pythonscripttemp ;
  ./pythonscripttemp ; \rm pythonscripttemp'




Q: I have a code written in C++ and would like to append a floating
   point value to the end of a string.  It's pretty easy to do this
   in C using a character array, e.g.:

     #include <stdio.h>
     #include <stdlib.h>

     int main()
     {
         char buf[1024];
         float v=1.23;
         sprintf(buf,"Val= %.2f\n", v);
         printf("%s", buf);
     }


   But, as I said, I need to do this in C++.  Is there some slick C++
   way to do this?  The following doesn't work!

     #include <string>
     #include <iostream>

     int main()
     {
         std::string buf;
         float val=1.23;
         buf="Val= " + val;
         std::cout << buf << std::endl;
     }

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.