ARSC HPC Users' Newsletter 331, December 16, 2005



Dynamic Linking - Part II

[ Thanks to Jesse Niles, User Services Consultant, ARSC ]

In the first installment of this series we took a look at basic dynamic linking, and now address some more advanced issues. For instance, what if you want your application to have a plug-in feature? How do you load new C++ classes into an application without recompiling it?

The following is a little bit trickier than the previous examples, but it is arguably the most powerful and is much more object-oriented. The first thing to do is declare a plug-in base class:


    #ifndef PLUGINBASE_H
    #define PLUGINBASE_H

    //Forward declaration for typedefs
    class PluginBase;

    //Typedefs for function types used by plug-in loader
    typedef PluginBase *PluginCreationFuncType();
    typedef void PluginDestructionFuncType(PluginBase *);

    //Base class for all plug-ins
    class PluginBase {
        public:
            PluginBase() {}
            virtual ~PluginBase() {}

            //Pure virtual method to be overridden
            virtual int someMethod(int, int) = 0;
    };

    #endif //PLUGINBASE_H

The plug-in base class shown here only contains one pure virtual method, but it can contain variables and whatever other methods your application might need. The typedefs make the usage of the dynamic loading easier on the eyes and wrists. Next, we'll pretend that our application has already been compiled and released, and we want to add a few new plug-ins for it. The next two files declare and define the new plug-ins:


    #ifndef NEWPLUGINS_H
    #define NEWPLUGINS_H
    #include "pluginbase.h"

    //Derived plug-in class declarations
    class NewPlugin : public PluginBase {
        public:
            NewPlugin() : PluginBase() {}
            virtual ~NewPlugin() {}
            virtual int someMethod(int a, int b);
    };

    class NewPlugin2 : public PluginBase {
        public:
            NewPlugin2() : PluginBase() {}
            virtual ~NewPlugin2() {}
            virtual int someMethod(int a, int b);
    };

    #endif //NEWPLUGINS_H

    #include "newplugins.h"

    //Overridden method for first new plug-in
    int NewPlugin::someMethod(int a, int b)
    {
        return a*b;
    }

    //Creation function for first new plug-in
    extern "C" PluginBase * createNewPlugin()
    {
        return new NewPlugin();
    }

    //Overridden method for second new plug-in
    int NewPlugin2::someMethod(int a, int b)
    {
        return a+b;
    }

    //Creation function for second new plug-in
    extern "C" PluginBase * createNewPlugin2()
    {
        return new NewPlugin2();
    }

    //Destruction function that can actually be used for
    //all classes derived from type PluginBase
    //NOTE: This could be contained in a source file with
    //the base class, but for the example it is included here
    extern "C" void destroyPlugin(PluginBase *destroyMe)
    {
        delete destroyMe;
    }

The new plug-ins override the pure virtual method in the base class, and they provide unmangled functions for instantiation. Because we are still compiling in C++, the extern "C" really only applies to the function name, which makes it easier to load from the .so. Lastly, the code for the application that is capable of loading plug-ins is as follows, with the comments describing the overall method:


    //Standard C++ library includes
    #include <iostream>
    #include <string>
    #include <vector>

    //Include file for dl functions
    #include <dlfcn.h>

    //Include header for PluginBase ONLY
    #include "pluginbase.h"

    int main()
    {
        //Handle that the dl functions use
        void *handle;

        //Container for loaded plug-ins
        std::vector<PluginBase *> plugins;

        //Vector of strings containing class names of new plug-ins
        std::vector<std::string> requestedPlugins;

        //Only one destruction function is needed because
        //they all share the same base class
        PluginDestructionFuncType *destroyer;

        //Request by name the two plug-ins
        //These strings could come from a configuration file or user input
        requestedPlugins.push_back("NewPlugin");
        requestedPlugins.push_back("NewPlugin2");

        //Error message returned by dlerror()
        char *error;

        //Load in symbols from the shared object containing the plug-ins
        //RTLD_LAZY means the symbols will be resolved when needed
        handle = dlopen("", RTLD_LAZY);

        //If handle is null, exit with error message
        if (!handle)
        {
            std::cerr << dlerror() << std::endl;
            return -1;
        }

        dlerror(); //clear error messages, if any

        for (size_t i = 0; i < requestedPlugins.size(); i++)
        {
            //Load creation function for this plug-in
            //This follows the arbitrary convention that each
            //creation function follows the formula: create[classname]
            PluginCreationFuncType *creator = (PluginCreationFuncType *)(
                dlsym(handle, ("create" + requestedPlugins[i]).c_str()));

            //dlsym didn't work, continue with error
            error = dlerror();
            if (error)
            {
                std::cerr << error << std::endl;
                continue;
            }

            //Call creation function and get a pointer to the
            //new plug-in instance
            PluginBase *newPlugin = creator();

            //Add pointer to the vector of loaded plug-ins
            plugins.push_back(newPlugin);
        }

        //Load destruction function for all plug-ins derived from PluginBase
        destroyer = (PluginDestructionFuncType *)(dlsym(handle, "destroyPlugin"));

        //dlsym didn't work, exit with error
        error = dlerror();
        if (error)
        {
            std::cerr << error << std::endl;
            dlclose(handle);
            return -1;
        }

        //Call methods and output return value, then delete the instance
        for (size_t i = 0; i < plugins.size(); i++)
        {
            std::cout << plugins[i]->someMethod(23, 91) << std::endl;
            destroyer(plugins[i]);
        }

        //Clean up dangling pointers
        plugins.clear();

        //Unload library
        dlclose(handle);
        return 0;
    }

Essentially, the process is:

  1. Get strings for symbol names. Here they are constructed by prepending "create" onto the textual class names. This is my own convention, so really anything can be used as long as it matches the creation functions in the shared object.
  2. Open the shared object.
  3. Iterate through each of the symbol names, loading in the creation functions, calling them to get an instance of each plug-in, and finally add pointers to these instances to the plug-in vector.
  4. Use the plug-ins.
  5. When they are no longer needed, call the destruction function so the destructor for each is called, and then delete the dangling pointers to the newly destroyed plug-ins.
  6. Unload the shared object.

Please note that the error-checking could be more elegant, and some basic C++ "should-do"s were omitted.

The build process is simply:


    default : all

    all : pluginloader

     : newplugins.cpp newplugins.h
            g++ -g newplugins.cpp -shared -o

    pluginloader : pluginloader.cpp
            g++ -g pluginloader.cpp -ldl -o pluginloader

The output from the application looks like:

    snuggles % ./pluginloader
    2093
    114

To add functionality to this application, you need only derive a class from PluginBase, write a creation function for it, and plop it into a shared object file. From the application, you would then just supply the symbol name of the creation function for the new class, and load it in with the dl functions.

While there are many complications in the realm of dynamic linking (just take a look at the ld man pages on a machine running AIX), this is all I have ever really needed to utilize the seemingly magical shared object paradigm.


Error Checking in Job Scripts

In the past few years I have seen a lot of job scripts. Many of these scripts are quite complicated; however, nearly all of them lack any error checking whatsoever. To be honest, I have been guilty of this myself on a number of occasions. But I have decided to turn over a new leaf.

When an executable is run, on the command line or in a script, a shell variable records the return status of the executable. The name of this variable is "$?" for sh, ksh, and bash and "$status" for csh and tcsh.
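
As a minimal sketch of how this variable behaves in sh, ksh, or bash, the snippet below runs the standard "true" and "false" utilities and saves "$?" after each one. The variable names "first" and "second" are only illustrative. Every command resets "$?", so the value has to be captured right away.

```shell
# "false" always exits with status 1; "true" always exits with status 0.
false
first=$?      # capture immediately: the next command overwrites $?

true
second=$?

echo "false returned $first, true returned $second"
```

If the status is needed more than once, or after another command has run, keeping it in a variable like this is the safest habit.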

By convention a non-zero return value indicates that the command exited in error. It's up to you to decide what to do when an error occurs. A script will continue to run in spite of errors unless you explicitly have it exit when an error occurs. If you do not handle an error when it occurs you might never notice it.

Here's an example of a badly behaving script which does no error checking and exits without indicating an error has occurred.

    klondike 1% cat bad.ksh
    # Queue options, etc...
    #run an executable
    #attempt to copy file to non-existent directory
    # cp will exit with a non-zero value.
    cp output /not_here
    #remove the output
    # unfortunately the cp wasn't successful!
    rm output

If we run the script, we do see an error message, but the final status erroneously indicates the script was successful:

    klondike 2% ./bad.ksh 
    UX:cp: ERROR: Cannot create /not_here - Permission denied
    klondike 3% echo $?
    0

If your script writes a lot of output you might miss such error messages mixed in with the other output. Regardless of whether or not you noticed the error, as in this example, you might lose output.
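
The failure mode can be reproduced safely in a scratch directory. The sketch below is an illustration rather than the original job script: the directory comes from "mktemp -d", the path "no_such_dir" is a made-up name, and a subshell stands in for the script so that its exit status can be inspected afterwards.

```shell
# Work in a scratch directory so no real data is at risk.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "results" > output

# The subshell plays the role of the job script: no error checking at all.
(
    cp output no_such_dir/output 2>/dev/null   # fails: no such directory
    rm output                                  # runs anyway
)
script_status=$?

# Record whether the only copy of the data survived.
output_lost=no
[ -e output ] || output_lost=yes

echo "script exit status: $script_status, output lost: $output_lost"

# Remove the scratch directory.
cd / && rm -rf "$workdir"
```

The subshell reports the status of its last command, the "rm", which is why the failure of the copy goes unreported.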

All of the shells available at ARSC define the "OR" operator (i.e. "||"). The "||" operator uses short circuit evaluation (i.e. the command after the "||" will only be executed if the preceding command exits in error).

Below is an improvement to the cp command from the script above.

Basic Error Handling Example:

  sh/ksh/bash version:
    cp output /not_here || exit $?

  csh/tcsh version:
    cp output /not_here || exit $status

Now if the copy fails, we don't lose any data. This is a definite improvement.
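
To check that claim, here is a small scratch-directory experiment for sh, ksh, or bash. It is only a sketch: the directory from "mktemp -d" and the name "no_such_dir" are assumptions made for the demonstration, and a subshell stands in for the job script.

```shell
# Work in a scratch directory so no real data is at risk.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "results" > output

# The subshell plays the role of the job script, now with "|| exit $?".
(
    cp output no_such_dir/output 2>/dev/null || exit $?
    rm output      # never reached when the copy fails
)
script_status=$?

# Record whether the data survived the failed copy.
output_kept=no
[ -e output ] && output_kept=yes

echo "script exit status: $script_status, output kept: $output_kept"

# Remove the scratch directory.
cd / && rm -rf "$workdir"
```

This time the script stops at the failed copy, reports a non-zero status, and leaves the output file in place.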

If we want to include a helpful message when the exit occurs, the error handling gets a bit more complicated.

This example demonstrates a potential mistake. It sets the exit value of the script to the exit value of the "echo" command (not that of the "cp" command, as desired):

    cp output /not_here || echo "Error :" $status && exit $status

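We can avoid this by using an intermediate variable. The "&&" ("AND") operator ties everything together: it executes the command following the operator only if the preceding command is successful. In sh, ksh, and bash, "&&" and "||" have equal precedence and are evaluated left to right, so the error-handling commands must be grouped in braces; without the braces, the "echo" and "exit" would also run when "cp" succeeds. The csh and tcsh parser binds "&&" more tightly than "||", so that version needs no grouping.

Improved Error Handling Example:

  sh/ksh/bash version:
    cp output /not_here || { ev=$? && echo "Error: " $ev && exit $ev; }

  csh/tcsh version:
    cp output /not_here || set ev=$status && echo "Error: " $ev && exit $ev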
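
Two pitfalls lurk here for sh, ksh, and bash users, and both can be checked with the sketch below. It uses "(exit 3)" as a stand-in for a failing command, and each experiment runs in a subshell so that "exit" does not end the demonstration. All variable names are illustrative.

```shell
# Pitfall 1: echo succeeds, so the $? seen by exit is 0, not 3.
( (exit 3) || { echo "Error: $?"; exit $?; } )
naive=$?

# Saving the status in a variable first preserves it.
( (exit 3) || { ev=$?; echo "Error: $ev"; exit $ev; } )
saved=$?

# Pitfall 2: "&&" and "||" have equal precedence and are evaluated left
# to right, so without braces the last command runs even on success.
ungrouped=$(true || echo "handler" && echo "ran anyway")
grouped=$(true || { echo "handler" && echo "ran anyway"; })

echo "naive=$naive saved=$saved ungrouped='$ungrouped' grouped='$grouped'"
```

The braces group the handler into a single unit on the right-hand side of the "||", so none of it runs when the command on the left succeeds.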

In an actual script this approach could get cumbersome. We can make it a bit more manageable using aliases for csh and tcsh and functions for sh, ksh and bash.

Error Handling Examples Using Functions and Aliases:

  sh/ksh/bash version:
    checkError()
    {
        # Checks to see if exit value is non-zero and exits
        # if it is.
        # get the exit status.
        ev=$?
        # test the value of ev
        if [ "$ev" -ne 0 ]; then
            #if non-zero display a message and exit.
            echo "Error: " $ev
            exit $ev
        fi
    }

    cp output /not_here || checkError


  csh/tcsh version:
    alias printError 'set ev=$status && echo "Error: " $ev && exit $ev'
    cp output /not_here || printError


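You can also use aliases with sh and ksh. The alias version failed to work with the version of bash I was using, most likely because bash does not expand aliases in scripts (non-interactive shells) unless "shopt -s expand_aliases" is set. The braces in the alias keep the "echo" and "exit" from running when the command succeeds.

  sh/ksh version:
    alias printError='{ ev=$? && echo "Error: " $ev && exit $ev; }'
    cp output /not_here || printError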


Note the aliases above use single quotes (') instead of double quotes ("). This ensures that the variables in the alias are evaluated at the appropriate time -- when the alias is used rather than when it is defined.
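
The effect of the quoting can be seen without aliases at all. This sketch, for sh, ksh, or bash, stores the same text in two variables (the names "eager" and "lazy" are illustrative) and uses "eval" to run it later, standing in for the moment an alias is used.

```shell
ev=1
eager="echo Error: $ev"   # double quotes: $ev is expanded now, to 1
lazy='echo Error: $ev'    # single quotes: the text $ev is stored as is

ev=2                      # the status changes before the text is used

eager_out=$(eval "$eager")
lazy_out=$(eval "$lazy")

echo "double quotes: $eager_out"
echo "single quotes: $lazy_out"
```

The double-quoted text reports the stale value 1, while the single-quoted text picks up the current value 2.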


Happy Holidays!

Happy holidays, everyone. See you in, hard to believe it, aught six.


Quick-Tip Q & A

A: [[ I would like to build a tar file of all of the files in a directory
   [[ and subdirectories, except for the *.o and *.nc files.  Is there a 
   [[ way to selectively add the files I want to a tar file?

  # Thanks to Jesse Niles of ARSC

  find ./ -not \( -name "*.nc" -o -name "*.o" \) -a -type f -exec tar -rf mytar.tar {} \;

  [ Editor's Note: the version of tar used in the example above was GNU
    tar.  Other versions of tar may not error out if the tar file
    doesn't already exist. ]
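
  For anyone who wants to try the command before pointing it at real data, here is a sketch that runs it on a throwaway tree. It assumes GNU find and GNU tar, as in the answer above; the directory layout and file names are invented for the test.

```shell
# Build a small throwaway tree with a mix of file types.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p src/sub
touch src/main.c src/main.o src/sub/data.nc src/sub/notes.txt

# Same find expression as above, restricted to ./src so that the
# archive itself is never among the files being added.
find ./src -not \( -name "*.nc" -o -name "*.o" \) -a -type f \
    -exec tar -rf mytar.tar {} \;

# List what ended up in the archive.
contents=$(tar -tf mytar.tar | sort)
echo "$contents"

# Remove the scratch directory.
cd / && rm -rf "$workdir"
```

  Keeping the archive outside the directory being searched avoids having find hand the tar file to tar as one of its own members.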

Q: Aaarrgghh!  Never mind why, but I stupidly did this:

     chmod -R 777 progs 

   to my "progs" directory, and now everything, the directories, text
   files, image files, object files, etc., are all "executable." (What I
   really wanted was "chmod -R go+rX".) Is there an intelligent way to
   undo this?

[[ Answers, Questions, and Tips Graciously Accepted ]]

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.