ARSC T3D Users' Newsletter 104, September 13, 1996

Eurekas

If you use "events" to communicate between PEs, "eureka-mode" is much faster than "memory-mode." The price for speed, however, is flexibility.

"Events" are essentially binary flags: they're either ON or OFF, SET or CLEAR. They are shared across all PEs and allow a task on one PE to signal all the other PEs that something has happened.

From the programmer's point of view, "eureka-mode" events are the easiest to use. At a given time, only one "eureka-mode" event exists, all PEs have access to its status (set or clear), and all PEs must participate in resetting it.

How would you use "eureka-mode" events?

From the CRI documentation:

An eureka is like a barrier. When all of the PEs are searching for something, the one that finds it posts an eureka that is visible to all the other PEs. The posting of the eureka stops the search.

Another point of view:

Imagine 32 people walking down the rows of a corn field, looking for Grandmother's wedding ring. If one person found it, he'd yell, "Eureka!", and they'd all go back for spiced cider.
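The corn-field search can be sketched on an ordinary workstation. The following is only a simulation in Python, not T3D code: threads stand in for PEs, a threading.Event stands in for the single eureka-mode event, and the names eureka_search, field, and n_pes are made up for the illustration.

```python
import threading

def eureka_search(rows, target, n_pes=4):
    """Split `rows` across n_pes worker threads; stop all when one finds target."""
    eureka = threading.Event()       # stands in for the single eureka-mode event
    found = {}                       # records which "PE" posted, and where

    def search(pe):
        # Each PE walks its own rows: pe, pe + n_pes, pe + 2*n_pes, ...
        for i in range(pe, len(rows), n_pes):
            if eureka.is_set():      # like TEST_EVENT(): someone else found it
                return
            if rows[i] == target:
                found["pe"], found["index"] = pe, i
                eureka.set()         # like SET_EVENT(): post the eureka
                return

    threads = [threading.Thread(target=search, args=(pe,)) for pe in range(n_pes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return found

field = ["corn"] * 1000
field[617] = "ring"                  # Grandmother's wedding ring
result = eureka_search(field, "ring")
print(result)
```

Each searcher checks the flag once per row, so the others stop within one row of the find instead of walking the rest of the field.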

I wanted to see how fast "eureka-mode" events propagate, and compare these times with those of "memory-mode" events. My test program and results follow. My result times are actually the sum of both a eureka and a barrier propagation, but they are useful for comparison and getting a feel for events.

Some conclusions from the results:

  1. the speed of "eureka-mode" events does not depend on either the number of PEs or the specific PE which triggers the event.
  2. the speed of "memory-mode" events drops as the number of PEs increases, and is highly variable, depending on which PE triggers the event. (I don't give the results here, but I ran the program a number of times on 8 PEs, and this dependence on triggering PE seems to be consistent. For 8 PEs, when PE 0 was the trigger, the total propagation time was always about 7 usecs longer than if one of the other PEs was the trigger.)


cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
      Program eureka_timings

      implicit none 
      integer trigger_PE         ! Which PE will now trigger event
      integer mc(128)            ! Array to store system info 
      integer MY_PE              ! Intrinsic function to get PE number 
      integer mem_event          ! Shared variable for memory-mode event
      real t1                    ! Temporary storage of start times 
      real t2                    ! Temporary storage of end times 
      real delay_start           ! For simulated work, start of spin
      real irtc                  ! Internal function, clock ticks 
      real cp                    ! Clock period in secs
      logical test_event         ! Internal function
      intrinsic MY_PE

cdir$ shared mem_event

      call gethmc (mc)
      cp = mc(7) * 1.0e-12      ! convert picosecs to secs.

c
c     Time event propagation when using eureka-mode events
c
      if (MY_PE() .EQ. N$PES-1) then 
        write (6,1000) "EUREKA-MODE "
        call flush (6)
      endif

      do trigger_PE = 0, N$PES - 1
        call clear_event ()        ! In Eureka mode, all PEs must clear

        call barrier ()  ! Make sure all PEs ready to watch for event
        if (MY_PE() .EQ. trigger_PE) then

          ! Kill .1 secs to simulate some work
          delay_start = irtc ()
5         if (irtc() .LT. delay_start + 0.1 / cp) goto 5

          t1 = irtc ()
          call set_event ()         ! Trigger event
          call barrier ()           ! Wait till all PEs detect event
          t2 = irtc ()

          write (6, 1010) MY_PE(), (t2-t1) * cp * 1e6
          call flush (6)
        else
10        if (.NOT. test_event()) goto 10
          call barrier ()           ! Wait till all PEs detect event
        endif
      enddo  


c
c     Now use memory-mode Events
c


      if (MY_PE() .EQ. N$PES-1) then 
        write (6,1000) "MEMORY-MODE "
        call flush (6)
      endif

      do trigger_PE = 0, N$PES - 1
cdir$ master
        call clear_event (mem_event)   ! In Memory mode, one PE clears
cdir$ end master 

        call barrier ()
        if (MY_PE() .EQ. trigger_PE) then

          delay_start = irtc ()
105       if (irtc() .LT. delay_start + 0.1 / cp) goto 105

          t1 = irtc ()
          call set_event (mem_event)
          call barrier ()
          t2 = irtc ()

          write (6, 1010) MY_PE(), (t2-t1) * cp * 1e6
          call flush (6)
        else
110       if (.NOT. test_event(mem_event)) goto 110
          call barrier ()
        endif
      enddo  

1000  format (a,/,"Event_PE ", " Delay(usecs)")
1010  format (i4, "       ", f6.2)
      end
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
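The timing arithmetic in the program is easy to check by hand. The Python sketch below repeats it, assuming a clock period of 6667 picoseconds (about 150 MHz). That figure is an example value chosen for the illustration; the program itself reads the period from mc(7), and the text does not state it. The helper names are also made up.

```python
# Assumed example value: a clock period of 6667 picoseconds (about 150 MHz).
# The real program reads this from mc(7) via gethmc; it is not given in the text.
CLOCK_PERIOD_PS = 6667

cp = CLOCK_PERIOD_PS * 1.0e-12       # seconds per tick, as in the program

def ticks_to_usecs(t1, t2):
    """Elapsed time between two tick counts, in microseconds."""
    return (t2 - t1) * cp * 1e6

def spin_ticks(seconds):
    """Number of ticks the delay loop must see pass to burn `seconds`."""
    return seconds / cp

print(ticks_to_usecs(0, 1500))       # about 10 usecs for 1500 ticks
print(round(spin_ticks(0.1)))        # about 15 million ticks for the 0.1 sec delay
```

Under that assumption, the roughly 10 usec eureka-mode delays correspond to about 1500 clock ticks.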

Output from this program, using 8 PEs, looks like this:


    EUREKA-MODE 
    Event_PE  Delay(usecs)
       0        11.35
       1        10.11
       2         9.79
       3         9.82
       4         9.78
       5        10.15
       6        10.12
       7         9.98
    MEMORY-MODE 
    Event_PE  Delay(usecs)
       0        27.20
       1        20.30
       2        20.14
       3        19.64
       4        19.80
       5        19.75
       6        19.32
       7        21.01
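A few lines of Python (used here only as a calculator) summarize the 8-PE output above: the mean delay in each mode, the spread between the fastest and slowest trigger, and how much longer PE 0 took than the other PEs in memory mode.

```python
from statistics import mean

# Delays in usecs, copied from the 8-PE output above (index = triggering PE).
eureka = [11.35, 10.11, 9.79, 9.82, 9.78, 10.15, 10.12, 9.98]
memory = [27.20, 20.30, 20.14, 19.64, 19.80, 19.75, 19.32, 21.01]

for name, delays in (("eureka", eureka), ("memory", memory)):
    spread = max(delays) - min(delays)
    print(f"{name}: mean {mean(delays):.2f} usecs, spread {spread:.2f} usecs")

# How much slower was PE 0 than the average of the other triggers?
pe0_excess = memory[0] - mean(memory[1:])
print(f"memory-mode PE 0 excess: {pe0_excess:.2f} usecs")
```

For this run the memory-mode mean is about twice the eureka-mode mean, and PE 0 is roughly 7 usecs slower than the others, which agrees with the second conclusion above.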

The following table gives output on ARSC's T3D from runs using from 1 to 64 PEs. (Note that other users were active on the T3D at the time of these runs.) The leftmost column gives the triggering PE. Each result is the time in microseconds, measured at the triggering PE, for the event to propagate, for all PEs to catch it and hit the barrier, and for the barrier at the triggering PE to release.


    Eureka-Mode                     Memory-Mode
    NPEs:                           NPEs:
    1   2   4    8   16   32   64   1    2    4    8    16   32   64
    === === === ==== ==== ==== ==== ==== ==== ==== ==== ==== ==== =====
 0  6.3 9.0 9.5  9.9  9.3  10. 10.3 13.3  19. 22.3 27.4 30.1   33  37.
 1      9.8 9.8  9.9  9.7 10.4 10.1      18.6 19.4 18.9 22.8 39.7  46.7
 2          9.7 11.7 10.0 10.2 10.2           17.9 20.9 22.8 61.9  63.8
 3          9.9  9.8 10.3 10.0 10.7           18.3 20.5 24.2 71.1  47.3
 4               9.3 10.3 10.0 10.3                19.7 23.7 42.9  54.0
 5              10.1 10.3 10.0 10.0                 19. 23.6 43.1  40.6
 6               9.8 10.1 10.1 11.6                20.1 24.4 38.1  70.1
 7              10.1 11.5 10.3 10.3                20.1 25.5 33.6  53.9
 8                   10.2 11.6 10.2                     23.5 35.2  48.5
 9                    9.9 10.4 10.1                     24.4 41.7  50.0
10                   10.3  9.9 10.4                     27.4 50.3  46.8
11                    9.6 10.1 10.0                     27.5 35.2  48.5
12                    9.9 10.1 10.3                     28.7 61.6  49.7
13                    9.8  9.9   10                     26.4 77.4  71.2
14                    9.9  9.9 10.1                     29.2 41.6  50.4
15                   10.4 10.3 10.3                     27.3 49.8  66.5
16                        10.9 10.3                          27.8 108.3
17                        10.0 10.2                          28.4 101.1
18                        10.0 10.4                          42.8 119.2
19                        10.6  10.                          43.7 179.7
20                         10. 10.0                          37.5 227.3
21                        10.0 11.3                            39 290.2
22                         9.9 10.2                          32.8 304.9
23                        10.1  9.7                          30.4 248.5
24                         9.8 10.4                           47.  66.3
25                        11.6 10.4                          36.6  52.8
26                        10.1 10.1                           37.  41.6
27                        10.0 10.1                          46.2  99.9
28                        10.3 10.0                           38. 129.6
29                        10.3 11.8                          32.5  94.6
30                        10.2 10.2                          35.3 142.1
31                        10.1 10.2                          29.9  51.1
32                             10.0                                39.5
33                             10.4                                43.4
34                              9.9                                69.5
35                             10.2                                54.7
36                              9.9                                53.6
37                             11.1                                57.1
38                             10.4                                39.1
39                             10.0                                40.3
40                             10.0                                53.7
41                             10.5                                67.9
42                             10.3                                62.6
43                              9.7                                40.5
44                             10.5                                74.9
45                             10.2                                60.7
46                             11.4                                37.5
47                             10.5                                58.9
48                             10.3                                81.9
49                              9.9                                86.5
50                             10.2                                71.2
51                             11.0                                81.5
52                             10.2                               171.5
53                              9.9                               152.5
54                             11.3                               168.3
55                             10.7                               110.4
56                             10.4                                46.5
57                              10.                                51.0
58                             10.5                                58.5
59                             10.2                                47.2
60                             10.3                                66.6
61                             11.8                                63.6
62                             10.4                                73.0
63                             10.0                                62.4
========================================================================

Here is some of CRI's documentation on events and eurekas, taken from docview:

6.4 Events

Events are typically used to record the state of a program's execution and to communicate that state to another task. Because they do not set locks, as do the lock routines described in the preceding subsection, they cannot easily be used to enforce serial access of data. They are well suited for work such as signaling other tasks when a certain value has been located in a search procedure. Four library routines perform the event functions.


    CALL SET_EVENT([event])

    CALL CLEAR_EVENT([event])

    CALL WAIT_EVENT([event])

    result = TEST_EVENT([event])

----------------------------------------------------------------------
 event      A shared integer variable.  If this argument is
            present, the event routines are operating in memory
            mode; if the event argument is omitted, the event is
            in eureka mode.  These two modes are described in
            this subsection and the next subsection.
----------------------------------------------------------------------
 result     A logical value returned by TEST_EVENT.  If .TRUE.,
            the event has been posted.  If .FALSE., the event
            has not been posted.
----------------------------------------------------------------------

SET_EVENT sets, or posts, an event; it declares that an action has been accomplished or a certain point in the program has been reached. You can post an event at any time, whether the state of the event is cleared or already posted. WAIT_EVENT suspends task execution until the specified event occurs. CLEAR_EVENT clears an event. TEST_EVENT returns the state, either set or cleared, of an event.

Multiple memory-mode events can be in progress at the same time. Because event posting and clearing are not atomic operations in memory mode, SET_EVENT and CLEAR_EVENT should be used with caution. That is, write your code so that only a single task will clear the memory-mode event before any task attempts to post the event.
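As an illustration of that ordering (this sketch is not part of the CRI text), here is the clear-then-post sequence simulated with Python threads. A threading.Event stands in for the shared event variable and a threading.Barrier for CALL BARRIER(). Python's own Event is atomic, unlike memory-mode events, so the sketch shows only the ordering, not the hazard. The names task, ready, and detected are invented for the example.

```python
import threading

N_PES = 4
mem_event = threading.Event()        # stands in for the shared event variable
ready = threading.Barrier(N_PES)     # stands in for CALL BARRIER()
detected = []                        # PEs that saw the event
lock = threading.Lock()

def task(pe, trigger_pe):
    if pe == 0:
        mem_event.clear()            # exactly one task clears (the "master")
    ready.wait()                     # nobody posts or waits until the clear is done
    if pe == trigger_pe:
        mem_event.set()              # like SET_EVENT(mem_event)
    else:
        mem_event.wait()             # like WAIT_EVENT(mem_event)
        with lock:
            detected.append(pe)

threads = [threading.Thread(target=task, args=(pe, 2)) for pe in range(N_PES)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(detected))
```

The barrier between the clear and the post is what keeps a late clear from wiping out an event that has already been posted.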

6.4.1 Eureka mode

When an event variable is not passed to the event routines, the routines operate in eureka mode. In eureka mode, the hardware barrier network is used for event communication. When a shared event variable is passed to the event routines, the shared variable is used for event communication. Memory mode is less efficient than eureka mode, but it is considerably more flexible.

Eureka synchronization has several uses, including database searches. Using eureka synchronization, you can stop a database search as soon as any PE finds the data rather than waiting for all of the PEs to exhaust the search.

In eureka mode, all tasks must clear the event before any task can test, wait for, or post the event, but all tasks need not wait for a eureka event to be posted. However, all tasks must once again clear the event before another eureka activity can begin.

Events in eureka mode cannot be posted across barriers.
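The clearing rule for eureka mode can be pictured as a small state machine. The Python class below is a toy model written for this illustration, not a description of how the hardware or the library behaves: the class name, the three state names, and the choice to raise an error on a premature post or test are all assumptions.

```python
class EurekaModel:
    """Toy state machine for the eureka-mode clearing rule quoted above."""

    def __init__(self, n_pes):
        self.n_pes = n_pes
        self.state = "clearing"      # "clearing" -> "armed" -> "posted"
        self.cleared = set()         # PEs that have cleared in this round

    def clear(self, pe):
        if self.state == "armed":
            return                   # already armed; an extra clear changes nothing
        self.state = "clearing"
        self.cleared.add(pe)
        if len(self.cleared) == self.n_pes:
            self.state = "armed"     # every PE has cleared: ready for a eureka
            self.cleared = set()

    def post(self, pe):
        if self.state == "clearing":
            raise RuntimeError(f"PE {pe} posted before all PEs cleared")
        self.state = "posted"        # posting an already-posted event is allowed

    def test(self, pe):
        if self.state == "clearing":
            raise RuntimeError(f"PE {pe} tested before all PEs cleared")
        return self.state == "posted"


ev = EurekaModel(n_pes=4)
for pe in range(4):
    ev.clear(pe)                     # all PEs clear, as in the test program
print(ev.test(1))                    # armed but not yet posted
ev.post(2)
print(ev.test(1))                    # now posted

ev.clear(0)                          # only PE 0 has cleared for the next round
try:
    ev.post(3)
except RuntimeError as err:
    print("rejected:", err)
```

In the test program earlier in this issue, the call to clear_event() at the top of each loop iteration, followed by the barrier, plays the role of the "every PE has cleared" step.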

Programming Environment 2.0 Available at ARSC

[ The following is taken from "news pe2.0" on denali: ]

 

 
 Programming environment 2.0
 ===========================

 Cray Research has combined the compiler and supporting application tools
 (CrayTools) and libraries (CrayLibs) into the programming environment
 2.0 release to provide an integrated environment.  Programming
 environment 2.0 and later products will no longer be installed in the
 traditional directory structure.  User programming environment and PATH
 variables have to be reconfigured to access products and new features
 offered by PE 2.0.

 At ARSC, the update to programming environment 2.0 has been scheduled
 for Sept 24, 1996.  A user test period of two weeks beginning Sept 11
 has been planned.  Users who would like to start testing their code
 immediately can do so by executing the following commands after they
 log in.

 Please DO NOT include the commands in your .cshrc or .profile files.

 ************* (for c shell users) **********

        source /opt/modules/modules/init/csh

        module load modules PrgEnv

 ************* (for sh and ksh shell users) **********

        . /opt/modules/modules/init/ksh

        module load modules PrgEnv

 *****************************************************

 Users should see essentially identical results using the new versions of
 these CrayTools and libraries. Please report any problems to
 consult@arsc.edu.  We strongly encourage users to make the transition to
 programming environment 2.0, as this will be made the default environment
 during a maintenance period the evening of Sept 24, 1996.
 


Quick-Tip Q & A


A: {{ Using "vi," how would you insert a "C" at the beginning of each
      line of text for a block of lines (i.e., comment out some Fortran
      code)?  }}

  # Thanks to several readers who responded, in detail, to this 
  # question.  Here are six solutions (you only need to read one!).

      Input            Action / Explanation
      =====            ================================================
  [1]
                       -Move cursor to last line of block.
      mm               -Mark line with the tag "m" (easiest to type).
                       -Move cursor up to first line of block.
      :.,'ms/^/C       -In ex mode (:), from the current line (.) to (,)
                        the line tagged with "m" ('m), substitute (s/) 
                        the start of the line (^) with (/) a "C" (C).

  [2]    
                       -Move cursor to first line of block.
      ma               -Mark line with the tag "a".
                       -Move cursor to last line of block.
      mb               -Mark line with the tag "b".
      :'a,'bs/^/C      -From tag "a" to tag "b", substitute.

  [3]    
                       -Move cursor to first line of block.
      CTRL-G           -Display current line number (say, 28).
                       -Move cursor to last line of block.
      :28,.s/^/C       -From line 28 to current line, substitute.

  [4]
      :se nu           -Show all line numbers ("set number").
      :28,100s/^/C     -From line 28 to line 100, substitute.
      :se nonu         -Hide line numbers ("set nonumber").

  [5]
                       -Move cursor to first line of block.
                       -Assume last line of block is end-of-file.
      :.,$s/^/C        -From current line (.) to EOF ($), substitute.

  [6]
                       -Assume block is a total of 11 lines long. 
                       -Move cursor to first line of block.
      :.,+10s/^/C      -From current line to current line plus 10, 
                        substitute. 
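  # For readers who prefer a script to an editor session, the same
  # substitution can be sketched in Python. The function name and the
  # sample lines are made up for the illustration.

```python
def comment_out(lines, first, last, prefix="C"):
    """Prefix lines first..last (1-based, inclusive), like :first,lasts/^/C in vi."""
    return [
        prefix + line if first <= n <= last else line
        for n, line in enumerate(lines, start=1)
    ]

code = [
    "      program demo",
    "      call barrier ()",
    "      call flush (6)",
    "      end",
]
for line in comment_out(code, 2, 3):
    print(line)
```

  # As with the vi commands, the "C" is inserted in front of the
  # existing text, so each commented line shifts right by one column.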
    

Q: In ftp, can you "more" a remote file -- before you "get" it? How?

[ Answers, questions, and tips graciously accepted. ]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.