ARSC T3D Users' Newsletter 42, June 30, 1995

More Sorting on the T3D (and Y-MP)

In the last newsletter, I showed some timings of simple sorting functions on the T3D and mentioned that the only sorting routine in the MPP Craylibs was "qsort". It is a very general sorting routine, as described by its man page on Denali:


  > QSORT(3C)                   Cray Research, Inc.                 SR-2080 8.0
  > 
  > NAME
  >      qsort - Performs sort
  > 
  > SYNOPSIS
  >      #include <stdlib.h>
  > 
  >      void qsort (void *base, size_t nmemb, size_t size,
  >      int (*compar)(const void *, const void *));
  > 
  > IMPLEMENTATION
  >      All Cray Research systems
  > 
  > STANDARDS
  >      ISO/ANSI
  > 
  > DESCRIPTION
  >      The qsort function sorts an array of nmemb objects, the initial
  >      element of which is pointed to by base.  The size of each object is
  >      specified by size.
  > 
  >      The contents of the array are sorted into ascending order according to
  >      a comparison function pointed to by compar, which is called with two
  >      arguments that point to the objects being compared.  The function
  >      returns an integer that is less than, equal to, or greater than 0 if
  >      the first argument is considered to be respectively less than, equal
  >      to, or greater than the second.
  > 
  >      If two elements compare as equal, their order in the sorted array is
  >      unspecified.
  > 
  > NOTES
  >      The comparison function's arguments should be of type void* and should
  >      be cast back to type pointer-to-element within the function.
  > 
  >      The comparison function need not compare every byte, so arbitrary data
  >      may be contained in the elements in addition to the values being
  >      compared.
  > 
  >      The output order of two items that compare as equal is unpredictable.
  > 
  > RETURN VALUES
  >      The qsort function returns no value.
  > 
  > EXAMPLES
  >   #include <stdlib.h>
  > 
  >   struct element {        /* array of elements to be sorted */
  >           int key;        /* key to sort on */
  >           .
  >           .
  >           .
  >   } q[nel];
  > 
  > element *base = &q[0];
  > 
  >   int compar(const void *a, const void *b)
  >                           /* comparison function for qsort() */
  >   {
  >           return (((struct element *)a)->key - ((struct element *)b)->key);
  >   }
  > 
  >   main() {
  >            .
  >            .
  >            .
  >            qsort(base, nel, sizeof(*base), compar);
  >            .
  >            .
  >            .
  >   }
  > 
  > SEE ALSO
  >      bsearch(3), lsearch(3)
  > 
  > USMID @(#)man/man3/qsort.3c     80.16   02/26/94 11:07:39
Although qsort is general enough to sort on a key buried in a struct, comparing it with the sort routines of last week's newsletter requires only a very simple comparison function, namely:

  int compar( const void *a, const void *b )
  {
  /* By displaying the comparisons made by qsort,
     can you guess the algorithm that qsort uses?
    printf( " a= %d\n", *(const int *)a );
    printf( " b= %d\n", *(const int *)b );
  */
    return( *(const int *)a - *(const int *)b );
  }
As expected, qsort is slow compared to the other sort routines: they do each comparison as an inline subtraction, whereas qsort does each comparison as a function call plus a subtraction. Here is last week's table extended with the qsort timings:

  A timing comparison of sorting functions on the T3D (seconds)

  Number of insertion  shell  quicksort  Munstock  CrayLibs
  elements    sort      sort     sort      sort   MPP qsort
  to sort
      1    0.000004  0.000010  0.000005  0.000004  0.000041
      2    0.000005  0.000012  0.000006  0.000008  0.000011
      3    0.000006  0.000013  0.000006  0.000008  0.000032
      4    0.000006  0.000017  0.000007  0.000011  0.000042
      5    0.000007  0.000018  0.000008  0.000013  0.000050
     10    0.000009  0.000024  0.000012  0.000021  0.000060
     20    0.000017  0.000043  0.000022  0.000045  0.000129
     30    0.000026  0.000072  0.000032  0.000079  0.000270
     40    0.000041  0.000099  0.000044  0.000123  0.000455
     50    0.000055  0.000133  0.000056  0.000160  0.000651
    100    0.000194  0.000327  0.000122  0.000364  0.000850
    200    0.000645  0.000846  0.000270  0.000873  0.002160
    300    0.001543  0.001358  0.000423  0.001687  0.005545
    400    0.002762  0.001795  0.000595  0.002505  0.008576
    500    0.004387  0.002507  0.000759  0.003231  0.014140
   1000    0.016640  0.005644  0.001750  0.008732  0.018084
   2000    0.075755  0.014613  0.004072  0.026065  0.034495
   3000    0.194306  0.024549  0.007052  0.047113  0.084481
   4000    0.389964  0.034641  0.010076  0.070686  0.133989
   5000    0.611296  0.045975  0.013453  0.095750  0.203639
  10000    2.698027  0.106615  0.032286  0.224263  0.245153
  20000   11.086379  0.242284  0.077115  0.535524  1.563378
  30000   25.146869  0.382456  0.126676  0.892756  1.133322
  40000   44.656612  0.560791  0.180124  1.307390  2.761088
  50000   69.894340  0.719300  0.235710  1.781510  3.075412
For Y-MP users who don't run on the T3D, it is sometimes surprising to hear that a single PE of the T3D can be faster than the Y-MP. But simple sorting, as implemented in these functions, uses the typical nonvectorized operations that the T3D performs faster than the Y-MP. The same source code that produced the above table, when run on Denali, produces:

  A timing comparison of sorting functions on the Y-MP (seconds)

  Number of insertion  shell  quicksort  Munstock  CrayLibs
  elements    sort      sort     sort      sort   MPP qsort
  to sort
      1    0.000007  0.000010  0.000008  0.000008  0.000011
      2    0.000008  0.000010  0.000009  0.000009  0.000021
      3    0.000009  0.000011  0.000010  0.000011  0.000034
      4    0.000009  0.000013  0.000014  0.000013  0.000042
      5    0.000011  0.000015  0.000015  0.000015  0.000058
     10    0.000019  0.000025  0.000031  0.000028  0.000133
     20    0.000047  0.000048  0.000069  0.000066  0.000331
     30    0.000103  0.000084  0.000113  0.000116  0.000595
     40    0.000161  0.000132  0.000158  0.000161  0.000828
     50    0.000273  0.000171  0.000213  0.000221  0.001081
    100    0.000947  0.000404  0.000497  0.000592  0.002924
    200    0.003472  0.000940  0.001163  0.001300  0.007176
    300    0.007198  0.001599  0.001911  0.002202  0.011182
    400    0.014390  0.002409  0.002688  0.003336  0.018947
    500    0.020910  0.002940  0.003502  0.004344  0.023019
   1000    0.082651  0.006890  0.007783  0.010099  0.044912
   2000    0.326777  0.016185  0.017034  0.022488  0.103020
   3000    0.748382  0.027023  0.027492  0.039105  0.175020
   4000    1.329824  0.036965  0.037951  0.054372  0.270825
   5000    2.071604  0.050595  0.049027  0.071482  0.321830
  10000    8.288131  0.112318  0.105333  0.161358  0.706220
  20000   33.232527  0.257633  0.224597  0.366363  1.438304
  30000   74.260375  0.405592  0.351384  0.577239  2.206202
  40000  132.709529  0.600089  0.485890  0.833731  2.949298
  50000  207.743079  0.780485  0.619806  1.037878  3.813924
Sorting, when done with the right algorithm, has many vectorizable operations. CRI has implemented such algorithms in the Libsci sorting routines ISORTD, SSORTB, ISORTB and ORDERS, and their performance is much better.

Spawning T3D Programs from a "nondefault" Directory on the Y-MP

When using a Y-MP program to spawn a T3D program, the pvm_spawn function by default looks in the directory ~/pvm3/bin/CRAY. This is sometimes inconvenient and can be overridden. Enrique Curchitser of Rutgers University sends in the fix:

  > you can run in master/slave mode from
  > any directory. what you need is a file (that i call
  > hostfile) which contains a line like:
  > 
  > denali ep=/u1/uaf/curchits/Research/SFEM/SWE
  > 
  > where ep is the directory where the executable resides.
  > then, when you start the pvm daemon you do it by typing
  > 
  > pvmd3 hostfile&
  > 
  > this lets it know where the executable resides so you
  > do not have to have it in ~/pvm3/bin/CRAY if you
  > don't want to.
I've struggled with this problem myself but always went back to copying the executable to /u1/uaf/ess/pvm3/bin/CRAY. Thanks!

List of Differences Between T3D and Y-MP

The current list of differences between the T3D and the Y-MP is:
  1. Data type sizes are not the same (Newsletter #5)
  2. Uninitialized variables are different (Newsletter #6)
  3. The effect of the -a static compiler switch (Newsletter #7)
  4. There is no GETENV on the T3D (Newsletter #8)
  5. Missing routine SMACH on T3D (Newsletter #9)
  6. Different Arithmetics (Newsletter #9)
  7. Different clock granularities for gettimeofday (Newsletter #11)
  8. Restrictions on record length for direct I/O files (Newsletter #19)
  9. Implied DO loop is not "vectorized" on the T3D (Newsletter #20)
  10. Missing Linpack and Eispack routines in libsci (Newsletter #25)
  11. F90 manual for Y-MP, no manual for T3D (Newsletter #31)
  12. RANF() and its manpage differ between machines (Newsletter #37)
  13. CRAY2IEG is available only on the Y-MP (Newsletter #40)
  14. Missing sort routines on the T3D (Newsletter #41)
I encourage users to e-mail in differences that they have found, so we all can benefit from each other's experience.
Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.