Structuring

Previous: Introduction Up: Structured Parallel Blocks: par Next: Sharing Data

Structuring

The key feature of a parallel block is the structuring it provides to parallel code. This structuring is a result of the implicit barrier defined by a parallel block. As discussed above, a parallel block is defined to terminate only when all statements inside the block have terminated. Thus, the closing brace of a parallel block represents a barrier that all parallel threads of control within the block must reach, before the block is terminated.

To illustrate, consider:


{
  par  { 
    a = f(2);
    b = f(31);
    c = g(2,6.5);
  }
  sum = a+b+c; 
}

Schematically, the flow of control is represented in Figure . Notice that the individual threads that assign values to a, b, and c may terminate in any order, at various times. The block, however, is not terminated until all three have completed. Because the outer block is sequential, only when the parallel block terminates can the statement assigning a value to sum execute, as you would expect from knowing C.

Similarly, consider the following example:


{
  int min, max; 
  par  { 
    min = find_min(A); 
    max = find_max(A); 
  } 
  for (i=min; i<=max; i++) 
    compute(i); 
}

Again, at the end of the parallel block, we know that both findmin() and findmax() functions have completed and their return values have been assigned to min and max respectively.

The same principal applies when a parallel block is part of an enclosing, perhaps iterating, structure. Consider the following example:


{
  int pow[N]; 
  pow[0] = 1; 
  pow[1] = 2;
  for (int i=2; i<N-1; i=i+2)
    par  { 
      pow[i]   = pow[i/2]*pow[i/2]; 
      pow[i+1] = pow[i/2]*pow[i/2]*2;
    }
}

This concept of an implicit barrier is an extremely powerful and expressive one. Many sequential C or C++ programs can be conveniently transformed into parallel programs with judicious use of the parallel block. The following example finds the minimum element in a global array. The technique used is that of divide and conquer, where the minimum is found as the smaller of the minimum of the first half and the minimum of the second half of the array.


int A[N];
int min_element (int i, int j)
{
  if (i==j) return A[i]; 
  else  {
    int small1, small2; 
    int middle = (i+j)/2;
    par  { 
      small1 = min_element(i,middle); 
      small2 = min_element(middle+1,j);
    } 
    if (small1<small2) return small1; 
      else             return small2; 
  } 
}

Notice that if the keyword par is removed, the resulting program is a correct solution (albeit a sequential one) to the problem of finding the minimum element. The parallel block allows the independent branches of the recursion to proceed in parallel. The implicit barrier of the parallel block means that the values of small1 and small2 have been evaluated and assigned before the comparison between the two is made. The flow of control in this program is illustrated in Figure .

paolo@cs.caltech.edu