Previous: Introduction Up: Structured Parallel Blocks: par Next: Sharing Data
The key feature of a parallel block is the structuring it provides to parallel code. This structuring is a result of the implicit barrier defined by a parallel block. As discussed above, a parallel block is defined to terminate only when all statements inside the block have terminated. Thus, the closing brace of a parallel block represents a barrier that all parallel threads of control within the block must reach, before the block is terminated.
To illustrate, consider:
{ par { a = f(2); b = f(31); c = g(2,6.5); } sum = a+b+c; }
Schematically, the flow of control is represented in
Figure .
Notice that the individual threads that assign values to
a, b, and c may terminate in any order, at various times. The
block, however, is not terminated until all three have completed.
Because the outer block is sequential,
only when the parallel block
terminates can the statement assigning a value to sum
execute, as you would expect from knowing C.
Similarly, consider the following example:
{ int min, max; par { min = find_min(A); max = find_max(A); } for (i=min; i<=max; i++) compute(i); }
Again, at the end of the parallel block, we know that both findmin() and findmax() functions have completed and their return values have been assigned to min and max respectively.
The same principal applies when a parallel block is part of an enclosing, perhaps iterating, structure. Consider the following example:
{ int pow[N]; pow[0] = 1; pow[1] = 2; for (int i=2; i<N-1; i=i+2) par { pow[i] = pow[i/2]*pow[i/2]; pow[i+1] = pow[i/2]*pow[i/2]*2; } }
This concept of an implicit barrier is an extremely powerful and expressive one. Many sequential C or C++ programs can be conveniently transformed into parallel programs with judicious use of the parallel block. The following example finds the minimum element in a global array. The technique used is that of divide and conquer, where the minimum is found as the smaller of the minimum of the first half and the minimum of the second half of the array.
int A[N]; int min_element (int i, int j) { if (i==j) return A[i]; else { int small1, small2; int middle = (i+j)/2; par { small1 = min_element(i,middle); small2 = min_element(middle+1,j); } if (small1<small2) return small1; else return small2; } }
Notice that if the keyword par is removed, the resulting
program is a correct solution (albeit a sequential one) to the
problem of finding the minimum element. The parallel block allows
the independent branches of the recursion to proceed in parallel.
The implicit barrier of the parallel block means that the values
of small1 and small2 have been evaluated and assigned
before the comparison between the two is made. The flow of control
in this program is illustrated in Figure .