Previous: Introduction Up: Synchronization Next: Single-Assignment Arguments and Return Values
Consider the following code:
{ sync int a,b,c,d; par { a = b+c; b = 2; c = 3; d = a+c; } //at this point, a=5, b=2, c=3, d=8 }
The data dependencies in this calculation control the flow
of control. The first thread of execution, that defines the
single-assignment integer a, cannot proceed until both
b and c have been defined. The second and third
threads of control can proceed immediately with the definition
of b and c respectively. Once a and c
have been defined, then the fourth thread of control can define
d. Notice that in this
example, d could have been a mutable integer, since
it is not shared between any threads. The flow of control
for this code is schematically represented in
Figure .
As another example, consider the problem of calculating
all the powers of 2 from 0 to N-1. We use the fact that
the power of 2 can be calculated as
.
{ sync int P[N]; P[0] = 1; parfor (int i=1; i<N; i++) P[i] = 2*P[i-1]; }
Recall that we do not know in what order or in what interleaving
the threads of execution created by a parfor statement
will be executed. The semantics of single-assignment variables,
however, guarantee that a value will not be assigned to P[i]
until a value has been assigned to P[i-1]. The data
dependencies for this program are represented in Figure .
Notice how the strict linear data dependency of this example
constrains the execution to essentially a sequential one.
It is also worthwhile mentioning that this example can be simply rewritten as a correct sequential program by replacing the parfor statement with a for statement. Of course, this need not be the case in general. For example, if the bounds for the loop are reversed, so that i begins with a value N-1 and is decremented to 1, the corresponding sequential program would no longer be correct. This correspondence between sequential and parallel programs suggests methods of systematic parallelization of certain kinds of sequential code structures. It also suggests a deterministic debugging methodology for parallel CC++ programs.
With a slight modification to the previous example, the amount of parallelism possible can be dramatically improved. Consider:
sync int P[N]; P[0] = 1; P[1] = 2; parfor (int i=2; i<N; i=i+2) par { P[i] = P[i/2] * P[i/2]; P[i+1] = P[i/2] * P[i/2] * 2; }
This program makes use of the fact that the
power of 2 can be calculated as
when
is even, and
when
is odd. This modifies the data dependencies
in the computation from a linear structure (as seen
in Figure
) to a tree structure,
represented in Figure
.
Again notice that this program can be simply rewritten as a correct sequential program by replacing the parfor statement with a for statement, and replacing the parallel block with a sequential one.