Wednesday, February 19, 2014

Worklist Algorithms and Parallel Iterators

ParaSail has some very general parallel iterator constructs.  One of the more unusual involves the use of a continue statement to initiate a new iteration of an outer loop statement, as illustrated in this breadth-first search of a graph:

        var Seen : Array<Atomic<Boolean>, Indexed_By => Node_Id> :=
          [for I in 1 .. |DGraph.G| => Create(#false)];

       *Outer*
        for Next_Set => Root_Set loop  // Start with the initial set of roots
            for N in Next_Set concurrent loop  // Look at each node in set
                if not Value(Seen[N]) then
                    Set_Value(Seen[N], #true);
                    if Is_Desired_Node(DGraph.G[N]) then // Found a node
                        return N;
                    end if;
                    // Continue outer loop with successors of this node
                    continue loop Outer with Next_Set => DGraph.G[N].Succs;
                end if;
            end loop;
        end loop Outer;
        // No node found
        return null;

It has been a bit hard to explain what it means for an inner concurrent loop to continue an outer loop.   More recently, we have developed an alternative explanation.  The basic idea is to appeal to the notion of a worklist algorithm.  Various (sequential) algorithms are organized around the use of a list of work items to be processed, with the overall algorithm continuing until the worklist is empty, or, for a search, until an item of interest is found.  For example, the work-list approach to data flow analysis algorithms is described here:
Here is another worklist-based algorithm, for solving a constraint satisfaction problem using the Arc Consistency approach:
Finally, here is a breadth-first graph search algorithm, which uses a work queue instead of a work list:
So one way of explaining this kind of ParaSail iterator is as a built-in work-list-based iterator, where the loop keeps going until there are no more work items to process, and continue adds a new work-item to the worklist.  This approach makes it clear that such iterators can work even if there is only one physical processor available, and also suggests that the work items are not full threads of control, but rather light-weight representations of some work to be done.  This terminology of work items and work list or work queue also matches well with the terminology of the underlying work stealing scheduling mechanism in use.

We used to refer to such ParaSail's parallel iterators as a bag of threads, but now it seems easier to describe each iterator as having its own list of work items (i.e. a worklist) and the loop keeps going until the worklist is empty, or until we encounter a return or exit (aka break) statement.

One interesting detail at this point is that for a breadth-first search, you want the work list/queue to be processed in a FIFO manner, while for a depth-first search, you want to process the work list/queue as a stack, in a LIFO manner.  The work-stealing scheduler we use does some of both, using FIFO when stealing, and LIFO when the work-items are served by the same core that generated them.  This argues for perhaps giving more control to the programmer when using an explicit continue statement, to be able to specify FIFO or LIFO.  However, it is not entirely clear if that added complexity is worthwhile.

No comments:

Post a Comment