Wednesday, March 9, 2011

WoDet 2011, ParaSail, and back to the future with Value Semantics

This past Sunday WoDet 2011, the 2nd Workshop on Determinism and Correctness in Parallel Programming was held in Newport Beach, CA.  It was a very interesting event, with a number of excellent talks and a lively debate at the end on the topic of whether current programming models are sufficient given the right tools, etc., or whether we need fundamentally new models to support programming going forward in a multi-core world. 

After listening to the various talks at the workshop, and while thinking about the debate topic, it seemed that ParaSail aligned nicely with an underlying thread in the research community moving toward, or perhaps returning to, what might best be called value semantics.  A number of the talks described one variant or another of multi-version concurrency control, which at its heart involves giving different threads their own private copy/version/snapshot of an object, and then dealing with merging any updates at specific later points, possibly rolling back one of the threads if no meaningful merge is possible.  This is in contrast to an approach where multiple threads share access to a single object, and use some kind of locking or "careful" lock-free operations to make direct updates to the shared object.  The multi-version approach results in fewer interactions between the threads, fewer synchronization points, and hence generally higher throughput in a highly parallel world. 

Looked at another way, the multi-version approach means that objects are being passed around by value rather than by reference, and updating the original object is done after all the new "values" have been generated independently and then appropriately merged (perhaps using some kind of parallel merging process).  This isn't going all the way to pure functional programming, because updates can eventually happen, but the bulk of the processing is performed on object values, with no underlying sharing.

If we go all the way back to the original versions of BASIC, a la Kemeny and Kurtz or Microsoft CP/M BASIC, pointers didn't exist, but there were strings, and you could create them and concatenate them and assign them, and you never by mistake overwrote memory because you concatenated into a too-short buffer, nor did you ever have two variables unintentionally referring to the "same" string in memory, such that changing one would unintentionally alter what you had in the other (even though the implementation might choose to share the space until one of them was altered). 

These kinds of value-semantics strings still exist in many scripting languages, but they have essentially disappeared from compiled languages in the years since BASIC was widely used (and compiled, in some cases).  In almost all modern compiled languages, strings are manipulated by reference, though in some languages they are immutable which to some extent skirts the issue of by-value vs. by-reference semantics.  And certainly when you get to larger, more abstract objects, by-reference semantics rules, with only the simplest of "struct" objects being manipulated by value.  And even when declared objects have value semantics, parameters generally revert to by-reference semantics, again introducing the possibility of unintended aliasing between two distinctly named objects.  (Or if by-value semantics are available for parameter passing (as in C++), it can result in "slicing" of the object, so that it is truncated from its original run-time type to the compile-time type of the formal parameter.)

Unintended aliasing due to by-reference semantics can cause nasty (but still reproducible) bugs in any sequential language that uses by-reference semantics, but in a parallel language, such bugs can become race conditions, which can result in hard-to-reproduce, timing-dependent bugs.

In any case, the net effect of the various ParaSail rules that restrict aliasing, eliminate reassignable pointers, disallow global variables, etc. is that ParaSail provides value semantics for all non-concurrent objects, of arbitrarily complex types.  Note that this does not imply that objects need to be copied when passed as a parameter.  Rather the ParaSail rules allow non-concurrent objects to be passed by reference or by copy, with the same results. This allows efficient by-reference parameter passing when the call stays within the same address space, but allows transferring the object between address spaces, such as might be needed by a call between a CPU and a GPU.  This essentially value semantics eliminates a large class of common bugs, and makes it straightforward for the compiler to detect and disallow at compile-time any possible race conditions.

As implied by the growing interest in multi-version concurrency control in the research community, a return to the world of value semantics might be a good answer for the multi-core challenges.


  1. One could say that the end point of multiversion value semantics, in terms of numerical algorithms, are those "asynchronous" or "chaotic" iterations that people considered a while back. In these algorithms, parallel processors share state and pass messages in a loosely coupled way, which is guaranteed to converge eventually as long as some messages get through. Different processors see different versions of "neighboring" processors' state, depending on when the messages get through.

    The trouble is, existing parallel programming environments aren't good for implementing such algorithms, where messages might fail to get through, etc. Also, we linear algebra folks haven't thought much in those terms, so we keep demanding global synchronization and tightly coupled communication, where perhaps less tightly coupled communication (or at least less frequent communication, which is my interest) would do.

    How would you sell multiversion value semantics outside of, say, the community of social networking developers (since it doesn't matter much in that world whether Alice and Bob see updates from Carla within a second of each other)?

  2. Using multi-version/value semantics doesn't imply non-determinism. In fact, it is in some ways seen as a mechanism to achieve determinism without overly constraining the relative speeds of processing between the forking off of a new value and the later merging back in. This is in part because the more frequent is the interaction between threads, the more difficult it is to achieve determinism without a lot of time wasted by threads waiting for one another to ensure that their updates happen in a predetermined order.

    I don't know that these multi-version approaches necessarily are designed to handle lost messages. That would almost certainly seem to require a quite different algorithm, as you suggest, which is designed to converge "eventually" even with the loss of some proportion of the messages.

    So yes the multi-version approach allows a looser coupling between threads, but I think they tend to presume reliable communication channels.

  3. Thanks for the clarification -- while multiversion/value semantics don't imply nondeterminism, they do fit together nicely with nondeterministic algorithms. I can do things in my world about the cost of synchronization, by rearranging computations to synchronize less. What worries me is the cost of message reliability, especially for very large parallel machines. So perhaps the problem I have to solve is different than the problem that multiversion/value semantics wants to solve: the latter is an effort to relax synchronization for algorithms that naturally need to synchronize a lot. Would you agree?

  4. Yes, I think it is reasonable to see multiversion/value semantics as helping to reduce the frequency of synchronization. If desired, I think you have to explicitly handle lost messages in your algorithm, with time-outs, etc. For what it is worth, ParaSail has a nice way of supporting timeouts, allowing the programmer to specify distinct actions to be taken on timeout vs. normal completion before the timer expires.

  5. Timeouts seem like the right approach for the local communication resulting from things like applying a sparse matrix to a vector. For global communication (e.g., reductions), it seems like the timeouts should hide inside the reduction. Anyway, thanks for the note :-)