Computer architecture is about to undergo,
if not another revolution, then a vigorous shaking-up. The major chip
manufacturers have, for the time being, simply given up trying to make
processors run faster. Instead, they have recently started shipping ``multicore''
architectures, in which multiple processors (cores) communicate directly
through shared hardware caches, providing increased concurrency instead
of increased clock speed. This course will cover the basics of non-blocking
synchronization in multiprocessors. We will cover the relative power
of synchronization instructions, universal non-blocking constuctins,
as well as specific wait-free, lock-free, and obstruction-free algorithms
and data structures.
Atomicity analysis for concurrent software
Ensuring the correctness of multithreaded programs
is difficult, due to the potential for unexpected and nondeterministic
interactions between threads. This problem has traditionally been addressed
by devising tools for detecting race conditions, a
situation where two threads simultaneously access the same data variable,
and at least one of the accesses is a write. However, verifying the
absence of such simultaneous-access race conditions is neither necessary
nor sufficient to ensure the absence of errors due to unexpected thread
interactions. Recent work has proposed a stronger non-interference
property is required, namely atomicity. Atomic methods can
be assumed to execute serially, without interleaved steps of other
threads. Thus, atomic methods are amenable to sequential reasoning
techniques, which significantly simplifies both formal and informal
reasoning about program correctness. My lectures will introduce the
atomicity specification and present various analyses for checking this
Dissecting Transactional Semantics and Implementations
Suresh Jagannathan, Jan Vitek
There has been much recent interest
in exploring the role of software transactions as an alternative concurrency
control mechanism to mutual exclusion locks. While transactions can
provide numerous benefits with respect to programmability, safety,
and performance, their semantics and implementations are necessarily
However, by carefully examining different
feature sets for software transactions, we can better understand their
utility and limitations. In this course, we will cover semantic frameworks
for software transactions, and implementations inspired by these frameworks.
Topics include issues related to atomicity, nesting, multit-threaded
transactions, and language memory models.
We will also describe how to leverage
insights from the semantics to define implementations, not only for general-purpose
concurrent programming, but also for more constrained, but equally challenging
domains. Examples will include safe software-level speculation, atomic
regions for realtime systems, and lightweight checkpointing.
Adaptive Algorithms for new Parallel Supports
Bruno RAFFIN, Jean-Louis ROCH and Denis TRYSTRAM
Today, most of the high performance computations run on new parallel and
distributed systems (clusters or grid computing). They are characterized
by new features including heterogeneity and versatility of the resources.
Some computing or communication resources evolve (they become available
or are temporary utilized by other applications). This changing context
occurs not only for the resources, but also in the application itself:
an algorithm can be modified to react to some dynamic changes. Thus,
we have to imagine mechanisms able to adapt their behaviour to changes
during the execution without slow down the running time. This talk
will investigate several ways to address this important problem of
adaptivity focusing at the middleware level.
The design, implementation, and use of java.util.concurrent
This short course presents the synchronization
and concurrency control constructs available since Java release 5.0.
The presentation includes the discussion of some new blocking and non-blocking
algorithms designed for this package, adaptations of others, engineering
issues that arise in their development and widespread use, and examples
of practical usages. This course will alternate between lectures and
short hands-on exercises.
Bisimulation and co-induction
Bisimulation has been introduced
in Concurrency Theory as a way to establish when two processes are "the
same", i.e., undistinguishable to an external observer interacting
with them. Such a notion of undistinguishability is at the heart of
any theory of processes. The success of bisimulation is largely due
to the associated bisimulation proof method that can be used to prove
bisimilarities among processes.
Bisimulation is an instance of the
more general notion of co-induction. Nowadays, bisimulation and co-induction
are widely used, not only in Concurrency, but, more broadly, in Computer
Science, in a number of areas, including: functional languages, type
theory, data types, domains, databases, compiler optimisations, program
analysis, verification tools.
In the lectures, I will present bisimulation
and co-induction. In particular I will discuss their associated proof
method. If time permits, I will also discuss some questions and problems
that today are still unsettled.
A Theoretical Basis of Communication-Centred Programming
for Web Services
This lecture introduces a formalisation
of a novel programming discipline for Web Service, based on communication-centred
specifications. This work is based on long conversations and collaborations
with W3C's working group on a description language for web services called
choreography description language (CDL). My participation in W3C's standardisationon
process started three years ago, when three pi-calculus experts (Milner,
Honda and Yoshida) were invited to the CDL WG. Our specification resulted
in a standard which will become a W3C recommendation soon. It has also
led to a new principle of programming which consists of two distinct
ways for describing communication-centric software, one centring on global
message flows and another centring on end-point behaviour, distilled
as formal calculi. The two paradigms of description share a common feature,
structured representation of communications. The global calculus originates
from CDL itself, while the local calculus is based on the pi-calculus,
one of the representative calculi for communicating processes.
The theoretical part of the lecture
will illustrate these two frameworks using simple but non-trivial examples,
present the static and dynamic semantics of the calculi, and show a
basic theory of end-point projection --- that any well-formed description
in the global calculus has a precise representation in the local calculus.
The theory is intended to be a step towards a pi-calculus-based foundation
of Web Service technologies.
X10: Computing at Scale
We present the design of X10,
a modern programming language intended to address the twin challenges
of high performance and high productivity on high-end computers operating
with hundreds of thousands of hardware threads. At such scales, latency
and bandwidth can vary widely across the machine, making it impractical
to support (in mainstram architectures) a programming model based on
uniform shared memory. Similarly, it does not appear that at such scales
the SPMD model can provide the needed flexibility and performance (e.g.
because of load imbalance).
X10 develops a few simple and powerful
ideas. First, like other Partitioned Global Address Space languages,
X10 abandons the notion of a monolithic address space: a *place* contains
data, together with one or more activities operating on the data. A computation
may have millions of places. Aggregate data-structures, such as multi-dimensional
arrays, may be scattered across multiple places. Second, like Cilk, X10
emphasizes asynchronous, lightweight activities (recursive parallelism)
rather than SPMD computations. An activity may spawn other activities
locally or in remote places. Data must be operated upon only by activities
in that place. Multiple activities mutating shared data may use atomic
blocks (atomic stm) to ensure that data invariants are preserved. An
activity may wait for subactivities to terminate before progressing (finish
stm). Clocks (dynamic barriers) can be created dynamically and permit
activities registered on the clock to progress determinately in coordinated
Third, X10 leverages the flexibility,
versatility and modularity of the mainstream object-oriented model. X10
may be viewed as an extension of sequential Java to support concurrency
and distribution. Fourth, the X10 design guarantees that the programmer
cannot commit large classes of errors. X10 adds to the guarantees (such
as type safety, memory safety) it inherits from Java: X10 computations
that do not use conditional atomic blocks cannot deadlock.
Taken together, these ideas provide
a very rich, scalable and yet disciplined framework for concurrent programming
over distributed datastructures. X10 covers those areas which are today
handled by a combination of OpenMP, MPI and SPMD languages. We illustrate
through several programming examples (e.g. from NAS parallel benchmarks)
and discuss implementation issues.
Acute: Type-Safe and Version-Safe Distributed Programming
This course will discuss the
main programming language design issues addressed in the Acute language,
which supports type-safe and abstraction-safe communication, rebinding
to local resources, version change, and computation mobility. This
builds on the pi calculus and on the type theory underlying ML-style
module systems, and the first part of the course will introduce the