Projects
From Sampa
Contents |
Deterministic Shared Memory Multiprocessing (DMP)
One of the main reasons it is difficult to write multithreaded code is that current shared-memory multicore systems can execute code nondeterministically. Each time a multithreaded application runs, it can produce a different output even if supplied with the same input. This frustrates debugging efforts and limits the ability to properly test multithreaded code, becoming a major obstacle to widespread adoption of parallel programming.
Past efforts to address the problem of nondeterminism have primarily focused on deterministic replay and deterministic parallel programming models. The former is useful only for debugging, while the latter is typically domain-specific. In contrast, this project envisions that shared memory multiprocessor systems should always behave deterministically when executing any shared-memory parallel program. The core idea is to make inter-thread communication appear to be fully deterministic by guaranteeing equivalence to a deterministic serialized execution. This can be made efficient by employing speculation and exploiting properties of memory sharing behavior of applications.
This project includes: (1) developing an architecture for efficient deterministic multiprocessing, from mechanisms to the hardware/software interface, (2) addressing system issues, e.g., executing an operating system in deterministic multiprocessors, and (3) leveraging the deterministic multiprocessing hardware/software interface to create tools for debugging, testing and bug avoidance.
Recent results:
CoreDet: A Compiler and Runtime System for Deterministic Multithreaded Execution (ASPLOS 2010), Tom Bergan, Owen Anderson, Joseph Devietti, Luis Ceze and Dan Grossman.
DMP:Deterministic Shared Memory Multiprocessing (ASPLOS 2009), Joseph Devietti, Brandon Lucia, Luis Ceze and Mark Oskin. (An earlier version appeared in SHCMP'08 held with ISCA'08).
Dynamic Concurrency Bug Detection and Avoidance
With software complexity growing at a very fast pace comes software reliability issues. Today software failures pose a significant cost to our economy --- NIST estimates around $60B per year. Pervasive parallelism is likely to further increase the chances of software failures. While one could argue that the best way to address these challenges is to improve software engineering, it is likely that the problem will haunt us for a while due to legacy systems and the time required to fundamentally change the way software is developed. For that reason, we believe that a way of alleviating the problem is to design architectures and systems that significantly decrease the probability that software bugs manifest themselves and lead to a failure.
We are developing systems that can detect and survive a broad class of concurrency bugs. Concurrency bugs manifest nondeterministically, i.e., they depend upon the occurrence of certain bad interleavings. Interestingly, a given memory semantics exposed to the software can yield multiple valid global interleavings of memory operations. One can allow only a subset of these interleavings to avoid concurrency bugs while still exposing the same memory semantics to the software. We leverage this property and avoid concurrency bugs by monitoring when one is likely to happen and choosing a less dangerous interleaving. Since we don't want bugs to go simply unnoticed, we are also developing mechanism to reports bugs back to the programmer.
Our first step towards that goal was Atom-Aid, which is a multiprocessor architecture that can both detect and avoid atomicity violations. We are now expanding the class of bugs we can detect and avoid and are developing new machine learning-based techniques.
Recent results:
Atom-Aid: Detecting and Surviving Atomicity Violations (ISCA 2008), Brandon Lucia, Joseph Devietti, Karin Strauss and Luis Ceze. Selected to appear in IEEE Micro Top Picks 2009.
Finding Concurrency Bugs with Context-Aware Communication Graphs, (MICRO 2009), Brandon Lucia, Luis Ceze.
Support for Concurrency Exceptions
We argue that concurrency errors should be fail-stop. We want to put concurrency errors in the same category as segmentation fault and division-by-zero. This would make non-determinism in multithreaded execution be much more manageable. Concurrency exceptions would improve the debugging process during development and make crashes due to concurrency errors that happen in the field be more descriptive. Our goal with this project is to justify our position, propose a general approach to concurrency exceptions and discuss system requirements and implications. Specifically, we are investigating the semantics of concurrency exceptions at the language level, their implications in the compiler and runtime systems, how they should be delivered and finally how all they are enabled by efficient architecture support.
Recent results:
The Case for System Support for Concurrency Exceptions (USENIX HotPar 2009), Luis Ceze, Joseph Devietti, Brandon Lucia and Shaz Qadeer.
Organized Sharing (OSHA)
Being able to easily communicate between threads via shared memory is both a blessing and a curse. It is a blessing because it is very easy to move data and a curse because it is very easy to make mistakes. This project investigates a new way to describe the communication pattern of shared-memory concurrent programs. This investigation includes defining precise semantics for the meaning of interthread communication, tool-building to provide benefit to programmers, and careful evaluation to determine where the new way is most useful. The initial focus is on dynamic program analysis and its software-engineering benefits, but the proposed work crosses several areas, including static analysis, programming-language design, compiler optimization, and interactions with multicore hardware. This project will provide new insight into the structure of concurrent programs and how novel tools, analyses, and languages can assist programmers. The fundamental research question is how can we build and take advantage of a programming model that describes the communication structure of shared-memory programs in terms of the program structure.