Search CORE

8,409 research outputs found

IST Austria Thesis

Author: Tarrach Thorsten
Publication venue: IST Austria
Publication date: 01/01/2016
Field of study

In this thesis we present a computer-aided programming approach to concurrency. Our approach helps the programmer by automatically fixing concurrency-related bugs, i.e. bugs that occur when the program is executed using an aggressive preemptive scheduler, but not when using a non-preemptive (cooperative) scheduler. Bugs are program behaviours that are incorrect w.r.t. a specification. We consider both user-provided explicit specifications in the form of assertion statements in the code as well as an implicit specification. The implicit specification is inferred from the non-preemptive behaviour. Let us consider sequences of calls that the program makes to an external interface. The implicit specification requires that any such sequence produced under a preemptive scheduler should be included in the set of sequences produced under a non-preemptive scheduler. We consider several semantics-preserving fixes that go beyond atomic sections typically explored in the synchronisation synthesis literature. Our synthesis is able to place locks, barriers and wait-signal statements and last, but not least reorder independent statements. The latter may be useful if a thread is released to early, e.g., before some initialisation is completed. We guarantee that our synthesis does not introduce deadlocks and that the synchronisation inserted is optimal w.r.t. a given objective function. We dub our solution trace-based synchronisation synthesis and it is loosely based on counterexample-guided inductive synthesis (CEGIS). The synthesis works by discovering a trace that is incorrect w.r.t. the specification and identifying ordering constraints crucial to trigger the specification violation. Synchronisation may be placed immediately (greedy approach) or delayed until all incorrect traces are found (non-greedy approach). For the non-greedy approach we construct a set of global constraints over synchronisation placements. Each model of the global constraints set corresponds to a correctness-ensuring synchronisation placement. The placement that is optimal w.r.t. the given objective function is chosen as the synchronisation solution. We evaluate our approach on a number of realistic (albeit simplified) Linux device-driver benchmarks. The benchmarks are versions of the drivers with known concurrency-related bugs. For the experiments with an explicit specification we added assertions that would detect the bugs in the experiments. Device drivers lend themselves to implicit specification, where the device and the operating system are the external interfaces. Our experiments demonstrate that our synthesis method is precise and efficient. We implemented objective functions for coarse-grained and fine-grained locking and observed that different synchronisation placements are produced for our experiments, favouring e.g. a minimal number of synchronisation operations or maximum concurrency

IST Austria: PubRep (Institute of Science and Technology)

Bounded Concurrent Timestamp Systems Using Vector Clocks

Author: Haldar Sibsankar
Vitanyi Paul
Publication venue
Publication date: 02/08/2001
Field of study

Shared registers are basic objects used as communication mediums in asynchronous concurrent computation. A concurrent timestamp system is a higher typed communication object, and has been shown to be a powerful tool to solve many concurrency control problems. It has turned out to be possible to construct such higher typed objects from primitive lower typed ones. The next step is to find efficient constructions. We propose a very efficient wait-free construction of bounded concurrent timestamp systems from 1-writer multireader registers. This finalizes, corrects, and extends, a preliminary bounded multiwriter construction proposed by the second author in 1986. That work partially initiated the current interest in wait-free concurrent objects, and introduced a notion of discrete vector clocks in distributed algorithms.Comment: LaTeX source, 35 pages; To apper in: J. Assoc. Comp. Mac

arXiv.org e-Print Archive

CiteSeerX

International Migration, Integration and Social Cohesion online publications

Lock-free Concurrent Data Structures

Author: Cederman Daniel
Gidenstam Anders
Ha Phuong
Papatriantafilou Marina
Sundell Håkan
Tsigas Philippas
Publication venue
Publication date: 01/01/2013
Field of study

Concurrent data structures are the data sharing side of parallel programming. Data structures give the means to the program to store data, but also provide operations to the program to access and manipulate these data. These operations are implemented through algorithms that have to be efficient. In the sequential setting, data structures are crucially important for the performance of the respective computation. In the parallel programming setting, their importance becomes more crucial because of the increased use of data and resource sharing for utilizing parallelism. The first and main goal of this chapter is to provide a sufficient background and intuition to help the interested reader to navigate in the complex research area of lock-free data structures. The second goal is to offer the programmer familiarity to the subject that will allow her to use truly concurrent methods.Comment: To appear in "Programming Multi-core and Many-core Computing Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and Distributed Computin

arXiv.org e-Print Archive

Chalmers Research

Admit your weakness: Verifying correctness on TSO architectures

Author: A Gotsman
C Calcagno
D Amit
DJ Sorin
J Aspnes
J Derrick
J Derrick
J Derrick
JM Spivey
L Lamport
M Herlihy
M Herlihy
N Shavit
N Shavit
P Sewell
S Burckhardt
S Doherty
S Owens
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015
Field of study

“The final publication is available at http://link.springer.com/chapter/10.1007%2F978-3-319-15317-9_22 ”.Linearizability has become the standard correctness criterion for fine-grained non-atomic concurrent algorithms, however, most approaches assume a sequentially consistent memory model, which is not always realised in practice. In this paper we study the correctness of concurrent algorithms on a weak memory model: the TSO (Total Store Order) memory model, which is commonly implemented by multicore architectures. Here, linearizability is often too strict, and hence, we prove a weaker criterion, quiescent consistency instead. Like linearizability, quiescent consistency is compositional making it an ideal correctness criterion in a component-based context. We demonstrate how to model a typical concurrent algorithm, seqlock, and prove it quiescent consistent using a simulation-based approach. Previous approaches to proving correctness on TSO architectures have been based on linearizabilty which makes it necessary to modify the algorithm’s high-level requirements. Our approach is the first, to our knowledge, for proving correctness without the need for such a modification

Crossref

Brunel University Research Archive

Macroservers: An Execution Model for DRAM Processor-In-Memory Arrays

Author: Sterling Thomas L.
Zima Hans P.
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/2000
Field of study

The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Newer developments provide powerful parallel processing capabilities on the chip, exploiting the facility to load wide words in single memory accesses and supporting complex address manipulations in the memory. Furthermore, large arrays of PIMs can be arranged into a massively parallel architecture. In this report, we describe an object-based programming model based on the notion of a macroserver. Macroservers encapsulate a set of variables and methods; threads, spawned by the activation of methods, operate asynchronously on the variables' state space. Data distributions provide a mechanism for mapping large data structures across the memory region of a macroserver, while work distributions allow explicit control of bindings between threads and data. Both data and work distributuions are first-class objects of the model, supporting the dynamic management of data and threads in memory. This offers the flexibility required for fully exploiting the processing power and memory bandwidth of a PIM array, in particular for irregular and adaptive applications. Thread synchronization is based on atomic methods, condition variables, and futures. A special type of lightweight macroserver allows the formulation of flexible scheduling strategies for the access to resources, using a monitor-like mechanism

CiteSeerX

Caltech Authors