Search CORE

10,652 research outputs found

The Parallel Persistent Memory Model

Author: Berryhill R.
Blelloch G. E.
Buettner M.
Chauhan H.
Herlihy M.
JaJa J.
Lee S. K.
Meena J. S.
Nawab F.
Pelley S.
Woude J. Van Der
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/06/2018
Field of study

We consider a parallel computational model that consists of

P

processors, each with a fast local ephemeral memory of limited size, and sharing a large persistent memory. The model allows for each processor to fault with bounded probability, and possibly restart. On faulting all processor state and local ephemeral memory are lost, but the persistent memory remains. This model is motivated by upcoming non-volatile memories that are as fast as existing random access memory, are accessible at the granularity of cache lines, and have the capability of surviving power outages. It is further motivated by the observation that in large parallel systems, failure of processors and their caches is not unusual. Within the model we develop a framework for developing locality efficient parallel algorithms that are resilient to failures. There are several challenges, including the need to recover from failures, the desire to do this in an asynchronous setting (i.e., not blocking other processors when one fails), and the need for synchronization primitives that are robust to failures. We describe approaches to solve these challenges based on breaking computations into what we call capsules, which have certain properties, and developing a work-stealing scheduler that functions properly within the context of failures. The scheduler guarantees a time bound of

O(W/P_A + D(P/P_A) \lceil\log_{1/f} W\rceil)

in expectation, where

W

and

D

are the work and depth of the computation (in the absence of failures),

P_A

is the average number of processors available during the computation, and

f \le 1/2

is the probability that a capsule fails. Within the model and using the proposed methods, we develop efficient algorithms for parallel sorting and other primitives.Comment: This paper is the full version of a paper at SPAA 2018 with the same nam

arXiv.org e-Print Archive

Crossref

DSpace@MIT

Extending and Implementing the Self-adaptive Virtual Processor for Distributed Memory Architectures

Author: Koivisto Juha
van Tol Michiel W.
Publication venue
Publication date: 01/01/2011
Field of study

Many-core architectures of the future are likely to have distributed memory organizations and need fine grained concurrency management to be used effectively. The Self-adaptive Virtual Processor (SVP) is an abstract concurrent programming model which can provide this, but the model and its current implementations assume a single address space shared memory. We investigate and extend SVP to handle distributed environments, and discuss a prototype SVP implementation which transparently supports execution on heterogeneous distributed memory clusters over TCP/IP connections, while retaining the original SVP programming model

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications

A Flexible Framework For Implementing Multi-Nested Software Transaction Memory

Author: Dominic Damoah, Dr. J. B. Hayfron-Acquah, Edward D. Ansong, Selby L. Maxwell Jnr., Shamo Sebastian
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/11/2014
Field of study

Programming with locks is very difficult in multi-threaded programmes. Concurrency control of access to shared data limits scalable locking strategies otherwise provided for in software transaction memory. This work addresses the subject of creating dependable software in the face of eminent failures. In the past, programmers who used lock-based synchronization to implement concurrent access to shared data had to grapple with problems with conventional locking techniques such as deadlocks, convoying, and priority inversion. This paper proposes another advanced feature for Dynamic Software Transactional Memory intended to extend the concepts of transaction processing to provide a nesting mechanism and efficient lock-free synchronization, recoverability and restorability. In addition, the code for implementation has also been researched, coded, tested, and implemented to achieve the desired objectives

International Journal on Recent and Innovation Trends in Computing and Communication

Coded Caching for Delay-Sensitive Content

Author: Maddah-Ali Mohammad Ali
Niesen Urs
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/07/2014
Field of study

Coded caching is a recently proposed technique that achieves significant performance gains for cache networks compared to uncoded caching schemes. However, this substantial coding gain is attained at the cost of large delivery delay, which is not tolerable in delay-sensitive applications such as video streaming. In this paper, we identify and investigate the tradeoff between the performance gain of coded caching and the delivery delay. We propose a computationally efficient caching algorithm that provides the gains of coding and respects delay constraints. The proposed algorithm achieves the optimum performance for large delay, but still offers major gains for small delay. These gains are demonstrated in a practical setting with a video-streaming prototype.Comment: 9 page

arXiv.org e-Print Archive

CiteSeerX

Crossref

Recommended from our members

Weakening WebAssembly

Author: Pichon-Pharabod Jean
Rossberg Andreas
Watt Conrad
Publication venue: Proceedings of the ACM on Programming Languages
Publication date: 10/10/2019
Field of study

WebAssembly (Wasm) is a safe, portable virtual instruction set that can be hosted in a wide range of environments, such as a Web browser. It is a low-level language whose instructions are intended to compile directly to bare hardware. While the initial version of Wasm focussed on single-threaded computation, a recent proposal extends it with low-level support for multiple threads and atomic instructions for synchronised access to shared memory. To support the correct compilation of concurrent programs, it is necessary to give a suitable specification of its memory model. Wasm’s language definition is based on a fully formalised specification that carefully avoids undefined behaviour. We present a substantial extension to this semantics, incorporating a relaxed memory model, along with a few proposed operational extensions. Wasm’s memory model is unique in that its linear address space can be dynamically grown during execution, while all accesses are bounds-checked. This leads to the novel problem of specifying how observations about the size of the memory can propagate between threads. We argue that, considering desirable compilation schemes, we cannot give a sequentially consistent semantics to memory growth. We show that our model guarantees Sequential Consistency of Data-Race-Free programs (SC-DRF). However, because Wasm is to run on the Web, we must also consider interoperability of its model with that of JavaScript. We show, by counter-example, that JavaScript’s memory model is not SC-DRF, in contrast to what is claimed in its specification. We propose two axiomatic conditions that should be added to the JavaScript model to correct this difference. We also describe a prototype SMT-based litmus tool which acts as an oracle for our axiomatic model, visualising its behaviours, including memory resizing

Apollo (Cambridge)