17,396 research outputs found
Maintaining consistency in distributed systems
In systems designed as assemblies of independently developed components, concurrent access to data or data structures normally arises within individual programs, and is controlled using mutual exclusion constructs, such as semaphores and monitors. Where data is persistent and/or sets of operation are related to one another, transactions or linearizability may be more appropriate. Systems that incorporate cooperative styles of distributed execution often replicate or distribute data within groups of components. In these cases, group oriented consistency properties must be maintained, and tools based on the virtual synchrony execution model greatly simplify the task confronting an application developer. All three styles of distributed computing are likely to be seen in future systems - often, within the same application. This leads us to propose an integrated approach that permits applications that use virtual synchrony with concurrent objects that respect a linearizability constraint, and vice versa. Transactional subsystems are treated as a special case of linearizability
A Template for Implementing Fast Lock-free Trees Using HTM
Algorithms that use hardware transactional memory (HTM) must provide a
software-only fallback path to guarantee progress. The design of the fallback
path can have a profound impact on performance. If the fallback path is allowed
to run concurrently with hardware transactions, then hardware transactions must
be instrumented, adding significant overhead. Otherwise, hardware transactions
must wait for any processes on the fallback path, causing concurrency
bottlenecks, or move to the fallback path. We introduce an approach that
combines the best of both worlds. The key idea is to use three execution paths:
an HTM fast path, an HTM middle path, and a software fallback path, such that
the middle path can run concurrently with each of the other two. The fast path
and fallback path do not run concurrently, so the fast path incurs no
instrumentation overhead. Furthermore, fast path transactions can move to the
middle path instead of waiting or moving to the software path. We demonstrate
our approach by producing an accelerated version of the tree update template of
Brown et al., which can be used to implement fast lock-free data structures
based on down-trees. We used the accelerated template to implement two
lock-free trees: a binary search tree (BST), and an (a,b)-tree (a
generalization of a B-tree). Experiments show that, with 72 concurrent
processes, our accelerated (a,b)-tree performs between 4.0x and 4.2x as many
operations per second as an implementation obtained using the original tree
update template
The "MIND" Scalable PIM Architecture
MIND (Memory, Intelligence, and Network Device) is an advanced parallel computer architecture for high performance computing and scalable embedded processing. It is a
Processor-in-Memory (PIM) architecture integrating both DRAM bit cells and CMOS logic devices on the same silicon die. MIND is multicore with multiple memory/processor nodes on
each chip and supports global shared memory across systems of MIND components. MIND is distinguished from other PIM architectures in that it incorporates mechanisms for efficient support of a global parallel execution model based on the semantics of message-driven multithreaded split-transaction processing. MIND is designed to operate either in conjunction with other conventional microprocessors or in standalone arrays of like devices. It also incorporates mechanisms for fault tolerance, real time execution, and active power management. This paper describes the major elements and operational methods of the MIND
architecture
Lock-free Concurrent Data Structures
Concurrent data structures are the data sharing side of parallel programming.
Data structures give the means to the program to store data, but also provide
operations to the program to access and manipulate these data. These operations
are implemented through algorithms that have to be efficient. In the sequential
setting, data structures are crucially important for the performance of the
respective computation. In the parallel programming setting, their importance
becomes more crucial because of the increased use of data and resource sharing
for utilizing parallelism.
The first and main goal of this chapter is to provide a sufficient background
and intuition to help the interested reader to navigate in the complex research
area of lock-free data structures. The second goal is to offer the programmer
familiarity to the subject that will allow her to use truly concurrent methods.Comment: To appear in "Programming Multi-core and Many-core Computing
Systems", eds. S. Pllana and F. Xhafa, Wiley Series on Parallel and
Distributed Computin
Open Transactions on Shared Memory
Transactional memory has arisen as a good way for solving many of the issues
of lock-based programming. However, most implementations admit isolated
transactions only, which are not adequate when we have to coordinate
communicating processes. To this end, in this paper we present OCTM, an
Haskell-like language with open transactions over shared transactional memory:
processes can join transactions at runtime just by accessing to shared
variables. Thus a transaction can co-operate with the environment through
shared variables, but if it is rolled-back, also all its effects on the
environment are retracted. For proving the expressive power of TCCS we give an
implementation of TCCS, a CCS-like calculus with open transactions
Software engineering and middleware: a roadmap (Invited talk)
The construction of a large class of distributed systems can be simplified by leveraging middleware, which is layered between network operating systems and application components. Middleware resolves heterogeneity and facilitates communication and coordination of distributed components. Existing middleware products enable software engineers to build systems that are distributed across a local-area network. State-of-the-art middleware research aims to push this boundary towards Internet-scale distribution, adaptive and reconfigurable middleware and middleware for dependable and wireless systems. The challenge for software engineering research is to devise notations, techniques, methods and tools for distributed system construction that systematically build and exploit the capabilities that middleware deliver
Energy-efficient and high-performance lock speculation hardware for embedded multicore systems
Embedded systems are becoming increasingly common in everyday life and like their general-purpose counterparts, they have shifted towards shared memory multicore architectures. However, they are much more resource constrained, and as they often run on batteries, energy efficiency becomes critically important. In such systems, achieving high concurrency is a key demand for delivering satisfactory performance at low energy cost. In order to achieve this high concurrency, consistency across the shared memory hierarchy must be accomplished in a cost-effective manner in terms of performance, energy, and implementation complexity. In this article, we propose Embedded-Spec, a hardware solution for supporting transparent lock speculation, without the requirement for special supporting instructions. Using this approach, we evaluate the energy consumption and performance of a suite of benchmarks, exploring a range of contention management and retry policies. We conclude that for resource-constrained platforms, lock speculation can provide real benefits in terms of improved concurrency and energy efficiency, as long as the underlying hardware support is carefully configured.This work is supported in part by NSF under Grants CCF-0903384, CCF-0903295, CNS-1319495, and CNS-1319095 as well the Semiconductor Research Corporation under grant number 1983.001. (CCF-0903384 - NSF; CCF-0903295 - NSF; CNS-1319495 - NSF; CNS-1319095 - NSF; 1983.001 - Semiconductor Research Corporation
- …