593 research outputs found
HeTM: Transactional Memory for Heterogeneous Systems
Modern heterogeneous computing architectures, which couple multi-core CPUs
with discrete many-core GPUs (or other specialized hardware accelerators),
enable unprecedented peak performance and energy efficiency levels.
Unfortunately, though, developing applications that can take full advantage of
the potential of heterogeneous systems is a notoriously hard task. This work
takes a step towards reducing the complexity of programming heterogeneous
systems by introducing the abstraction of Heterogeneous Transactional Memory
(HeTM). HeTM provides programmers with the illusion of a single memory region,
shared among the CPUs and the (discrete) GPU(s) of a heterogeneous system, with
support for atomic transactions. Besides introducing the abstract semantics and
programming model of HeTM, we present the design and evaluation of a concrete
implementation of the proposed abstraction, which we named Speculative HeTM
(SHeTM). SHeTM makes use of a novel design that leverages on speculative
techniques and aims at hiding the inherently large communication latency
between CPUs and discrete GPUs and at minimizing inter-device synchronization
overhead. SHeTM is based on a modular and extensible design that allows for
easily integrating alternative TM implementations on the CPU's and GPU's sides,
which allows the flexibility to adopt, on either side, the TM implementation
(e.g., in hardware or software) that best fits the applications' workload and
the architectural characteristics of the processing unit. We demonstrate the
efficiency of the SHeTM via an extensive quantitative study based both on
synthetic benchmarks and on a porting of a popular object caching system.Comment: The current work was accepted in the 28th International Conference on
Parallel Architectures and Compilation Techniques (PACT'19
How do programs become more concurrent? A story of program transformations
For several decades, programmers have relied onMooreâ s Law to improve the performance of their softwareapplications. From now on, programmers need to programthe multi-cores if they want to deliver efficient code. Inthe multi-core era, a major maintenance task will be tomake sequential programs more concurrent. What are themost common transformations to retrofit concurrency intosequential programs?We studied the source code of 5 open-source Javaprojects. We analyzed qualitatively and quantitatively thechange patterns that developers have used in order toretrofit concurrency. We found that these transformationsbelong to four categories: transformations that improve thelatency, the throughput, the scalability, or correctness of theapplications. In addition, we report on our experience ofparallelizing one of our own programs. Our findings caneducate software developers on how to parallelize sequentialprograms, and can provide hints for tool vendors aboutwhat transformations are worth automating
Reverse Engineering of Middleware for Verification of Robot Control Architectures
We consider the problem of automating the verification of distributed control
software relying on publish-subscribe middleware. In this scenario, the main
challenge is that software correctness depends intrinsically on correct usage
of middleware components, but structured models of such components might not be
available for analysis, e.g., because they are too large and complex to be
described precisely in a cost-effective way. To overcome this problem, we
propose to identify abstract models of middleware as finite-state automata, and
then to perform verification on the combined middleware and control software
models. Both steps are carried out in a computer-assisted way using
state-of-the-art techniques in automata-based identification and verification.
Our main contribution is to show that the combination of identification and
verification is feasible and useful when considering typical issues that arise
in the implementation of distributed control software.Comment: 14 pages, 4 figures. The final version of the article is published in
Proc. of "Simulation, Modeling, and Programming for Autonomous Robots",
SIMPAR 2014 (published by Springer
Plug & Test at System Level via Testable TLM Primitives
With the evolution of Electronic System Level (ESL) design methodologies, we are experiencing an extensive use of Transaction-Level Modeling (TLM). TLM is a high-level approach to modeling digital systems where details of the communication among modules are separated from the those of the implementation of functional units. This paper represents a first step toward the automatic insertion of testing capabilities at the transaction level by definition of testable TLM primitives. The use of testable TLM primitives should help designers to easily get testable transaction level descriptions implementing what we call a "Plug & Test" design methodology. The proposed approach is intended to work both with hardware and software implementations. In particular, in this paper we will focus on the design of a testable FIFO communication channel to show how designers are given the freedom of trading-off complexity, testability levels, and cos
Middleware-based Database Replication: The Gaps between Theory and Practice
The need for high availability and performance in data management systems has
been fueling a long running interest in database replication from both academia
and industry. However, academic groups often attack replication problems in
isolation, overlooking the need for completeness in their solutions, while
commercial teams take a holistic approach that often misses opportunities for
fundamental innovation. This has created over time a gap between academic
research and industrial practice.
This paper aims to characterize the gap along three axes: performance,
availability, and administration. We build on our own experience developing and
deploying replication systems in commercial and academic settings, as well as
on a large body of prior related work. We sift through representative examples
from the last decade of open-source, academic, and commercial database
replication systems and combine this material with case studies from real
systems deployed at Fortune 500 customers. We propose two agendas, one for
academic research and one for industrial R&D, which we believe can bridge the
gap within 5-10 years. This way, we hope to both motivate and help researchers
in making the theory and practice of middleware-based database replication more
relevant to each other.Comment: 14 pages. Appears in Proc. ACM SIGMOD International Conference on
Management of Data, Vancouver, Canada, June 200
Bounded Model Checking of Concurrent Data Types on Relaxed Memory Models: A Case Study
Many multithreaded programs employ concurrent data types to safely share data among threads. However, highly-concurrent algorithms for even seemingly simple data types are difficult to implement correctly, especially when considering the relaxed memory ordering models commonly employed by today’s multiprocessors. The formal verification of such implementations is challenging as well because the high degree of concurrency leads to a large number of possible executions. In this case study, we develop a SAT-based bounded verification method and apply it to a representative example, a well-known two-lock concurrent queue algorithm. We first formulate a correctness criterion that specifically targets failures caused by concurrency; it demands that all concurrent executions be observationally equivalent to some serial execution. Next, we define a relaxed memory model that conservatively approximates several common shared-memory multiprocessors. Using commit point specifications, a suite of finite symbolic tests, a prototype encoder, and a standard SAT solver, we successfully identify two failures of a naive implementation that can be observed only under relaxed memory models. We eliminate these failures by inserting appropriate memory ordering fences into the code. The experiments confirm that our approach provides a valuable aid for desigining and implementing concurrent data types
Interprocess communication in highly distributed systems
Issued as Final technical report, Project no. G-36-632Final technical report has title: Interprocess communication in highly distributed system
- …