3,548 research outputs found
HeTM: Transactional Memory for Heterogeneous Systems
Modern heterogeneous computing architectures, which couple multi-core CPUs
with discrete many-core GPUs (or other specialized hardware accelerators),
enable unprecedented peak performance and energy efficiency levels.
Unfortunately, though, developing applications that can take full advantage of
the potential of heterogeneous systems is a notoriously hard task. This work
takes a step towards reducing the complexity of programming heterogeneous
systems by introducing the abstraction of Heterogeneous Transactional Memory
(HeTM). HeTM provides programmers with the illusion of a single memory region,
shared among the CPUs and the (discrete) GPU(s) of a heterogeneous system, with
support for atomic transactions. Besides introducing the abstract semantics and
programming model of HeTM, we present the design and evaluation of a concrete
implementation of the proposed abstraction, which we named Speculative HeTM
(SHeTM). SHeTM makes use of a novel design that leverages on speculative
techniques and aims at hiding the inherently large communication latency
between CPUs and discrete GPUs and at minimizing inter-device synchronization
overhead. SHeTM is based on a modular and extensible design that allows for
easily integrating alternative TM implementations on the CPU's and GPU's sides,
which allows the flexibility to adopt, on either side, the TM implementation
(e.g., in hardware or software) that best fits the applications' workload and
the architectural characteristics of the processing unit. We demonstrate the
efficiency of the SHeTM via an extensive quantitative study based both on
synthetic benchmarks and on a porting of a popular object caching system.Comment: The current work was accepted in the 28th International Conference on
Parallel Architectures and Compilation Techniques (PACT'19
Executing requests concurrently in state machine replication
State machine replication is one of the most popular ways to achieve fault tolerance. In
a nutshell, the state machine replication approach maintains multiple replicas that both
store a copy of the system’s data and execute operations on that data. When requests
to execute operations arrive, an “agree-execute” protocol keeps replicas synchronized:
they first agree on an order to execute the incoming operations, and then execute the
operations one at a time in the agreed upon order, so that every replica reaches the same
final state.
Multi-core processors are the norm, but taking advantage of the available processor
cores to execute operations simultaneously is at odds with the “agree-execute” protocol:
simultaneous execution is inherently unpredictable, so in the end replicas may arrive
at different final states and the system becomes inconsistent. On one hand, we want to
take advantage of the available processor cores to execute operations simultaneously and
improve performance. But on the other hand, replicas must abide by the operation order
that they agreed upon for the system to remain consistent. This dissertation proposes
a solution to this dilemma. At a high level, we propose to use speculative execution
techniques to execute operations simultaneously while nonetheless ensuring that their
execution is equivalent to having executed the operations sequentially in the order the
replicas agreed upon. To achieve this, we: (1) propose to execute operations as serializable
transactions, and (2) develop a new concurrency control protocol that ensures that the
concurrent execution of a set of transactions respects the serialization order the replicas
agreed upon. Since speculation is only effective if it is successful, we also (3) propose
a modification to the typical API to declare transactions, which allows transactions to
execute their logic over an abstract replica state, resulting in fewer conflicts between
transactions and thus improving the effectiveness of the speculative executions.
An experimental evaluation shows that the contributions in this dissertation can
improve the performance of a state-machine-replicated server up to 4 , reaching up to
75% the performance of a concurrent fault-prone server
Speculative execution by using software transactional memory
Dissertação apresentada na Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa para a obtenção do Grau de Mestre em Engenharia Informática.Many programs sequentially execute operations that take a long time to complete. Some of these operations may return a highly predictable result. If this is the case, speculative execution can improve the overall performance of the program.
Speculative execution is the execution of code whose result may not be needed. Generally it is used as a performance optimization. Instead of waiting for the result of a costly operation,speculative execution can be used to speculate the operation most probable result and continue
executing based in this speculation. If later the speculation is confirmed to be correct, time had been gained. Otherwise, if the speculation is incorrect, the execution based in the speculation must abort and re-execute with the correct result.
In this dissertation we propose the design of an abstract process to add speculative execution to a program by doing source-to-source transformation. This abstract process is used in the definition of a mechanism and methodology that enable programmer to add speculative execution to the source code of programs. The abstract process is also used in the design of an automatic source-to-source transformation process that adds speculative execution to existing programs without user intervention. Finally, we also evaluate the performance impact of introducing speculative execution in database clients.
Existing proposals for the design of mechanisms to add speculative execution lacked portability in favor of performance. Some were designed to be implemented at kernel or hardware level. The process and mechanisms we propose in this dissertation can add speculative execution to the source of program, independently of the kernel or hardware that is used.
From our experiments we have concluded that database clients can improve their performance by using speculative execution. There is nothing in the system we propose that limits in the scope of database clients. Although this was the scope of the case study, we strongly believe that other programs can benefit from the proposed process and mechanisms for introduction of speculative execution
Speculation in Parallel and Distributed Event Processing Systems
Event stream processing (ESP) applications enable the real-time processing of continuous flows of data. Algorithmic trading, network monitoring, and processing data from sensor networks are good examples of applications that traditionally rely upon ESP systems. In addition, technological advances are resulting in an increasing number of devices that are network enabled, producing information that can be automatically collected and processed. This increasing availability of on-line data motivates the development of new and more sophisticated applications that require low-latency processing of large volumes of data.
ESP applications are composed of an acyclic graph of operators that is traversed by the data. Inside each operator, the events can be transformed, aggregated, enriched, or filtered out. Some of these operations depend only on the current input events, such operations are called stateless. Other operations, however, depend not only on the current event, but also on a state built during the processing of previous events. Such operations are, therefore, named stateful.
As the number of ESP applications grows, there are increasingly strong requirements, which are often difficult to satisfy. In this dissertation, we address two challenges created by the use of stateful operations in a ESP application: (i) stateful operators can be bottlenecks because they are sensitive to the order of events and cannot be trivially parallelized by replication; and (ii), if failures are to be tolerated, the accumulated state of an stateful operator needs to be saved, saving this state traditionally imposes considerable performance costs.
Our approach is to evaluate the use of speculation to address these two issues. For handling ordering and parallelization issues in a stateful operator, we propose a speculative approach that both reduces latency when the operator must wait for the correct ordering of the events and improves throughput when the operation in hand is parallelizable. In addition, our approach does not require that user understand concurrent programming or that he or she needs to consider out-of-order execution when writing the operations.
For fault-tolerant applications, traditional approaches have imposed prohibitive performance costs due to pessimistic schemes. We extend such approaches, using speculation to mask the cost of fault tolerance.:1 Introduction 1
1.1 Event stream processing systems ......................... 1
1.2 Running example ................................. 3
1.3 Challenges and contributions ........................... 4
1.4 Outline ...................................... 6
2 Background 7
2.1 Event stream processing ............................. 7
2.1.1 State in operators: Windows and synopses ............................ 8
2.1.2 Types of operators ............................ 12
2.1.3 Our prototype system........................... 13
2.2 Software transactional memory.......................... 18
2.2.1 Overview ................................. 18
2.2.2 Memory operations............................ 19
2.3 Fault tolerance in distributed systems ...................................... 23
2.3.1 Failure model and failure detection ...................................... 23
2.3.2 Recovery semantics............................ 24
2.3.3 Active and passive replication ...................... 24
2.4 Summary ..................................... 26
3 Extending event stream processing systems with speculation 27
3.1 Motivation..................................... 27
3.2 Goals ....................................... 28
3.3 Local versus distributed speculation ....................... 29
3.4 Models and assumptions ............................. 29
3.4.1 Operators................................. 30
3.4.2 Events................................... 30
3.4.3 Failures .................................. 31
4 Local speculation 33
4.1 Overview ..................................... 33
4.2 Requirements ................................... 35
4.2.1 Order ................................... 35
4.2.2 Aborts................................... 37
4.2.3 Optimism control ............................. 38
4.2.4 Notifications ............................... 39
4.3 Applications.................................... 40
4.3.1 Out-of-order processing ......................... 40
4.3.2 Optimistic parallelization......................... 42
4.4 Extensions..................................... 44
4.4.1 Avoiding unnecessary aborts ....................... 44
4.4.2 Making aborts unnecessary........................ 45
4.5 Evaluation..................................... 47
4.5.1 Overhead of speculation ......................... 47
4.5.2 Cost of misspeculation .......................... 50
4.5.3 Out-of-order and parallel processing micro benchmarks ........... 53
4.5.4 Behavior with example operators .................... 57
4.6 Summary ..................................... 60
5 Distributed speculation 63
5.1 Overview ..................................... 63
5.2 Requirements ................................... 64
5.2.1 Speculative events ............................ 64
5.2.2 Speculative accesses ........................... 69
5.2.3 Reliable ordered broadcast with optimistic delivery .................. 72
5.3 Applications .................................... 75
5.3.1 Passive replication and rollback recovery ................................ 75
5.3.2 Active replication ............................. 80
5.4 Extensions ..................................... 82
5.4.1 Active replication and software bugs ..................................... 82
5.4.2 Enabling operators to output multiple events ........................ 87
5.5 Evaluation .................................... 87
5.5.1 Passive replication ............................ 88
5.5.2 Active replication ............................. 88
5.6 Summary ..................................... 93
6 Related work 95
6.1 Event stream processing engines ......................... 95
6.2 Parallelization and optimistic computing ................................ 97
6.2.1 Speculation ................................ 97
6.2.2 Optimistic parallelization ......................... 98
6.2.3 Parallelization in event processing .................................... 99
6.2.4 Speculation in event processing ..................... 99
6.3 Fault tolerance .................................. 100
6.3.1 Passive replication and rollback recovery ............................... 100
6.3.2 Active replication ............................ 101
6.3.3 Fault tolerance in event stream processing systems ............. 103
7 Conclusions 105
7.1 Summary of contributions ............................ 105
7.2 Challenges and future work ............................ 106
Appendices
Publications 107
Pseudocode for the consensus protocol 10
Speculation in Parallel and Distributed Event Processing Systems
Event stream processing (ESP) applications enable the real-time processing of continuous flows of data. Algorithmic trading, network monitoring, and processing data from sensor networks are good examples of applications that traditionally rely upon ESP systems. In addition, technological advances are resulting in an increasing number of devices that are network enabled, producing information that can be automatically collected and processed. This increasing availability of on-line data motivates the development of new and more sophisticated applications that require low-latency processing of large volumes of data.
ESP applications are composed of an acyclic graph of operators that is traversed by the data. Inside each operator, the events can be transformed, aggregated, enriched, or filtered out. Some of these operations depend only on the current input events, such operations are called stateless. Other operations, however, depend not only on the current event, but also on a state built during the processing of previous events. Such operations are, therefore, named stateful.
As the number of ESP applications grows, there are increasingly strong requirements, which are often difficult to satisfy. In this dissertation, we address two challenges created by the use of stateful operations in a ESP application: (i) stateful operators can be bottlenecks because they are sensitive to the order of events and cannot be trivially parallelized by replication; and (ii), if failures are to be tolerated, the accumulated state of an stateful operator needs to be saved, saving this state traditionally imposes considerable performance costs.
Our approach is to evaluate the use of speculation to address these two issues. For handling ordering and parallelization issues in a stateful operator, we propose a speculative approach that both reduces latency when the operator must wait for the correct ordering of the events and improves throughput when the operation in hand is parallelizable. In addition, our approach does not require that user understand concurrent programming or that he or she needs to consider out-of-order execution when writing the operations.
For fault-tolerant applications, traditional approaches have imposed prohibitive performance costs due to pessimistic schemes. We extend such approaches, using speculation to mask the cost of fault tolerance.:1 Introduction 1
1.1 Event stream processing systems ......................... 1
1.2 Running example ................................. 3
1.3 Challenges and contributions ........................... 4
1.4 Outline ...................................... 6
2 Background 7
2.1 Event stream processing ............................. 7
2.1.1 State in operators: Windows and synopses ............................ 8
2.1.2 Types of operators ............................ 12
2.1.3 Our prototype system........................... 13
2.2 Software transactional memory.......................... 18
2.2.1 Overview ................................. 18
2.2.2 Memory operations............................ 19
2.3 Fault tolerance in distributed systems ...................................... 23
2.3.1 Failure model and failure detection ...................................... 23
2.3.2 Recovery semantics............................ 24
2.3.3 Active and passive replication ...................... 24
2.4 Summary ..................................... 26
3 Extending event stream processing systems with speculation 27
3.1 Motivation..................................... 27
3.2 Goals ....................................... 28
3.3 Local versus distributed speculation ....................... 29
3.4 Models and assumptions ............................. 29
3.4.1 Operators................................. 30
3.4.2 Events................................... 30
3.4.3 Failures .................................. 31
4 Local speculation 33
4.1 Overview ..................................... 33
4.2 Requirements ................................... 35
4.2.1 Order ................................... 35
4.2.2 Aborts................................... 37
4.2.3 Optimism control ............................. 38
4.2.4 Notifications ............................... 39
4.3 Applications.................................... 40
4.3.1 Out-of-order processing ......................... 40
4.3.2 Optimistic parallelization......................... 42
4.4 Extensions..................................... 44
4.4.1 Avoiding unnecessary aborts ....................... 44
4.4.2 Making aborts unnecessary........................ 45
4.5 Evaluation..................................... 47
4.5.1 Overhead of speculation ......................... 47
4.5.2 Cost of misspeculation .......................... 50
4.5.3 Out-of-order and parallel processing micro benchmarks ........... 53
4.5.4 Behavior with example operators .................... 57
4.6 Summary ..................................... 60
5 Distributed speculation 63
5.1 Overview ..................................... 63
5.2 Requirements ................................... 64
5.2.1 Speculative events ............................ 64
5.2.2 Speculative accesses ........................... 69
5.2.3 Reliable ordered broadcast with optimistic delivery .................. 72
5.3 Applications .................................... 75
5.3.1 Passive replication and rollback recovery ................................ 75
5.3.2 Active replication ............................. 80
5.4 Extensions ..................................... 82
5.4.1 Active replication and software bugs ..................................... 82
5.4.2 Enabling operators to output multiple events ........................ 87
5.5 Evaluation .................................... 87
5.5.1 Passive replication ............................ 88
5.5.2 Active replication ............................. 88
5.6 Summary ..................................... 93
6 Related work 95
6.1 Event stream processing engines ......................... 95
6.2 Parallelization and optimistic computing ................................ 97
6.2.1 Speculation ................................ 97
6.2.2 Optimistic parallelization ......................... 98
6.2.3 Parallelization in event processing .................................... 99
6.2.4 Speculation in event processing ..................... 99
6.3 Fault tolerance .................................. 100
6.3.1 Passive replication and rollback recovery ............................... 100
6.3.2 Active replication ............................ 101
6.3.3 Fault tolerance in event stream processing systems ............. 103
7 Conclusions 105
7.1 Summary of contributions ............................ 105
7.2 Challenges and future work ............................ 106
Appendices
Publications 107
Pseudocode for the consensus protocol 10
H2O: An Autonomic, Resource-Aware Distributed Database System
This paper presents the design of an autonomic, resource-aware distributed
database which enables data to be backed up and shared without complex manual
administration. The database, H2O, is designed to make use of unused resources
on workstation machines. Creating and maintaining highly-available, replicated
database systems can be difficult for untrained users, and costly for IT
departments. H2O reduces the need for manual administration by autonomically
replicating data and load-balancing across machines in an enterprise.
Provisioning hardware to run a database system can be unnecessarily costly as
most organizations already possess large quantities of idle resources in
workstation machines. H2O is designed to utilize this unused capacity by using
resource availability information to place data and plan queries over
workstation machines that are already being used for other tasks. This paper
discusses the requirements for such a system and presents the design and
implementation of H2O.Comment: Presented at SICSA PhD Conference 2010 (http://www.sicsaconf.org/
RepComp - replicated software components for improved performance
Trabalho apresentado no âmbito do Mestrado em
Engenharia Informática, como requisito parcial
para obtenção do grau de Mestre em Engenharia
InformáticaThe current trend of evolution in CPU architectures favours increasing the number
of processing cores in lieu of improving the clock speed of an individual core. While
improving clock rates automatically benefits any software executing on that processor,
the same is not valid for adding new cores. To take advantage of an increased number
of cores, software must include explicit support for parallel execution.
This work explores a solution based on diverse replication which allows applications
to transparently explore parallel processing power: macro-components.
Applications typically make use of components with well-defined interfaces that
have a number of possible underlying implementations with different characteristic.
A macro-component is a component which encloses several of these implementations
while offering the same interface as a regular implementation. Inside the macro-component,the implementations are used as replicas, and used to process any incoming operations.
Using the best replica for each incoming operation, the macro-component is able
to improve global performance.
This dissertation provides an initial research on the use of these macro-components,detailing the technical challenges faced and proposing a design for the macro-component support system. Additionally, an implementation and subsequent validation of the proposed system are presented. These examples show that macro-components can achieve improved performance versus simple component implementations
- …