Speculation in Parallel and Distributed Event Processing Systems
Event stream processing (ESP) applications enable the real-time processing of continuous flows of data. Algorithmic trading, network monitoring, and processing data from sensor networks are good examples of applications that traditionally rely upon ESP systems. In addition, technological advances are resulting in an increasing number of devices that are network enabled, producing information that can be automatically collected and processed. This increasing availability of on-line data motivates the development of new and more sophisticated applications that require low-latency processing of large volumes of data.
ESP applications are composed of an acyclic graph of operators that is traversed by the data. Inside each operator, the events can be transformed, aggregated, enriched, or filtered out. Some of these operations depend only on the current input events; such operations are called stateless. Other operations, however, depend not only on the current event but also on a state built during the processing of previous events. Such operations are therefore called stateful.
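The distinction between stateless and stateful operations can be illustrated with a minimal sketch. The names `stateless_filter` and `SlidingAverage`, the event layout, and the window size are assumptions made for illustration only; they are not taken from the dissertation's prototype.

```python
from collections import deque


def stateless_filter(event, threshold=10.0):
    """Stateless: the result depends only on the current event."""
    return event if event["value"] >= threshold else None


class SlidingAverage:
    """Stateful: the result depends on a window built from previous events."""

    def __init__(self, size=3):
        # State kept across events; old values fall out automatically.
        self.window = deque(maxlen=size)

    def process(self, event):
        self.window.append(event["value"])
        return sum(self.window) / len(self.window)


events = [{"value": v} for v in (4.0, 12.0, 20.0, 16.0)]

# The filter can run on any replica in any order without changing its result.
passed = [e["value"] for e in events if stateless_filter(e) is not None]

# The average depends on which events came before, so order matters.
op = SlidingAverage(size=3)
averages = [op.process(e) for e in events]
```

Because `SlidingAverage` carries a window between calls, feeding it the same events in a different order would generally produce different intermediate results, which is why such an operator cannot simply be replicated.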
As the number of ESP applications grows, these applications face increasingly strong requirements, which are often difficult to satisfy. In this dissertation, we address two challenges created by the use of stateful operations in an ESP application: (i) stateful operators can be bottlenecks because they are sensitive to the order of events and cannot be trivially parallelized by replication; and (ii) if failures are to be tolerated, the accumulated state of a stateful operator needs to be saved, and saving this state traditionally imposes considerable performance costs.
Our approach is to evaluate the use of speculation to address these two issues. For handling ordering and parallelization issues in a stateful operator, we propose a speculative approach that both reduces latency when the operator must wait for the correct ordering of the events and improves throughput when the operation at hand is parallelizable. In addition, our approach does not require the user to understand concurrent programming or to consider out-of-order execution when writing the operations.
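The general idea of speculating on event order can be sketched as follows: the operator processes each event as soon as it arrives and, if an event with an earlier timestamp shows up later, rolls back the affected outputs and replays them. This is only a minimal illustration using plain rollback and replay; the class `SpeculativeSum` and its fields are hypothetical, and the sketch does not reproduce the mechanism used in the dissertation, whose outline lists software transactional memory as background.

```python
import bisect


class SpeculativeSum:
    """Running sum that processes events on arrival and repairs misspeculation."""

    def __init__(self):
        self.timestamps = []  # timestamps of processed events, kept sorted
        self.values = []      # event values, in timestamp order
        self.outputs = []     # speculative running sum after each event
        self.aborts = 0       # number of rollbacks performed

    def on_event(self, ts, value):
        pos = bisect.bisect_right(self.timestamps, ts)
        late = pos < len(self.timestamps)  # an already processed event is newer
        self.timestamps.insert(pos, ts)
        self.values.insert(pos, value)
        if late:
            # Misspeculation: discard outputs computed past the late event.
            self.aborts += 1
            del self.outputs[pos:]
        # Replay from the insertion point (only the new event if nothing was late).
        total = self.outputs[pos - 1] if pos > 0 else 0
        for v in self.values[pos:]:
            total += v
            self.outputs.append(total)
        return self.outputs[-1]


op = SpeculativeSum()
for ts, value in [(1, 10), (3, 30), (2, 20), (4, 40)]:  # event 2 arrives late
    op.on_event(ts, value)
```

In this run only the late event with timestamp 2 triggers a rollback; the other events are processed immediately without waiting for the order to be confirmed.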
For fault-tolerant applications, traditional approaches have imposed prohibitive performance costs due to pessimistic schemes. We extend such approaches, using speculation to mask the cost of fault tolerance.
1 Introduction
1.1 Event stream processing systems
1.2 Running example
1.3 Challenges and contributions
1.4 Outline
2 Background
2.1 Event stream processing
2.1.1 State in operators: Windows and synopses
2.1.2 Types of operators
2.1.3 Our prototype system
2.2 Software transactional memory
2.2.1 Overview
2.2.2 Memory operations
2.3 Fault tolerance in distributed systems
2.3.1 Failure model and failure detection
2.3.2 Recovery semantics
2.3.3 Active and passive replication
2.4 Summary
3 Extending event stream processing systems with speculation
3.1 Motivation
3.2 Goals
3.3 Local versus distributed speculation
3.4 Models and assumptions
3.4.1 Operators
3.4.2 Events
3.4.3 Failures
4 Local speculation
4.1 Overview
4.2 Requirements
4.2.1 Order
4.2.2 Aborts
4.2.3 Optimism control
4.2.4 Notifications
4.3 Applications
4.3.1 Out-of-order processing
4.3.2 Optimistic parallelization
4.4 Extensions
4.4.1 Avoiding unnecessary aborts
4.4.2 Making aborts unnecessary
4.5 Evaluation
4.5.1 Overhead of speculation
4.5.2 Cost of misspeculation
4.5.3 Out-of-order and parallel processing micro benchmarks
4.5.4 Behavior with example operators
4.6 Summary
5 Distributed speculation
5.1 Overview
5.2 Requirements
5.2.1 Speculative events
5.2.2 Speculative accesses
5.2.3 Reliable ordered broadcast with optimistic delivery
5.3 Applications
5.3.1 Passive replication and rollback recovery
5.3.2 Active replication
5.4 Extensions
5.4.1 Active replication and software bugs
5.4.2 Enabling operators to output multiple events
5.5 Evaluation
5.5.1 Passive replication
5.5.2 Active replication
5.6 Summary
6 Related work
6.1 Event stream processing engines
6.2 Parallelization and optimistic computing
6.2.1 Speculation
6.2.2 Optimistic parallelization
6.2.3 Parallelization in event processing
6.2.4 Speculation in event processing
6.3 Fault tolerance
6.3.1 Passive replication and rollback recovery
6.3.2 Active replication
6.3.3 Fault tolerance in event stream processing systems
7 Conclusions
7.1 Summary of contributions
7.2 Challenges and future work
Appendices
Publications
Pseudocode for the consensus protocol
Grand Challenge: Real-time Destination and ETA Prediction for Maritime Traffic
In this paper, we present our approach for solving the DEBS Grand Challenge
2018. The challenge asks for a prediction of (i) the destination and
(ii) the arrival time of ships in a streaming fashion using geospatial data in the
maritime context. Novel aspects of our approach include the use of ensemble
learning based on Random Forest, Gradient Boosting Decision Trees (GBDT),
XGBoost Trees, and Extremely Randomized Trees (ERT) to predict the
destination, while for the arrival time we propose the use of
feed-forward neural networks. In our evaluation, we were able to achieve an
accuracy of 97% for the port destination classification problem and 90% (in
mins) for the ETA prediction.
Triad: Trusted Timestamps in Untrusted Environments
We aim to provide trusted time measurement mechanisms to applications and
cloud infrastructure deployed in environments that could harbor potential
adversaries, including the hardware infrastructure provider. Despite Trusted
Execution Environments (TEEs) providing multiple security functionalities,
timestamps from the Operating System are not covered. Nevertheless, some
services require time for validating permissions or ordering events. To address
that need, we introduce Triad, a trusted timestamp dispatcher of time readings.
The solution provides trusted timestamps enforced by mutually supportive
enclave-based clock servers that create a continuous trusted timeline. We
leverage enclave properties such as forced exits and CPU-based counters to
mitigate attacks on the server's timestamp counters. Triad produces trusted,
confidential, monotonically increasing timestamps with bounded error and
desirable, non-trivial properties. Our implementation relies on Intel SGX and
SCONE, allowing transparent usage. We evaluate Triad's error and behavior in
multiple dimensions.
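The property of monotonically increasing timestamps can be illustrated with a minimal sketch. It shows only the generic technique of clamping each reading to be greater than the previous one; the class `MonotonicTimestamps` and the `read_clock` callback are hypothetical and do not represent Triad's enclave-based protocol.

```python
class MonotonicTimestamps:
    """Hands out strictly increasing timestamps from a possibly erratic clock."""

    def __init__(self, read_clock):
        self.read_clock = read_clock  # hypothetical, possibly untrusted, source
        self.last = None              # last timestamp handed out

    def now(self):
        reading = self.read_clock()
        if self.last is not None and reading <= self.last:
            # The source went backwards or stalled: advance past the last value.
            reading = self.last + 1
        self.last = reading
        return reading


# Simulated source that jumps backwards and repeats itself.
readings = iter([100, 105, 103, 103, 110])
clock = MonotonicTimestamps(lambda: next(readings))
stamps = [clock.now() for _ in range(5)]
```

A wrapper like this guarantees ordering only; it says nothing about how far a timestamp may drift from real time, which is the bounded-error aspect the abstract attributes to Triad.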
Effect of lubricant degradation on the tribological performance of a wheel-rail system
Abstract: In this work, lubricant degradation and its relationship with tribological behavior were studied. A commercial grease (Sintono Terra HLK) and three versions of a newly developed product (Tribolub) for the Metro system of the city of Medellín were examined. A twin-disc testing machine was used to evaluate the effect of degradation caused by either radiation or mechanical action on the tribological properties of the greases; in addition, a number of tests were carried out to characterize the changes in the rheological properties of the greases. The results showed that tribological testing of degraded greases using a twin-disc machine, followed by viscometric analysis, can be considered a viable option for studying the performance of degraded greases used in wheel-rail systems. All the greases studied showed a decrease in their tribological performance after mechanical degradation in the twin-disc testing machine, with the degraded Tribolub-3 giving lower values of friction coefficient and mass loss of the samples than the other greases. Both mechanical and radiation-induced degradation led to a significant increase in the viscosity of the greases studied, especially at low shear rates. FTIR analyses showed that this response was mainly caused by chemical changes in the structure of the greases. Viscosity-time curves showed rheopectic behavior at low shear rates for Sintono Terra HLK and one of the versions of Tribolub, which was related to the particle contents and the effect of thickeners. Wetting tests showed better wettability for Tribolub than for Sintono Terra HLK, both before and after degradation.