98 research outputs found
Process algebra approach to parallel DBMS performance modelling
Abstract unavailable; please refer to the PDF.
Modelling parallel database management systems for performance prediction
Abstract unavailable; please refer to the PDF.
Performance models of concurrency control protocols for transaction processing systems
Transaction processing plays a key role in many IT infrastructures. It is widely used in a variety of contexts, ranging from database management systems to concurrent programming tools. Transaction processing systems rely on concurrency control protocols, which allow them to process transactions concurrently while preserving essential properties such as isolation and atomicity. Performance is a critical aspect of transaction processing systems, and it is unavoidably affected by concurrency control. For this reason, methods and techniques to assess and predict the performance of concurrency control protocols are of interest to many IT players, including application designers, developers, and system administrators. The analysis and proper understanding of the impact of these protocols on system performance require quantitative approaches. Analytical modeling is a practical approach for building cost-effective computer system performance models, enabling us to quantitatively describe the complex dynamics characterizing these systems. In this dissertation we present analytical performance models of concurrency control protocols. We deal with both traditional transaction processing systems, such as database management systems, and emerging ones, such as transactional memories. The analysis focuses on widely used protocols, providing detailed performance models and validation studies. In addition, we propose new modeling approaches, which also broaden the scope of our study towards a more realistic, application-oriented performance analysis.
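The kind of analytical model the abstract describes can be illustrated with a deliberately simple example, not taken from the dissertation itself: a back-of-envelope estimate of lock-conflict probability, assuming n concurrent transactions each accessing k items chosen uniformly from d data items, with independence between items. The function name and the first-order formula are illustrative assumptions.

```python
def approx_conflict_prob(n, k, d):
    """Toy analytical model of lock contention: probability that a
    transaction accessing k of d items (uniformly, independently)
    conflicts with at least one of the other n - 1 transactions.
    Illustrative sketch only; real models are far more detailed."""
    # First-order approximation: probability that any single item we
    # touch is also held by one of the other n - 1 transactions.
    p_item = min(1.0, (n - 1) * k / d)
    # We conflict if at least one of our k items is contended.
    return 1.0 - (1.0 - p_item) ** k
```

For example, 10 concurrent transactions each touching 5 of 10,000 items yield a conflict probability of roughly 2%. This is the kind of quantitative statement analytical models make cheap to obtain; validation studies, as in the dissertation, then check such approximations against measurements.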
Merging Queries in OLTP Workloads
OLTP applications are usually executed by a high number of clients in parallel and typically face high throughput demands as well as strict latency requirements for individual statements. In enterprise scenarios, they often have to deal with overload spikes resulting from events such as Cyber Monday or Black Friday. The traditional solution to avoid running out of resources, and thus to cope with such spikes, is significant over-provisioning of the underlying infrastructure. In this thesis, we analyze real enterprise OLTP workloads with respect to statement types, complexity, and hot-spot statements. Interestingly, our findings reveal that workloads are often read-heavy and comprise similar query patterns, which provides the potential to share work among statements belonging to different transactions. In the past, resource sharing has been extensively studied for OLAP workloads. Naturally, the question arises: why do studies mainly focus on OLAP rather than OLTP workloads?
At first sight, OLTP queries often consist of simple operations, such as index look-ups, with little sharing potential. As a consequence, such queries, due to their short execution time, may not amortize the additional overhead of sharing. In addition, OLTP workloads execute not only read operations but also updates. Sharing work therefore needs to obey transactional semantics, such as the given isolation level and read-your-own-writes.
This thesis presents THE LEVIATHAN, a novel batching scheme for OLTP workloads: an approach for merging read statements within interactively submitted multi-statement transactions consisting of reads and updates. Our main idea is to merge the execution of statements by merging their plans, which makes it possible to merge not only complex computations but also simple ones, such as the aforementioned index look-up. We identify mergeable statements by pattern matching on prepared statement plans, which incurs low overhead. To obey the isolation level properties and provide read-your-own-writes, we define a formal framework for merging transactions running under a given isolation level and provide insights into a prototypical implementation of merging within a commercial database system.
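The core idea of merging reads that share a plan pattern can be sketched as follows. This is a minimal illustration under assumed names (`merge_reads`, the IN-list rewrite), not the thesis's actual implementation, which operates on prepared statement plans inside the database engine.

```python
from collections import defaultdict

def merge_reads(statements):
    """Group read statements by their prepared pattern and merge each
    group into a single IN-list query.
    statements: iterable of (stmt_id, pattern, param). Sketch only."""
    groups = defaultdict(list)
    for stmt_id, pattern, param in statements:
        groups[pattern].append((stmt_id, param))
    merged = []
    for pattern, members in groups.items():
        ids = [sid for sid, _ in members]
        params = [p for _, p in members]
        # Rewrite "... WHERE id = ?" into "... WHERE id IN (?, ?, ...)"
        placeholders = ", ".join("?" * len(params))
        merged.append((pattern.replace("= ?", f"IN ({placeholders})"),
                       params, ids))
    return merged
```

Merging, say, two look-ups on a users table and one on an items table yields two merged statements instead of three. After execution, results must be routed back to the originating transactions (the ids above), and, as the thesis stresses, merging is only legal when the isolation level and read-your-own-writes are respected.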
Our experimental evaluation shows that, depending on the isolation level, the load in the system, and the read share of the workload, an improvement of the transaction throughput by up to a factor of 2.5 is possible without compromising the transactional semantics. Another interesting effect we show is that our strategy can increase the throughput of a real enterprise workload by 20%.
1 INTRODUCTION
1.1 Summary of Contributions
1.2 Outline
2 WORKLOAD ANALYSIS
2.1 Analyzing OLTP Benchmarks
2.1.1 YCSB
2.1.2 TATP
2.1.3 TPC Benchmark Scenarios
2.1.4 Summary
2.2 Analyzing OLTP Workloads from Open Source Projects
2.2.1 Characteristics of Workloads
2.2.2 Summary
2.3 Analyzing Enterprise OLTP Workloads
2.3.1 Overview of Reports about OLTP Workload Characteristics
2.3.2 Analysis of SAP Hybris Workload
2.3.3 Summary
2.4 Conclusion
3 RELATED WORK ON QUERY MERGING
3.1 Merging the Execution of Operators
3.2 Merging the Execution of Subplans
3.3 Merging the Results of Subplans
3.4 Merging the Execution of Full Plans
3.5 Miscellaneous Works on Merging
3.6 Discussion
4 MERGING STATEMENTS IN MULTI-STATEMENT TRANSACTIONS
4.1 Overview of Our Approach
4.1.1 Examples
4.1.2 Why Naïve Merging Fails
4.2 THE LEVIATHAN Approach
4.3 Formalizing THE LEVIATHAN Approach
4.3.1 Transaction Theory
4.3.2 Merging Under MVCC
4.4 Merging Reads Under Different Isolation Levels
4.4.1 Read Uncommitted
4.4.2 Read Committed
4.4.3 Repeatable Read
4.4.4 Snapshot Isolation
4.4.5 Serializable
4.4.6 Discussion
4.5 Merging Writes Under Different Isolation Levels
4.5.1 Read Uncommitted
4.5.2 Read Committed
4.5.3 Snapshot Isolation
4.5.4 Serializable
4.5.5 Handling Dependencies
4.5.6 Discussion
5 SYSTEM MODEL
5.1 Definition of the Term “Overload”
5.2 Basic Queuing Model
5.2.1 Option (1): Replacement with a Merger Thread
5.2.2 Option (2): Adding Merger Thread
5.2.3 Using Multiple Merger Threads
5.2.4 Evaluation
5.3 Extended Queue Model
5.3.1 Option (1): Replacement with a Merger Thread
5.3.2 Option (2): Adding Merger Thread
5.3.3 Evaluation
6 IMPLEMENTATION
6.1 Background: SAP HANA
6.2 System Design
6.2.1 Read Committed
6.2.2 Snapshot Isolation
6.3 Merger Component
6.3.1 Overview
6.3.2 Dequeuing
6.3.3 Merging
6.3.4 Sending
6.3.5 Updating MTx State
6.4 Challenges in the Implementation of Merging Writes
6.4.1 SQL String Implementation
6.4.2 Update Count
6.4.3 Error Propagation
6.4.4 Abort and Rollback
7 EVALUATION
7.1 Benchmark Settings
7.2 System Settings
7.2.1 Experiment I: End-to-end Response Time Within a SAP Hybris System
7.2.2 Experiment II: Dequeuing Strategy
7.2.3 Experiment III: Merging Improvement on Different Statement, Transaction and Workload Types
7.2.4 Experiment IV: End-to-End Latency in YCSB
7.2.5 Experiment V: Breakdown of Execution in YCSB
7.2.6 Discussion of System Settings
7.3 Merging in Interactive Transactions
7.3.1 Experiment VI: Merging TATP in Read Uncommitted
7.3.2 Experiment VII: Merging TATP in Read Committed
7.3.3 Experiment VIII: Merging TATP in Snapshot Isolation
7.4 Merging Queries in Stored Procedures
7.4.1 Experiment IX: Merging TATP Stored Procedures in Read Committed
7.5 Merging SAP Hybris
7.5.1 Experiment X: CPU-time Breakdown on HANA Components
7.5.2 Experiment XI: Merging Media Query in SAP Hybris
7.5.3 Discussion of our Results in Comparison with Related Work
8 CONCLUSION
8.1 Summary
8.2 Future Research Directions
REFERENCES
A UML CLASS DIAGRAM
Space station data system analysis/architecture study. Task 2: Options development, DR-5. Volume 2: Design options
The primary objective of Task 2 is the development of an information base that will support the conduct of trade studies and provide sufficient data to make key design/programmatic decisions. This includes: (1) the establishment of option categories that are most likely to influence Space Station Data System (SSDS) definition; (2) the identification of preferred options in each category; and (3) the characterization of these options with respect to performance attributes, constraints, cost, and risk. This volume contains the options development for the design category. This category comprises alternative structures, configurations, and techniques that can be used to develop designs that are responsive to the SSDS requirements. The specific areas discussed are software, including database management and distributed operating systems; system architecture, including fault tolerance, system growth/automation/autonomy, and system interfaces; time management; and system security/privacy. Also discussed are space communications and local area networking.
NASA RECON: Course Development, Administration, and Evaluation
The R&D activities addressing the development, administration, and evaluation of a set of transportable, college-level courses to educate science and engineering students in the effective use of automated scientific and technical information storage and retrieval systems, and, in particular, in the use of the NASA RECON system, are discussed. The long-range scope and objectives of these contracted activities are reviewed, and the progress made toward these objectives during FY 1983-1984 is highlighted. In addition, the results of a survey of 237 colleges and universities addressing course needs are presented.
Performance Problem Diagnostics by Systematic Experimentation
Diagnosing performance problems requires deep expertise in performance engineering and entails high manual effort. As a consequence, performance evaluations are postponed to the last minute of the development process. In this thesis, we introduce an automatic, experiment-based approach for performance problem diagnostics in enterprise software systems. With this approach, performance engineers can concentrate on their core competences instead of conducting repetitive tasks.
Canonical approximation in the performance analysis of distributed systems
The problem of analyzing distributed systems arises in many areas of computer science, such as communication networks, distributed databases, packet radio networks, VLSI communications, and switching mechanisms. Analysis of distributed systems is difficult since one must deal with many tightly interacting components. The number of possible state configurations typically grows exponentially with the system size, making exact analysis intractable even for relatively small systems. For the stochastic models of these systems whose steady-state probability distribution is of product form, many global performance measures of interest can be computed once one knows the normalization constant of the steady-state probability distribution. This constant, called the system partition function, is typically difficult to derive in closed form. The key difficulty in the performance analysis of such models can be viewed as deriving a good approximation to the partition function or calculating it numerically. In this Ph.D. work we introduce a new approximation technique to analyze a variety of such models of distributed systems. This technique, which we call the method of Canonical Approximation, is similar to that developed in statistical physics to compute the partition function. The new method gives a closed-form approximation of the partition function and of the global performance measures. It is computationally simple with complexity independent of the system size, gives an excellent degree of precision for large systems, and is applicable to a wide variety of problems. The method is applied to the analysis of multihop packet radio networks, locking schemes in database systems, closed queueing networks, and interconnection networks.
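For concreteness, the partition function the abstract refers to can be computed exactly for a closed product-form queueing network with Buzen's convolution algorithm, a standard technique from the literature (not the canonical approximation proposed here). Its cost grows with the number of stations and customers, which is precisely what motivates a closed-form approximation whose complexity is independent of system size.

```python
def partition_function(rho, n_customers):
    """Buzen's convolution algorithm: exact normalization constant G(n)
    of a closed product-form queueing network.
    rho[m] = visit ratio * mean service time of station m."""
    g = [1.0] + [0.0] * n_customers   # G(n) with no stations folded in yet
    for x in rho:                     # fold in one station at a time
        for n in range(1, n_customers + 1):
            g[n] += x * g[n - 1]
    return g                          # g[n] = G(n)
```

Global measures then follow from G: for example, system throughput with N customers is G(N-1)/G(N), and station m's utilization is rho[m] * G(N-1)/G(N). The exact computation costs O(M*N) for M stations and N customers, versus the size-independent cost of the canonical approximation described above.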