57 research outputs found

    The LDBC Financial Benchmark

    Full text link
    The Linked Data Benchmark Council's Financial Benchmark (LDBC FinBench) is a new effort that defines a graph database benchmark targeting financial scenarios such as anti-fraud and risk control. The benchmark has one workload, the Transaction Workload, currently. It captures OLTP scenario with complex, simple read queries and write queries that continuously insert or delete data in the graph. Compared to the LDBC SNB, the LDBC FinBench differs in application scenarios, data patterns, and query patterns. This document contains a detailed explanation of the data used in the LDBC FinBench, the definition of transaction workload, a detailed description for all queries, and instructions on how to use the benchmark suite.Comment: For the source code of this specification, see the ldbc_finbench_docs repository on Githu

    Merging Queries in OLTP Workloads

    Get PDF
    OLTP applications are usually executed by a high number of clients in parallel and are typically faced with high throughput demand as well as a constraint latency requirement for individual statements. In enterprise scenarios, they often face the challenge to deal with overload spikes resulting from events such as Cyber Monday or Black Friday. The traditional solution to prevent running out of resources and thus coping with such spikes is to use a significant over-provisioning of the underlying infrastructure. In this thesis, we analyze real enterprise OLTP workloads with respect to statement types, complexity, and hot-spot statements. Interestingly, our findings reveal that workloads are often read-heavy and comprise similar query patterns, which provides a potential to share work of statements belonging to different transactions. In the past, resource sharing has been extensively studied for OLAP workloads. Naturally, the question arises, why studies mainly focus on OLAP and not on OLTP workloads? At first sight, OLTP queries often consist of simple calculations, such as index look-ups with little sharing potential. In consequence, such queries – due to their short execution time – may not have enough potential for the additional overhead. In addition, OLTP workloads do not only execute read operations but also updates. Therefore, sharing work needs to obey transactional semantics, such as the given isolation level and read-your-own-writes. This thesis presents THE LEVIATHAN, a novel batching scheme for OLTP workloads, an approach for merging read statements within interactively submitted multi-statement transactions consisting of reads and updates. Our main idea is to merge the execution of statements by merging their plans, thus being able to merge the execution of not only complex, but also simple calculations, such as the aforementioned index look-up. We identify mergeable statements by pattern matching of prepared statement plans, which comes with low overhead. For obeying the isolation level properties and providing read-your-own-writes, we first define a formal framework for merging transactions running under a given isolation level and provide insights into a prototypical implementation of merging within a commercial database system. Our experimental evaluation shows that, depending on the isolation level, the load in the system, and the read-share of the workload, an improvement of the transaction throughput by up to a factor of 2.5x is possible without compromising the transactional semantics. Another interesting effect we show is that with our strategy, we can increase the throughput of a real enterprise workload by 20%.:1 INTRODUCTION 1.1 Summary of Contributions 1.2 Outline 2 WORKLOAD ANALYSIS 2.1 Analyzing OLTP Benchmarks 2.1.1 YCSB 2.1.2 TATP 2.1.3 TPC Benchmark Scenarios 2.1.4 Summary 2.2 Analyzing OLTP Workloads from Open Source Projects 2.2.1 Characteristics of Workloads 2.2.2 Summary 2.3 Analyzing Enterprise OLTP Workloads 2.3.1 Overview of Reports about OLTP Workload Characteristics 2.3.2 Analysis of SAP Hybris Workload 2.3.3 Summary 2.4 Conclusion 3 RELATED WORK ON QUERY MERGING 3.1 Merging the Execution of Operators 3.2 Merging the Execution of Subplans 3.3 Merging the Results of Subplans 3.4 Merging the Execution of Full Plans 3.5 Miscellaneous Works on Merging 3.6 Discussion 4 MERGING STATEMENTS IN MULTI STATEMENT TRANSACTIONS 4.1 Overview of Our Approach 4.1.1 Examples 4.1.2 Why Naïve Merging Fails 4.2 THE LEVIATHAN Approach 4.3 Formalizing THE LEVIATHAN Approach 4.3.1 Transaction Theory 4.3.2 Merging Under MVCC 4.4 Merging Reads Under Different Isolation Levels 4.4.1 Read Uncommitted 4.4.2 Read Committed 4.4.3 Repeatable Read 4.4.4 Snapshot Isolation 4.4.5 Serializable 4.4.6 Discussion 4.5 Merging Writes Under Different Isolation Levels 4.5.1 Read Uncommitted 4.5.2 Read Committed 4.5.3 Snapshot Isolation 4.5.4 Serializable 4.5.5 Handling Dependencies 4.5.6 Discussion 5 SYSTEM MODEL 5.1 Definition of the Term “Overload” 5.2 Basic Queuing Model 5.2.1 Option (1): Replacement with a Merger Thread 5.2.2 Option (2): Adding Merger Thread 5.2.3 Using Multiple Merger Threads 5.2.4 Evaluation 5.3 Extended Queue Model 5.3.1 Option (1): Replacement with a Merger Thread 5.3.2 Option (2): Adding Merger Thread 5.3.3 Evaluation 6 IMPLEMENTATION 6.1 Background: SAP HANA 6.2 System Design 6.2.1 Read Committed 6.2.2 Snapshot Isolation 6.3 Merger Component 6.3.1 Overview 6.3.2 Dequeuing 6.3.3 Merging 6.3.4 Sending 6.3.5 Updating MTx State 6.4 Challenges in the Implementation of Merging Writes 6.4.1 SQL String Implementation 6.4.2 Update Count 6.4.3 Error Propagation 6.4.4 Abort and Rollback 7 EVALUATION 7.1 Benchmark Settings 7.2 System Settings 7.2.1 Experiment I: End-to-end Response Time Within a SAP Hybris System 7.2.2 Experiment II: Dequeuing Strategy 7.2.3 Experiment III: Merging Improvement on Different Statement, Transaction and Workload Types 7.2.4 Experiment IV: End-to-End Latency in YCSB 7.2.5 Experiment V: Breakdown of Execution in YCSB 7.2.6 Discussion of System Settings 7.3 Merging in Interactive Transactions 7.3.1 Experiment VI: Merging TATP in Read Uncommitted 7.3.2 Experiment VII: Merging TATP in Read Committed 7.3.3 Experiment VIII: Merging TATP in Snapshot Isolation 7.4 Merging Queries in Stored Procedures Experiment IX: Merging TATP Stored Procedures in Read Committed 7.5 Merging SAP Hybris 7.5.1 Experiment X: CPU-time Breakdown on HANA Components 7.5.2 Experiment XI: Merging Media Query in SAP Hybris 7.5.3 Discussion of our Results in Comparison with Related Work 8 CONCLUSION 8.1 Summary 8.2 Future Research Directions REFERENCES A UML CLASS DIAGRAM

    Databases. Practicum

    Get PDF
    Практикум призначений для ознайомлення студентів з методикою використання мови SQL та сервера баз даних PostgreSQL як допоміжного навчального матеріалу для виконання лабораторних робіт з дисципліни «Бази даних». Починаючи із загальних ідей і термінології баз даних, посібник містить роз’яснення щодо інфологічного моделювання «Cутність-зв’язок», реляційної моделі як ядра мови SQL, а також синтаксису команд SQL і їх практичного використання в середовищі PostgreSQL. Застосування сервера PostgreSQL за допомогою графічного інтерфейсу користувача pgAdmin4, включаючи створення та зміну об’єктів бази даних, а також керування транзакціями є іншою важливою частиною практикуму. Навчальний посібник призначено для студентів денної форми навчання спеціальності 121 – «Інженерія програмного забезпечення» факультету прикладної математики КПІ ім. Ігоря Сікорського.The practicum is designed to acquaint students with the methodology of using SQL language and PostgreSQL database server as the support teaching materials for laboratory works of the "Databases" discipline. Starting with the general ideas and terminology of databases, the study guide provides clarifications on entity-relationship information modeling, the relational model as the core of the SQL language, as well as the syntax of SQL commands and their practical use in the PostgreSQL environment. The application of the PostgreSQL server by the pgAdmin4 GUI tool, including creating and modifying database objects, and the transactions management are the other significant parts of the practicum. The study guide is intended for full-time students of the speciality 121 - "Software Engineering" of Applied Mathematics Faculty of Igor Sikorsky Kyiv Polytechnic Institute

    Database Principles and Technologies – Based on Huawei GaussDB

    Get PDF
    This open access book contains eight chapters that deal with database technologies, including the development history of database, database fundamentals, introduction to SQL syntax, classification of SQL syntax, database security fundamentals, database development environment, database design fundamentals, and the application of Huawei’s cloud database product GaussDB database. This book can be used as a textbook for database courses in colleges and universities, and is also suitable as a reference book for the HCIA-GaussDB V1.5 certification examination. The Huawei GaussDB (for MySQL) used in the book is a Huawei cloud-based high-performance, highly applicable relational database that fully supports the syntax and functionality of the open source database MySQL. All the experiments in this book can be run on this database platform. As the world’s leading provider of ICT (information and communication technology) infrastructure and smart terminals, Huawei’s products range from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence

    Acta Cybernetica : Volume 25. Number 2.

    Get PDF

    Lazy State Determination for SQL databases

    Get PDF
    Transactional systems have seen various efforts to increase their throughput, mainly by making use of parallelism and efficient Concurrency Control techniques. Most approaches optimize the systems’ behaviour when under high contention. In this work, we strive towards reducing the system’s overall contention through Lazy State Determination (LSD). LSD is a new transactional API that leverages on futures to delay the accesses to the Database as much as possible, reducing the amount of time that transactions require to operate under isolation and, thus, reducing the contention window. LSD was shown to be a promising solution for Key-Value Stores. Now, our focus turns to Relational Database Management Systems, as we attempt to implement and evaluate LSD in this new setting. This implementation was done through a custom JDBC driver to minimize required modifications to any external platform. Results show that the reduction of the contention window effectively improves the success rate of transactional applications. However, our current implementation exhibits some performance issues that must be further investigated and addressed.Os sistemas transacionais têm sido alvo de esforços variados para aumentar a sua velocidade de processamento, principalmente através de paralelismo e de técnicas de controlo de concorrência mais eficazes. A maior parte das soluções propostas visam a otimização do comportamento destes sistemas em ambientes de elevada contenção. Neste trabalho, nós iremos reduzir a contenção no sistema recorrendo ao Lazy State Determination (LSD). O LSD é uma nova API transacional que promove a utilização de futuros para adiar o máximo os acessos à Base de Dados, reduzindo assim o tempo que cada transação requer para executar em isolamento e, por consequência, reduzindo também a janela de contenção. O LSD tem-se mostrado uma solução promissora para bases de dados Chave-Valor. O nosso foco foi agora redirecionado para Sistemas de Gestão de Bases de Dados Relacionais, com uma tentativa de implementação e avaliação do LSD neste novo contexto. Este objetivo foi concretizado através da implementação de um controlador JDBC para minimizar quaisquer alterações a plataformas externas. Os resultados mostram que a redução da janela de contenção efetivamente melhora a taxa de sucesso de aplicações transacionais. No entanto, a nossa implementação atual tem alguns problemas de desempenho que necessitam de ser investigados e endereçados

    ADDING PERSISTENCE TO MAIN MEMORY PROGRAMMING

    Get PDF
    Unlocking the true potential of the new persistent memories (PMEMs) requires eliminating traditional persistent I/O abstractions altogether, by introducing persistent semantics directly into main memory programming. Such a programming model elevates failure atomicity to a first-class application property in addition to in-memory data layout, concurrency-control, and fault tolerance, and therefore requires redesign of programming abstractions for both program correctness and maximum performance gains. To address these challenges, this thesis proposes a set of system software designs that integrate persistence with main memory programming, and makes the following contributions. First, this thesis proposes a PMEM-aware I/O runtime, NVStream, that supports fast durable streaming I/O. NVStream uses a memory-based I/O interface that integrates with existing I/O data movement operations of an application to accelerate persistent data writes. NVStream carefully designs its persistent data storage layout and crash-consistent semantics to match both application and PMEM characteristics. Specifically, we leverage the streaming nature of I/O in HPC workflows, to benefit from using a log-structured PMEM storage engine design, that uses relaxed write orderings and append-only failure-atomic semantics to form strongly consistent application checkpoints. Furthermore, we identify that optimizing the I/O software stack exposes the PMEM bandwidth limitations as a bottleneck during parallel HPC I/O writes, and propose a novel data movement design – PHX. PHX uses alternative network data movement paths available in datacenters to ease up the bandwidth pressure on the PMEM memory interconnects, all while maintaining the correctness of the persistent data. Next, the thesis explores the challenges and opportunities of using PMEM for true main memory persistent programming – a single data domain for both runtime and persistent applicationstate. Such a programming model includes maintaining ACID properties during each and every update to applications persistent structures. ACID-qualified persistent programming for multi-threaded applications is hard, as the programmer has to reason about both crash-consistency and synchronization – crash-sync – semantics for programming correctness. The thesis contributes new understanding of the correctness requirements for mixing different crash-consistent and synchronization protocols, characterizes the performance of different crash-sync realizations for different applications and hardware architectures, and draws actionable insights for future designs of PMEM systems. Finally, the application state stored on node-local persistent memory is still vulnerable to catastrophic node failures. The thesis proposes a replicated persistent memory runtime, Blizzard, that supports truly fault tolerant, concurrent and persistent data-structure programming. Blizzard carefully integrates userspace networking with byte addressable PMEM for a fast, persistent memory replication runtime. The design also incorporates a replication-aware crash-sync protocol that supports consistent and concurrent updates on persistent data-structures. Blizzard offers applications the flexibility to use the data structures that best match their functional requirements, while offering better performance, and providing crucial reliability guarantees lacking from existing persistent memory runtimes.Ph.D

    Ensuring compliance with data privacy and usage policies in online services

    Get PDF
    Online services collect and process a variety of sensitive personal data that is subject to complex privacy and usage policies. Complying with the policies is critical, often legally binding for service providers, but it is challenging as applications are prone to many disclosure threats. We present two compliance systems, Qapla and Pacer, that ensure efficient policy compliance in the face of direct and side-channel disclosures, respectively. Qapla prevents direct disclosures in database-backed applications (e.g., personnel management systems), which are subject to complex access control, data linking, and aggregation policies. Conventional methods inline policy checks with application code. Qapla instead specifies policies directly on the database and enforces them in a database adapter, thus separating compliance from the application code. Pacer prevents network side-channel leaks in cloud applications. A tenant’s secrets may leak via its network traffic shape, which can be observed at shared network links (e.g., network cards, switches). Pacer implements a cloaked tunnel abstraction, which hides secret-dependent variation in tenant’s traffic shape, but allows variations based on non-secret information, enabling secure and efficient use of network resources in the cloud. Both systems require modest development efforts, and incur moderate performance overheads, thus demonstrating their usability.Onlinedienste sammeln und verarbeiten eine Vielzahl sensibler persönlicher Daten, die komplexen Datenschutzrichtlinien unterliegen. Die Einhaltung dieser Richtlinien ist häufig rechtlich bindend für Dienstanbieter und gleichzeitig eine Herausforderung, da Fehler in Anwendungsprogrammen zu einer unabsichtlichen Offenlegung führen können. Wir präsentieren zwei Compliance-Systeme, Qapla und Pacer, die Richtlinien effizient einhalten und gegen direkte und indirekte Offenlegungen durch Seitenkanäle schützen. Qapla verhindert direkte Offenlegungen in datenbankgestützten Anwendungen. Herkömmliche Methoden binden Richtlinienprüfungen in Anwendungscode ein. Stattdessen gibt Qapla Richtlinien direkt in der Datenbank an und setzt sie in einem Datenbankadapter durch. Die Konformität ist somit vom Anwendungscode getrennt. Pacer verhindert Netzwerkseitenkanaloffenlegungen in Cloud-Anwendungen. Geheimnisse eines Nutzers können über die Form des Netzwerkverkehr offengelegt werden, die bei gemeinsam genutzten Netzwerkelementen (z. B. Netzwerkkarten, Switches) beobachtet werden kann. Pacer implementiert eine Tunnelabstraktion, die Geheimnisse im Netzwerkverkehr des Nutzers verbirgt, jedoch Variationen basier- end auf nicht geheimen Informationen zulässt und eine sichere und effiziente Nutzung der Netzwerkressourcen in der Cloud ermöglicht. Beide Systeme erfordern geringen Entwicklungsaufwand und verursachen einen moderaten Leistungsaufwand, wodurch ihre Nützlichkeit demonstriert wird

    Европейский и национальный контексты в научных исследованиях

    Get PDF
    В настоящем электронном сборнике «Европейский и национальный контексты в научных исследованиях. Технология» представлены работы молодых ученых по геодезии и картографии, химической технологии и машиностроению, информационным технологиям, строительству и радиотехнике. Предназначены для работников образования, науки и производства. Будут полезны студентам, магистрантам и аспирантам университетов.=In this Electronic collected materials “National and European dimension in research. Technology” works in the fields of geodesy, chemical technology, mechanical engineering, information technology, civil engineering, and radio-engineering are presented. It is intended for trainers, researchers and professionals. It can be useful for university graduate and post-graduate students
    corecore