
    Performance Optimization Strategies for Transactional Memory Applications

    This thesis presents tools for Transactional Memory (TM) applications that cover multiple TM systems (software, hardware, and hybrid TM) and use information from all layers of the TM software stack. To that end, the thesis addresses a number of challenges in extracting static information, information about run-time behavior, and expert-level knowledge, and uses them to develop new methods and strategies for the optimization of TM applications.
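Java itself has no standard TM construct, but the conflict/abort/retry behaviour that such optimization strategies target can be sketched with plain atomics. The class below is a hypothetical illustration of that behaviour, not a tool or API from the thesis.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch (not from the thesis): a hand-rolled optimistic update loop
// that mimics the conflict/abort/retry behaviour a software TM runtime exhibits.
// TM performance tools typically try to locate and reduce exactly these retries.
public class OptimisticCounter {
    private final AtomicLong value = new AtomicLong();
    private final AtomicLong aborts = new AtomicLong();

    public long increment() {
        while (true) {
            long snapshot = value.get();                 // "transactional" read
            long updated = snapshot + 1;                 // speculative computation
            if (value.compareAndSet(snapshot, updated)) {
                return updated;                          // commit succeeded
            }
            aborts.incrementAndGet();                    // conflict: retry, like a TM abort
        }
    }

    public long abortCount() {
        return aborts.get();
    }
}
```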

    OSCAR. A Noise Injection Framework for Testing Concurrent Software

    “Moore’s Law” is a well-known observation in computer science describing the steady yearly increase in the transistor density of processors. Although it has held for the last 57 years, thermal limits on how far a processor’s core frequencies can be raised have placed physical limits on single-core performance scaling. The industry has since shifted towards multicore architectures, which offer better and more scalable performance, but which in turn force programmers to adopt the concurrent programming paradigm when designing new software if they wish to exploit that added performance. This paradigm unfortunately brings with it a plethora of additional errors that stem directly from the (poor) use of concurrency techniques. Furthermore, concurrent programs are notoriously hard to design and to verify, and researchers continuously develop new, more effective and efficient methods of doing so. Noise injection, the theme of this dissertation, is one such method. It relies on the “probe effect”: the observable shift in the behaviour of concurrent programs upon the introduction of noise into their routines. The abandonment of ConTest, a popular proprietary and closed-source noise injection framework for testing concurrent software written in the Java programming language, has left a void in the availability of noise injection frameworks for this language. To mitigate this void, this dissertation proposes OSCAR, a novel open-source noise injection framework for the Java programming language that relies on static bytecode instrumentation to inject noise. OSCAR provides a free and well-documented noise injection tool for research, pedagogical, and industry usage. Additionally, we propose a novel taxonomy for categorizing new and existing noise injection heuristics, together with a new method for generating and analysing concurrent software traces based on string comparison metrics. After injecting noise into programs from the IBM Concurrent Benchmark with different heuristics, we observed that OSCAR is highly effective in increasing the coverage of the interleaving space, and that the different heuristics provide diverse trade-offs between the cost and benefit (time/coverage) of the noise injection process.
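As a rough sketch of what the injected noise amounts to at the source level (OSCAR itself rewrites bytecode, and the helper and class below are hypothetical illustrations rather than its actual API), a random-sleep heuristic inserted before a synchronisation point might look like this:

```java
import java.util.concurrent.ThreadLocalRandom;

// Source-level equivalent of the kind of call a noise injector inserts before a
// synchronisation point. OSCAR itself rewrites bytecode; this helper and the
// Account class are hypothetical illustrations, not its actual API.
public final class Noise {
    private Noise() {}

    // A simple random-sleep heuristic: perturb the scheduler with the given probability.
    public static void maybeSleep(double probability, int maxMillis) {
        if (ThreadLocalRandom.current().nextDouble() < probability) {
            try {
                Thread.sleep(ThreadLocalRandom.current().nextInt(maxMillis + 1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}

class Account {
    private int balance;

    void deposit(int amount) {
        Noise.maybeSleep(0.3, 5);   // injected noise widens the window for rare interleavings
        synchronized (this) {
            balance += amount;
        }
    }
}
```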

    Analysis of Storage Performance, in Memory and on Input/Output Devices, from an Execution Trace

    Data storage is an essential resource for the computer industry. Storage devices must be fast and reliable to meet the growing demands of the data-driven economy. Storage technologies can be classified into two main categories: mass storage and main memory storage. Mass storage can store large amounts of data persistently. Data is saved locally on input/output devices, such as Hard Disk Drives (HDD) and Solid-State Drives (SSD), or remotely on distributed storage systems. Main memory storage temporarily holds the data needed by running programs. Main memory is characterized by its high access speed, essential to quickly provide data to the Central Processing Unit (CPU). Operating systems use several mechanisms to manage storage devices, such as disk schedulers and memory allocators. The processing time of a storage request is affected by the interaction between several subsystems, which complicates the debugging task. Existing tools, such as benchmarking tools, provide a general idea of overall system performance but do not accurately identify the causes of poor performance. Dynamic analysis through execution tracing is a solution for the detailed runtime analysis of storage systems.
Tracing collects precise data about the internal behavior of the system, which helps in detecting performance problems that are otherwise difficult to identify. The goal of this thesis is to provide a tool to analyze storage performance based on low-level trace events. The main challenges addressed by this tool are: collecting the required data using kernel and userspace tracing, limiting the overhead of tracing and the size of the generated traces, synchronizing the traces collected from different sources, providing multi-level analyses covering several aspects of storage performance, and lastly proposing abstractions allowing users to easily understand the traces. We carefully designed and inserted the instrumentation needed for the analyses. The tracepoints provide full visibility into the system and track the lifecycle of storage requests, from creation to processing. The Linux Trace Toolkit Next Generation (LTTng), a free and low-overhead tracer, is used for data collection. This tracer is characterized by its stability and efficiency with highly parallel applications, thanks to the lock-free synchronization mechanisms used to update the content of the trace buffers. We also contributed a patch that allows LTTng to capture the call stacks of userspace events.
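On the user-space side, the kind of lifecycle instrumentation described above can be sketched with plain java.util.logging events; the class and event names below are hypothetical, and the kernel tracepoints and the forwarding of these records into an LTTng trace (for example via its Java agent) are omitted.

```java
import java.util.logging.Logger;

// User-space side only, as a hypothetical sketch: the application emits begin/end
// events around a storage request so a tracer can reconstruct the request lifecycle.
// Plain java.util.logging is used here; the kernel-side tracepoints and the wiring
// that forwards these records to LTTng are omitted. Event names are made up.
public class StorageRequestTracer {
    private static final Logger TRACE = Logger.getLogger("storage.requests");

    public byte[] read(String path, int length) {
        long requestId = System.nanoTime();              // cheap unique-ish id for correlation
        TRACE.info("storage_request_begin id=" + requestId + " path=" + path + " len=" + length);
        try {
            return doRead(path, length);                 // actual device access elided
        } finally {
            TRACE.info("storage_request_end id=" + requestId);
        }
    }

    private byte[] doRead(String path, int length) {
        return new byte[length];                         // placeholder for real I/O
    }
}
```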

    Towards Meta-Level Engineering and Tooling for Complex Concurrent Systems

    With the widespread use of multicore processors, software becomes more and more diverse in its use of parallel computing resources. To address all application requirements, each with the appropriate abstraction, developers mix and match various concurrency abstractions made available to them via libraries and frameworks. Unfortunately, today's tools, such as debuggers and profilers, do not support the diversity of these abstractions. Instead of enabling developers to reason about the high-level programming concepts they used to express their programs, the tools work only at the level of the libraries' implementations. While this is a common problem for other libraries and frameworks as well, the complexity of concurrency exacerbates the issue further, and reasoning at the higher levels of the concurrency abstractions is essential to manage the associated complexity. In this position paper, we identify open research issues and propose to build tools based on a common meta-level interface that enables developers to reason about their programs in terms of the high-level concepts they used to implement them.

    System and Application Performance Analysis Patterns Using Software Tracing

    Software systems have become increasingly complex, which makes it difficult to detect the root causes of performance degradation. Software tracing has been used extensively to analyze systems at run time, detect performance issues, and uncover their causes. Several studies use tracing and other dynamic analysis techniques for performance analysis, focusing on specific system characteristics such as latency and performance bugs. In this thesis, we review the literature to build a catalogue of performance analysis patterns that can be detected using trace data. The goal is to help developers debug run-time and performance issues more efficiently. The patterns are formalized and implemented so that developers can readily refer to them while analyzing large execution traces. The thesis focuses on traces of system calls generated by the Linux kernel, because no application is an island: we cannot ignore the complex interactions an application has with the operating system kernel if we are to detect potential performance issues.
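As an illustration of how such a pattern can be formalized and run against trace data, the sketch below flags system calls whose duration exceeds a threshold; the event format, pattern name, and threshold are hypothetical and do not reproduce the thesis's catalogue.

```java
import java.util.List;

// Hypothetical sketch of checking one catalogued pattern against trace data: flag
// system calls whose duration exceeds a threshold. The event format, pattern name,
// and threshold are illustrative and do not reproduce the thesis's catalogue.
public class LongSyscallPattern {
    public record SyscallEvent(String name, long durationNanos) {}

    private final long thresholdNanos;

    public LongSyscallPattern(long thresholdNanos) {
        this.thresholdNanos = thresholdNanos;
    }

    // Returns every event in the trace that matches the "long system call" pattern.
    public List<SyscallEvent> match(List<SyscallEvent> trace) {
        return trace.stream()
                .filter(e -> e.durationNanos() >= thresholdNanos)
                .toList();
    }
}
```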

    An automated refactoring approach to improve IoT software quality

    Internet of Things (IoT) software should provide good support for IoT devices, as these devices are growing in number and complexity. Communication between IoT devices is largely realized concurrently, so ensuring the correctness of concurrent accesses is a major challenge in IoT software development. This paper proposes a general refactoring framework for fine-grained read-write locking and implements an automatic refactoring tool to help developers convert built-in monitors into fine-grained ReentrantReadWriteLocks. Several program analysis techniques, such as visitor pattern analysis, alias analysis, and side-effect analysis, are used to assist the refactoring. Our tool is evaluated on several real-world applications, including HSQLDB, Cassandra, JGroups, Freedomotic, and MINA. A total of 1072 built-in monitors are refactored into ReentrantReadWriteLocks. The experiments reveal that our tool can help developers refactor to ReentrantReadWriteLocks, saving them time and effort. This research is supported by the Guangdong Province Key Research and Development Plan (2019B010137004), the National Key Research and Development Plan (2018YEB1004003), the National Natural Science Foundation of China (U1636215, 61871140, 61872100), in part by the Scientific Research Foundation of Hebei Educational Department under Grant ZD2019093, in part by the Fundamental Research Foundation of Hebei Province under Grant 18960106D, and the Guangdong Province Universities and Colleges Pearl River Scholar Funded Scheme (2019).
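The essence of the refactoring, rewriting a monitor-protected class to use a fine-grained ReentrantReadWriteLock so that concurrent readers no longer block each other, can be sketched as follows; the class is a made-up example, not one of the refactored applications.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustration of the refactoring the tool automates: a monitor-protected field
// rewritten to use a fine-grained read-write lock so concurrent readers no longer
// block each other. The class is a made-up example, not taken from the paper.
class SensorRegistry {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private double lastReading;

    // Before: synchronized double get() { return lastReading; }
    double get() {
        lock.readLock().lock();
        try {
            return lastReading;       // many readers may hold the read lock at once
        } finally {
            lock.readLock().unlock();
        }
    }

    // Before: synchronized void set(double v) { lastReading = v; }
    void set(double value) {
        lock.writeLock().lock();
        try {
            lastReading = value;      // writers remain exclusive
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```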

    C++ Design Patterns for Low-latency Applications Including High-frequency Trading

    This work aims to bridge the existing knowledge gap in the optimisation of latency-critical code, specifically focusing on high-frequency trading (HFT) systems. The research culminates in three main contributions: the creation of a Low-Latency Programming Repository, the optimisation of a market-neutral statistical arbitrage pairs trading strategy, and the implementation of the Disruptor pattern in C++. The repository serves as a practical guide and is enriched with rigorous statistical benchmarking, while the trading strategy optimisation led to substantial improvements in speed and profitability. The Disruptor pattern showed significant performance gains over traditional queueing methods. Evaluation metrics include speed, cache utilisation, and statistical significance, among others. Techniques such as cache warming and constexpr showed the most significant gains in latency reduction. Future directions involve expanding the repository, testing the optimised trading algorithm in a live trading environment, and integrating the Disruptor pattern with the trading algorithm for comprehensive system benchmarking. The work is oriented towards academics and industry practitioners seeking to improve performance in latency-sensitive applications.
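For readers unfamiliar with the Disruptor pattern, the sketch below shows its core idea, a pre-allocated ring buffer coordinated by sequence counters rather than a locked queue, in a minimal single-producer/single-consumer form. It is written in Java (where the pattern originated) for consistency with the other examples here and does not reproduce the paper's C++ implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal single-producer/single-consumer sketch of the Disruptor idea: a
// pre-allocated ring buffer coordinated by sequence counters instead of a locked
// queue. Not the paper's C++ implementation; safe only for one producer thread
// and one consumer thread.
public class SpscRingBuffer<T> {
    private final Object[] slots;
    private final int mask;                                   // capacity must be a power of two
    private final AtomicLong published = new AtomicLong(-1);  // last slot written
    private final AtomicLong consumed  = new AtomicLong(-1);  // last slot read

    public SpscRingBuffer(int capacityPowerOfTwo) {
        this.slots = new Object[capacityPowerOfTwo];
        this.mask = capacityPowerOfTwo - 1;
    }

    public boolean offer(T value) {
        long next = published.get() + 1;
        if (next - consumed.get() > slots.length) {
            return false;                          // buffer full, producer backs off
        }
        slots[(int) (next & mask)] = value;
        published.set(next);                       // publish only after the slot is written
        return true;
    }

    @SuppressWarnings("unchecked")
    public T poll() {
        long next = consumed.get() + 1;
        if (next > published.get()) {
            return null;                           // nothing published yet
        }
        T value = (T) slots[(int) (next & mask)];
        consumed.set(next);                        // frees the slot for the producer
        return value;
    }
}
```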