    A Statistical Approach for Detecting Memory Leaks in Java Applications

    Modern managed runtime environments and programming languages greatly simplify the creation and maintenance of applications. One of the best-known examples of such a managed runtime environment and language is the Java Virtual Machine together with the Java programming language. Despite the built-in garbage collector, the memory leak problem is still relevant in Java: unused objects are prevented from being reclaimed, and memory is wasted. The problem is especially critical for applications expected to run uninterrupted around the clock, as running out of memory is one of the few failure modes that can terminate an entire Java application. The best indicator of whether an object is in use is the time of its last access; the main disadvantage of this metric, however, is the performance overhead it incurs. This thesis investigates the memory leak problem in Java and proposes a novel approach for memory leak detection and diagnosis. It describes an alternative way to estimate the 'unusedness' of objects. The main hypothesis is that leaked objects can be distinguished from non-leaked ones by applying statistical methods to object lifetimes, observing the ages of object populations grouped by their allocation points. The proposed approach is much cheaper performance-wise, since information about an object needs to be recorded only at its allocation.
    The research conducted for the thesis is applied in the memory leak detection tool Plumbr, which is currently used successfully in various production environments. After the introduction and an overview of the state of the art, the thesis reviews existing solutions and proposes a classification of memory leak detection approaches. Next, the statistical baseline approach to memory leak detection is described, along with the main metric used to distinguish leaking objects from non-leaking ones, followed by an analysis of this metric's shortcomings. Based on this analysis, additional metrics are designed and machine learning algorithms are applied to statistical data that Plumbr collected from real applications and production environments. Finally, case studies of real applications and a comparison with one previous memory leak detection solution are performed in order to evaluate the performance overhead of the tool.
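    The abstract does not spell out the exact statistics used, so the following is only a minimal Java sketch of the core idea, under assumed names (LeakSuspectDetector, onAllocation, suspects): record a single timestamp per object at allocation, group live objects by allocation site, and flag sites whose mean live-object age stands out. The thesis's actual algorithm refines such a baseline with further metrics and machine learning.

        import java.util.List;
        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        import java.util.concurrent.CopyOnWriteArrayList;

        // Sketch only: per-site bookkeeping of creation timestamps for live objects.
        final class LeakSuspectDetector {
            private final Map<String, List<Long>> liveBySite = new ConcurrentHashMap<>();

            // Called when an object is allocated at the given (hypothetical) site id.
            void onAllocation(String siteId) {
                liveBySite.computeIfAbsent(siteId, s -> new CopyOnWriteArrayList<>())
                          .add(System.nanoTime());
            }

            // Called when the garbage collector reclaims an object from that site.
            void onReclaim(String siteId, long creationTime) {
                List<Long> timestamps = liveBySite.get(siteId);
                if (timestamps != null) timestamps.remove(creationTime);
            }

            // Sites whose mean live-object age exceeds the global mean by `factor`.
            List<String> suspects(double factor) {
                long now = System.nanoTime();
                double globalMean = liveBySite.values().stream()
                        .flatMap(List::stream)
                        .mapToLong(t -> now - t).average().orElse(0);
                return liveBySite.entrySet().stream()
                        .filter(e -> e.getValue().stream()
                                .mapToLong(t -> now - t).average().orElse(0)
                                > factor * globalMean)
                        .map(Map.Entry::getKey)
                        .toList();
            }
        }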

    Adonis: Practical and Efficient Control Flow Recovery through OS-Level Traces

    Control flow recovery is critical to ensuring software quality, especially for large-scale software in production environments. However, the efficiency of most current control flow recovery techniques is compromised by their runtime overheads along with deployment and development costs. To tackle this problem, we propose a novel solution, Adonis, which harnesses OS-level traces, such as dynamic library calls and system call traces, to efficiently and safely recover control flows in practice. Adonis operates in two steps: it first identifies the call-sites of trace entries, then it performs pair-wise symbolic execution to recover valid execution paths. This technique has several advantages. First, Adonis does not require the insertion of any probes into existing applications, thereby minimizing runtime cost. Second, given that OS-level traces are hardware-independent, Adonis can be implemented across various hardware configurations without hardware-specific engineering effort, thus reducing deployment cost. Third, as Adonis is fully automated and does not depend on manually created logs, it avoids additional development cost. We conducted an evaluation of Adonis on representative desktop applications and real-world IoT applications. Adonis faithfully recovers control flow with 86.8% recall and 81.7% precision. Compared to the state-of-the-art log-based approach, Adonis not only covers all the execution paths that approach recovers, but also recovers 74.9% of the statements it cannot cover. In addition, the runtime cost of Adonis is 18.3× lower than that of the instrumentation-based approach, and the analysis time and storage cost of Adonis (indicative of the deployment cost) are 50× and 443× smaller, respectively, than those of the hardware-based approach. To facilitate future replication and extension of this work, we have made the code and data publicly available.
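    As a rough illustration of the pair-wise recovery step, the sketch below searches for a path between the call-sites of two consecutive trace entries in a control-flow graph. A plain breadth-first search stands in for Adonis's symbolic execution, which additionally validates that a candidate path is feasible; the CFG representation and all names here are assumptions for illustration.

        import java.util.ArrayDeque;
        import java.util.Deque;
        import java.util.HashMap;
        import java.util.LinkedList;
        import java.util.List;
        import java.util.Map;
        import java.util.Optional;

        final class PairwisePathRecovery {
            // Adjacency list of a (hypothetical) control-flow graph over basic blocks.
            private final Map<String, List<String>> cfg;

            PairwisePathRecovery(Map<String, List<String>> cfg) { this.cfg = cfg; }

            // Shortest CFG path from one observed call-site to the next, if any.
            Optional<List<String>> pathBetween(String fromSite, String toSite) {
                Map<String, String> parent = new HashMap<>();
                Deque<String> queue = new ArrayDeque<>();
                parent.put(fromSite, null);
                queue.add(fromSite);
                while (!queue.isEmpty()) {
                    String node = queue.poll();
                    if (node.equals(toSite)) {
                        LinkedList<String> path = new LinkedList<>();
                        for (String n = node; n != null; n = parent.get(n)) path.addFirst(n);
                        return Optional.of(path);
                    }
                    for (String next : cfg.getOrDefault(node, List.of())) {
                        if (!parent.containsKey(next)) {
                            parent.put(next, node);
                            queue.add(next);
                        }
                    }
                }
                return Optional.empty();
            }
        }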

    Performance Benchmarking of Application Monitoring Frameworks

    Application-level monitoring of continuously operating software systems provides insights into their dynamic behavior, helping to maintain their performance and availability at runtime. Such monitoring may impose significant runtime overhead on the monitored system, depending on the number and location of the instrumentation probes used. In order to improve a system's instrumentation and to reduce the resulting monitoring overhead, it is necessary to know the performance impact of each probe. While many monitoring frameworks claim to have minimal impact on performance, these claims are often not backed up by a detailed performance evaluation determining the actual cost of monitoring. Benchmarks can be used as an effective and affordable way to perform such evaluations. However, no benchmark specifically targeting the overhead of monitoring itself exists, and no established benchmark engineering methodology provides guidelines for the design, execution, and analysis of benchmarks. This thesis introduces a benchmark approach to measure the performance overhead of application-level monitoring frameworks. The core contributions of this approach are 1) a definition of common causes of monitoring overhead, 2) a general benchmark engineering methodology, 3) the MooBench micro-benchmark to measure and quantify causes of monitoring overhead, and 4) detailed performance evaluations of three different application-level monitoring frameworks. Extensive experiments demonstrate the feasibility and practicality of the approach and validate the benchmark results. The developed benchmark is available as open source software, and the results of all experiments are available for download to facilitate further validation and replication of the results.
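    In the spirit of this benchmark approach, monitoring overhead can be quantified by timing a fixed workload with and without a simulated probe and reporting the per-call difference. The sketch below is a deliberately minimal illustration with hypothetical names; MooBench's actual methodology adds warm-up analysis, multiple JVM runs, and statistical evaluation.

        // Sketch only: compares per-call time of a plain vs. an instrumented workload.
        public final class MonitoringOverheadBenchmark {
            private static final int WARMUP = 100_000, MEASURED = 1_000_000;
            static volatile long sink; // consuming results defeats dead-code elimination

            static long work() { // fixed synthetic workload
                long sum = 0;
                for (int i = 0; i < 100; i++) sum += i;
                return sum;
            }

            static long monitoredWork() {
                long before = System.nanoTime();    // simulated probe: entry event
                long result = work();
                sink += System.nanoTime() - before; // simulated collection cost
                return result;
            }

            static double nanosPerCall(boolean monitored) {
                for (int i = 0; i < WARMUP; i++) sink += monitored ? monitoredWork() : work();
                long start = System.nanoTime();
                for (int i = 0; i < MEASURED; i++) sink += monitored ? monitoredWork() : work();
                return (System.nanoTime() - start) / (double) MEASURED;
            }

            public static void main(String[] args) {
                double base = nanosPerCall(false), instrumented = nanosPerCall(true);
                System.out.printf("baseline %.1f ns, monitored %.1f ns, overhead %.1f ns%n",
                        base, instrumented, instrumented - base);
            }
        }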

    Augmenting IDEs with Runtime Information for Software Maintenance

    Object-oriented language features such as inheritance, abstract types, late binding, and polymorphism lead to distributed and scattered code, rendering a software system hard to understand and maintain. The integrated development environment (IDE), the primary tool developers use to maintain software systems, usually operates purely on static source code and does not reveal dynamic relationships between distributed source artifacts, which makes it difficult for developers to understand and navigate software systems. Another shortcoming of today's IDEs is the large amount of information with which they typically overwhelm developers. Large software systems encompass several thousand source artifacts such as classes and methods. These static artifacts are presented by IDEs in views such as trees or source editors. To gain an understanding of a system, developers have to open many such views, which leads to a workspace cluttered with different windows or tabs. Navigating through the code or maintaining a working context is thus difficult for developers working on large software systems. In this dissertation we address the question of how to augment IDEs with dynamic information to better navigate scattered code while not overwhelming developers with even more information in the IDE views. We claim that by first reducing the amount of information developers have to deal with, we can then embed dynamic information in the familiar source perspectives of IDEs to better comprehend and navigate large software spaces. We propose means to reduce or mitigate this information overload by highlighting relevant source elements, by explicitly representing the working context, and by automatically housekeeping the workspace in the IDE. We then improve navigation of scattered code by explicitly representing dynamic collaboration and software features in the static source perspectives of IDEs. We validate our claim by conducting empirical experiments with developers and by analyzing recorded development sessions.
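    One way to reduce the presented information, sketched below under assumed names (WorkingContext, onInteraction, topK), is to score source artifacts by recency and frequency of developer interaction and highlight only the top-ranked ones; the dissertation's actual relevance model is not specified in this abstract.

        import java.util.HashMap;
        import java.util.List;
        import java.util.Map;

        // Sketch only: exponential-decay interest scores for source artifacts.
        final class WorkingContext {
            private static final double DECAY = 0.9, BOOST = 1.0;
            private final Map<String, Double> interest = new HashMap<>();

            // Decay all scores, then boost the artifact the developer just touched.
            void onInteraction(String artifact) {
                interest.replaceAll((a, score) -> score * DECAY);
                interest.merge(artifact, BOOST, Double::sum);
            }

            // The k most relevant artifacts, e.g. the only ones to keep highlighted.
            List<String> topK(int k) {
                return interest.entrySet().stream()
                        .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                        .limit(k)
                        .map(Map.Entry::getKey)
                        .toList();
            }
        }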

    Supporting Source Code Feature Analysis Using Execution Trace Mining

    Software maintenance is a significant phase of the software life-cycle. Once a system is developed, the main focus shifts to maintenance, to keep the system up to date. A system may be changed for various reasons, such as fulfilling customer requirements, fixing bugs, or optimizing existing code. Code needs to be studied and understood before any modification is made to it. Understanding code is a time-intensive and often complicated part of software maintenance that is supported by documentation and various tools such as profilers, debuggers, and source code analysis techniques. However, most of these tools fail to assist in locating the portions of the code that implement the functionality the software developer is focusing on. Mining execution traces can help developers identify the parts of the source code specific to the functionality of interest and at the same time help them understand the behaviour of the code. We propose a use-driven hybrid framework of static and dynamic analyses to mine and manage execution traces, supporting software developers in understanding how the system's functionality is implemented through feature analysis. We express a system's use as a set of tests. In our approach, we develop a set of uses that represents how a system is used or how a user uses some specific functionality. Each use set describes a user's interaction with the system. To manage large and complex traces we organize them by system use and segment them by user interface events. The segmented traces are also clustered based on internal and external method types. The clusters are further categorized into groups based on application programming interfaces and active clones. To further support comprehension we propose a taxonomy of metrics which are used to quantify the trace. To validate the framework we built a tool called TrAM that implements trace mining and provides visualization features. It can quantify trace method information, mine similar code fragments called active clones, cluster methods based on types, categorize them into groups, and quantify their behavioural aspects using a set of metrics. The tool also lets users visualize the design and implementation of a system using images, filtering, grouping, events, and system use, and presents them with values calculated using trace, group, clone, and method metrics. We also conducted a case study on five different subject systems using the tool to determine the dynamic properties of source code clones at runtime and answered three research questions using our findings. We compared our tool with trace mining tools and profilers in terms of features and scenarios. Finally, we evaluated TrAM by conducting a user study on its effectiveness, usability, and information management.
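    The trace management step can be pictured as splitting the raw event stream into segments at user-interface events. The sketch below uses an assumed TraceEvent type and illustrates the idea rather than TrAM's actual API.

        import java.util.ArrayList;
        import java.util.List;

        final class TraceSegmenter {
            // Hypothetical trace entry: a method call, possibly marking a UI event.
            record TraceEvent(String method, boolean uiEvent) {}

            // Split the trace into segments, each starting at a UI event.
            static List<List<TraceEvent>> segmentByUiEvents(List<TraceEvent> trace) {
                List<List<TraceEvent>> segments = new ArrayList<>();
                List<TraceEvent> current = new ArrayList<>();
                for (TraceEvent e : trace) {
                    if (e.uiEvent() && !current.isEmpty()) {
                        segments.add(current);
                        current = new ArrayList<>();
                    }
                    current.add(e);
                }
                if (!current.isEmpty()) segments.add(current);
                return segments;
            }
        }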

    Simulating and analyzing commercial workloads and computer systems

    Continuous and Efficient Lock Profiling for Java on Multicore Architectures

    Today, the processing of large datasets is generally parallelized and performed on computers with many cores. However, locks can serialize the execution of these cores and hurt latency and processing throughput. Spotting these lock contention issues in vitro (i.e., during the development phase) is complex, because it is difficult to reproduce a production environment, to create a realistic workload representative of the software's context of use, and to test every possible deployment configuration on which the software will run. This thesis introduces Free Lunch, a lock profiler that diagnoses phases of high lock contention in vivo (i.e., during the operational phase). Free Lunch is designed around a new metric, the Critical Section Pressure (CSP), which evaluates the impact of lock contention on overall thread progress. Free Lunch is integrated into the HotSpot JVM in order to minimize overhead, and regularly reports the CSP during execution in order to detect transient lock-related issues. Free Lunch is evaluated on 31 benchmarks from DaCapo 9.12, SpecJVM08 and SpecJBB2005, and on the Cassandra database. We were able to pinpoint phases of lock contention in 6 applications, some of which were not detected by existing profilers. With this information, we improved the performance of Xalan by 15% by rewriting a single line of code, and identified a phase of high lock contention in Cassandra during the replay of transactions after a node crash. Free Lunch never degraded performance by more than 6%, which makes it suitable for continuous deployment in an operational environment.
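    A CSP-like ratio can be approximated outside the JVM by sampling how long threads spend blocked on monitors relative to a sampling interval. The sketch below uses the standard ThreadMXBean API purely as an illustration; Free Lunch itself computes the CSP inside HotSpot with far lower overhead.

        import java.lang.management.ManagementFactory;
        import java.lang.management.ThreadInfo;
        import java.lang.management.ThreadMXBean;

        final class CspSampler {
            private final ThreadMXBean threads = ManagementFactory.getThreadMXBean();

            CspSampler() {
                if (threads.isThreadContentionMonitoringSupported())
                    threads.setThreadContentionMonitoringEnabled(true);
            }

            // Fraction of the interval the given threads spent blocked on monitors.
            double sample(long[] threadIds, long intervalMillis) throws InterruptedException {
                long[] before = blockedMillis(threadIds);
                Thread.sleep(intervalMillis);
                long[] after = blockedMillis(threadIds);
                long blocked = 0;
                for (int i = 0; i < threadIds.length; i++) blocked += after[i] - before[i];
                return (double) blocked / (intervalMillis * threadIds.length);
            }

            private long[] blockedMillis(long[] ids) {
                ThreadInfo[] infos = threads.getThreadInfo(ids);
                long[] result = new long[ids.length];
                for (int i = 0; i < infos.length; i++)
                    result[i] = infos[i] == null ? 0 : infos[i].getBlockedTime();
                return result;
            }
        }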

    Aspect structure of compilers

    Compilers are among the most widely studied pieces of software, and modularizing these valuable artifacts is a recurring theme in research. However, the modularization of cross-cutting concerns in compilers is not yet well explored. Even today, the implementation of one compiler concern scatters across, and tangles with, the implementation of several other concerns, leading to a mismatch between different compiler modules and the operations they represent. Essentially, current compiler implementations fail to explicitly identify the control dependencies of different phases and to separately characterize the actions to execute during those phases. As a result, information about their program-execution path remains non-intuitive: it stays hidden within the program structure and cuts across several phase implementations. Consequently, this makes compiler designs and artifacts difficult to comprehend, maintain, and reuse. Such limitations occur primarily because mainstream object-oriented languages, such as Java, cannot organize cross-cutting concerns into clean modular units. This thesis demonstrates how such modularity issues in compilers can be addressed with the help of a relatively new yet powerful programming paradigm called aspect-oriented programming.
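    To make the idea concrete: a cross-cutting concern such as phase tracing can be factored out of every compiler phase into a single module. In AspectJ this would be an aspect whose before/after advice applies at a pointcut matching all phase executions; the plain-Java sketch below approximates such advice with a JDK dynamic proxy, and the CompilerPhase interface is a hypothetical example rather than anything from the thesis.

        import java.lang.reflect.InvocationHandler;
        import java.lang.reflect.Method;
        import java.lang.reflect.Proxy;

        interface CompilerPhase {
            void run();
        }

        final class PhaseTracing {
            // Wrap any phase so that entry/exit tracing lives in one module.
            static CompilerPhase traced(CompilerPhase phase, String name) {
                InvocationHandler advice = (proxy, method, args) -> {
                    System.out.println("enter phase: " + name); // before-advice
                    try {
                        return method.invoke(phase, args);
                    } finally {
                        System.out.println("exit phase: " + name); // after-advice
                    }
                };
                return (CompilerPhase) Proxy.newProxyInstance(
                        CompilerPhase.class.getClassLoader(),
                        new Class<?>[] { CompilerPhase.class }, advice);
            }

            public static void main(String[] args) {
                CompilerPhase parse = traced(() -> System.out.println("parsing..."), "parse");
                parse.run();
            }
        }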