
    Statistical approach for memory leak detection in Java applications (Statistiline lähenemine mälulekete tuvastamiseks Java rakendustes)

    Modern managed runtime environments and programming languages greatly simplify the creation and maintenance of applications. One of the best-known examples of such an environment and language is the Java Virtual Machine and the Java programming language. One important task of a managed runtime environment is automatic memory management. Despite the built-in garbage collector, the memory leak problem is still relevant in Java: a leak wastes memory by preventing unused objects from being removed. Memory leaks are especially critical for applications that are expected to run uninterrupted around the clock, as running out of memory is one of the few errors that may terminate the whole Java application. The best indicator of whether an object is still in use is the time of its last access. However, the main disadvantage of this metric is the performance overhead it incurs. This thesis studies the memory leak problem in Java and proposes a novel approach to memory leak detection and diagnosis. It proposes an alternative way to estimate the 'unusedness' of objects. The main hypothesis is that leaked objects can be identified by applying statistical methods to object lifetimes, observing the ages of the population of objects grouped by their allocation points. The proposed solution is much cheaper performance-wise, because information about each object needs to be recorded only at the time of its creation. The research conducted for the thesis is used in the memory leak detection tool Plumbr, which is currently used successfully in various production environments. After the introductory chapters, the thesis reviews existing solutions and proposes a classification of memory leak detection approaches. Next, the statistical approach to memory leak detection is described, along with the main metric used to distinguish leaking objects from non-leaking ones.
An analysis of this single metric and its shortcomings follows. Based on this analysis, additional metrics are defined and machine learning algorithms are applied to statistical data that Plumbr collected from real applications and production environments. Finally, case studies of real applications and a comparison with one previous memory leak detection solution are presented in order to evaluate the performance overhead of the tool.
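The hypothesis of grouping live objects by allocation point and looking at their ages can be illustrated with a minimal sketch. It is not Plumbr's algorithm or the thesis's actual metric. The input format, the notion of a "creation generation" (for example, the number of garbage collections completed when the object was allocated), the function names, and the fixed threshold are all assumptions made for illustration. The idea is that a leaking allocation site tends to keep live objects created across many different generations, while a healthy site keeps only recent or only very old ones.

```python
from collections import defaultdict


def generation_counts(live_objects):
    """Count distinct creation generations per allocation site.

    live_objects: iterable of (allocation_site, creation_generation) pairs
    describing objects that are still reachable. Both fields are
    hypothetical inputs; only creation-time data is needed per object.
    """
    generations = defaultdict(set)
    for site, generation in live_objects:
        generations[site].add(generation)
    return {site: len(gens) for site, gens in generations.items()}


def suspect_sites(live_objects, threshold):
    """Return allocation sites whose live objects span >= threshold generations."""
    counts = generation_counts(live_objects)
    return sorted(site for site, n in counts.items() if n >= threshold)


# Hypothetical snapshot of live objects after ten collection cycles.
live = (
    [("Cache.put", g) for g in range(10)]   # survivors from every cycle
    + [("Request.<init>", 9)] * 2           # short-lived, only the newest cycle
    + [("Config.load", 0)]                  # long-lived but created once
)
print(suspect_sites(live, threshold=5))
```

A fixed threshold is the simplest possible decision rule; the abstract indicates that the thesis refines its base metric with additional metrics and machine learning instead.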

    Implicit Smartphone User Authentication with Sensors and Contextual Machine Learning

    Authentication of smartphone users is important because a lot of sensitive data is stored on the smartphone, and the smartphone is also used to access various cloud data and services. However, smartphones are easily stolen or co-opted by an attacker. Beyond the initial login, it is highly desirable to re-authenticate end-users who continue to access security-critical services and data. Hence, this paper proposes a novel system for implicit, continuous authentication of the smartphone user based on behavioral characteristics, leveraging the sensors already built into smartphones. We propose novel context-based authentication models to differentiate the legitimate smartphone owner from other users. We systematically show how to achieve high authentication accuracy with different design alternatives in sensor and feature selection, machine learning techniques, context detection and multiple devices. Our system achieves 98.1% authentication accuracy with negligible system overhead and less than 2.4% battery consumption.
    Comment: Published in the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) 2017. arXiv admin note: substantial text overlap with arXiv:1703.0352

    Improving efficiency and resilience in large-scale computing systems through analytics and data-driven management

    Applications running in large-scale computing systems such as high performance computing (HPC) or cloud data centers are essential to many aspects of modern society, from weather forecasting to financial services. As the number and size of data centers increase with the growing computing demand, scalable and efficient management becomes crucial. However, data center management is a challenging task due to the complex interactions between applications, middleware, and hardware layers such as processors, network, and cooling units. This thesis claims that to improve robustness and efficiency of large-scale computing systems, significantly higher levels of automated support than what is available in today's systems are needed, and this automation should leverage the data continuously collected from various system layers. Towards this claim, we propose novel methodologies to automatically diagnose the root causes of performance and configuration problems and to improve efficiency through data-driven system management. We first propose a framework to diagnose software and hardware anomalies that cause undesired performance variations in large-scale computing systems. We show that by training machine learning models on resource usage and performance data collected from servers, our approach successfully diagnoses 98% of the injected anomalies at runtime in real-world HPC clusters with negligible computational overhead. We then introduce an analytics framework to address another major source of performance anomalies in cloud data centers: software misconfigurations. Our framework discovers and extracts configuration information from cloud instances such as containers or virtual machines. This is the first framework to provide comprehensive visibility into software configurations in multi-tenant cloud platforms, enabling systematic analysis for validating the correctness of software configurations. 
This thesis also contributes to the design of robust and efficient system management methods that leverage continuously monitored resource usage data. To improve performance under power constraints, we propose a workload- and cooling-aware power budgeting algorithm that distributes the available power among servers and cooling units in a data center, achieving up to 21% improvement in throughput per Watt compared to the state-of-the-art. Additionally, we design a network- and communication-aware HPC workload placement policy that reduces communication overhead by up to 30% in terms of hop-bytes compared to existing policies.
    2019-07-02
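The hop-bytes metric mentioned above is commonly defined as the sum, over all communicating task pairs, of the bytes exchanged multiplied by the number of network hops between the nodes the tasks are placed on. The sketch below computes it for two placements. The 2D mesh topology with Manhattan-distance routing, the task names, and the traffic volumes are assumptions for illustration, not details taken from the thesis.

```python
def hop_bytes(traffic, placement, hops):
    """Sum of bytes * hop distance over all communicating task pairs.

    traffic:   dict mapping (src_task, dst_task) -> bytes exchanged
    placement: dict mapping task -> node coordinate
    hops:      function returning the hop count between two nodes
    """
    return sum(
        nbytes * hops(placement[src], placement[dst])
        for (src, dst), nbytes in traffic.items()
    )


def mesh_hops(a, b):
    """Hop count on an assumed 2D mesh with Manhattan-distance routing."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Hypothetical communication pattern among three tasks.
traffic = {("t0", "t1"): 1000, ("t1", "t2"): 500, ("t0", "t2"): 10}

# Two candidate placements on the mesh.
scattered = {"t0": (0, 0), "t1": (3, 3), "t2": (0, 3)}
compact = {"t0": (0, 0), "t1": (0, 1), "t2": (1, 1)}

print(hop_bytes(traffic, scattered, mesh_hops))  # 7530
print(hop_bytes(traffic, compact, mesh_hops))    # 1520
```

Placing heavily communicating tasks on nearby nodes lowers the metric, which is the quantity a communication-aware placement policy tries to reduce.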