
    Profiling of parallel programs in a non-strict functional language

    Purely functional programming languages offer many benefits to parallel programming. The absence of side effects and the provision of higher-level abstractions ease the programming effort. In particular, non-strict functional languages allow further separation of concerns and provide more parallel facilities in the form of semi-implicit parallelism. On the other hand, because the low-level details of the execution are hidden, usually in a runtime system, debugging the performance of parallel applications becomes harder. Currently available parallel profiling tools allow programmers to obtain some information about the execution; however, this information is usually not detailed enough to precisely pinpoint the cause of some performance problems. Often, this is because the cost of obtaining that information would be prohibitive for a complete program execution. In this thesis, we design and implement a parallel profiling framework based on execution replay. This debugging technique makes it possible to simulate recorded executions of a program, ensuring that their behaviour remains unchanged. The novelty of our approach is to adapt this technique to the context of parallel profiling and to take advantage of the characteristics of non-strict purely functional semantics to guarantee minimal overhead in the recording process. Our work makes it possible to build more powerful profiling tools that do not affect the parallel behaviour of the program in a meaningful way. We demonstrate our claims through a series of benchmarks and the study of two use cases.
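
    Record/replay, the technique the abstract builds on, works by logging every non-deterministic decision (for example, which piece of parallel work a runtime scheduler picks next) during a normal run, and feeding the same decisions back during a later, instrumented run so that its parallel behaviour is unchanged. The sketch below is a generic, hedged illustration of that pattern in Java rather than in the thesis's non-strict functional setting; the DecisionSource interface and all class names are invented for the example.

    ```java
    import java.util.ArrayDeque;
    import java.util.Deque;

    // A source of non-deterministic decisions, e.g. "which parallel task runs next".
    interface DecisionSource {
        int nextDecision();
    }

    // Recording mode: take decisions from the live scheduler and log them.
    final class RecordingSource implements DecisionSource {
        private final DecisionSource live;
        final Deque<Integer> log = new ArrayDeque<>();

        RecordingSource(DecisionSource live) { this.live = live; }

        @Override
        public int nextDecision() {
            int d = live.nextDecision();
            log.addLast(d);          // a cheap append is all the recording run pays for
            return d;
        }
    }

    // Replay mode: ignore the live scheduler and feed back the recorded decisions,
    // so heavyweight profiling can be enabled without changing parallel behaviour.
    final class ReplaySource implements DecisionSource {
        private final Deque<Integer> log;

        ReplaySource(Deque<Integer> log) { this.log = new ArrayDeque<>(log); }

        @Override
        public int nextDecision() {
            return log.removeFirst();
        }
    }
    ```

    Roughly speaking, the abstract's argument is that in a purely functional, non-strict language very few such decisions exist, since pure evaluation is deterministic, which is why the recording phase can stay cheap.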

    Design of a network filing system


    Garbage collection in distributed systems

    PhD Thesis. The provision of system-wide heap storage has a number of advantages. However, when the technique is applied to distributed systems, automatically recovering inaccessible variables becomes a serious problem. This thesis presents a survey of such garbage collection techniques but finds that no existing algorithm is entirely suitable. A new, general-purpose algorithm is developed and presented which allows individual systems to garbage collect largely independently. The effects of these garbage collections are combined, using recursively structured control mechanisms, to achieve garbage collection of the entire heap with minimal overhead. Experimental results show that the new algorithm recovers most inaccessible variables more quickly than a straightforward garbage collection, giving improved memory utilisation.
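
    For readers unfamiliar with distributed garbage collection, the usual reason individual systems can collect independently is that each node treats locally rooted objects plus objects referenced from other nodes (tracked in some form of entry table) as live, and leaves cross-node garbage to a later combining phase. The Java sketch below shows only that generic idea, not the thesis's algorithm or its recursively structured control mechanisms; all names are invented.

    ```java
    import java.util.*;

    // Much-simplified picture of independent per-node collection: mark from local
    // roots plus an "entry table" of objects that other nodes may still reference,
    // then sweep, without stopping the rest of the system.
    final class Node {
        static final class Obj {
            final List<Obj> localRefs = new ArrayList<>();
            boolean marked;
        }

        final Set<Obj> heap = new HashSet<>();
        final Set<Obj> localRoots = new HashSet<>();
        final Set<Obj> entryTable = new HashSet<>();   // referenced from remote nodes

        void collectLocally() {
            for (Obj o : heap) o.marked = false;
            Deque<Obj> work = new ArrayDeque<>();
            work.addAll(localRoots);
            work.addAll(entryTable);                   // conservative: assume remote refs are live
            while (!work.isEmpty()) {
                Obj o = work.pop();
                if (o.marked) continue;
                o.marked = true;
                work.addAll(o.localRefs);
            }
            heap.removeIf(o -> !o.marked);             // sweep unreachable local objects
        }
    }
    ```

    Objects reachable only through the entry table cannot be reclaimed by such a purely local collection; reclaiming them is the kind of work that requires combining the effects of the per-node collections, as the abstract describes.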

    Performance analysis methods for understanding scaling bottlenecks in multi-threaded applications

    In this dissertation we propose three new methods for analysing the performance of multi-threaded programs. Our first method, criticality stacks, is useful for analysing imbalance between threads. To construct these stacks we propose a new criticality metric, which splits the execution time of an application into a part for each thread. The larger this part is for a thread, the more critical that thread is for the application. The second method, bottle graphs, represents each thread of a multi-threaded program as a rectangle in a graph. The height of the rectangle is computed using our criticality metric, and the width represents the parallelism of a thread. Rectangles near the top of the graph, in the neck of the bottle as it were, have limited parallelism, so we regard them as bottlenecks for the application. Our third method, speedup stacks, shows the achieved speedup of an application and the various components that limit speedup in a stacked bar graph. The intuition behind this concept is that by reducing the impact of a particular component, the speedup of an application increases proportionally to the size of that component in the stack.
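
    One plausible way to realize the criticality metric described above (a hedged reconstruction, not necessarily the exact published definition) is to cut the execution into intervals between scheduling or synchronisation events and to share each interval's duration equally among the threads running in it; a thread that frequently runs alone then accumulates a large share of the total time. The Interval representation and class names below are invented for the example.

    ```java
    import java.util.Arrays;

    // Sketch of a criticality-style metric: each interval's duration is shared
    // equally by the threads running in it, so threads that run with little
    // company accumulate a large share and are deemed critical.
    final class CriticalityStack {
        record Interval(double duration, int[] runningThreads) {}

        static double[] criticality(int threadCount, Interval[] intervals) {
            double[] share = new double[threadCount];
            for (Interval iv : intervals) {
                double part = iv.duration() / iv.runningThreads().length;
                for (int t : iv.runningThreads()) {
                    share[t] += part;
                }
            }
            return share;   // one stack component per thread; components sum to total time
        }

        public static void main(String[] args) {
            Interval[] trace = {
                new Interval(4.0, new int[]{0, 1}),   // both threads running
                new Interval(6.0, new int[]{0})       // thread 0 runs alone: more critical
            };
            System.out.println(Arrays.toString(criticality(2, trace)));  // [8.0, 2.0]
        }
    }
    ```

    In the small trace in main, thread 0 runs alone for 6 of the 10 time units and ends up with a criticality share of 8.0 against 2.0 for thread 1, so it would dominate the stack.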

    Understanding the performance of interactive applications

    Many, if not most, computer systems are used by human users, and the performance of such interactive systems ultimately affects those users. Thus, when measuring, understanding, and improving system performance, it makes sense to consider the human user's perspective. Essentially, the performance of interactive applications is determined by the perceptible lag in handling user requests. So, when characterizing the runtime behaviour of an interactive application, we need a new approach that focuses on perceptible lags rather than on overall performance characteristics. Such a characterization should enable a new way to profile and improve the performance of interactive applications: an approach that seeks out perceptible lags and then investigates their causes, so that performance analysts can optimize the responsible parts of the software and eliminate the perceptible lag. Unfortunately, existing profiling approaches either incur significant overhead that makes them impractical for an interactive scenario, or they lack the ability to provide insight into the causes of long latencies. An effective approach for interactive applications has to fulfill several requirements, such as an accurate view of the causes of performance problems and insignificant perturbation of the interactive application. We propose a new profiling approach that helps developers understand and improve the perceptible performance of interactive applications and satisfies the above needs.
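
    A concrete, if simplified, way to think about perceptible lag is to time each user-request handler and keep only the episodes that exceed a human perceptibility threshold (a value around 100 ms is often used as a rule of thumb). The Java sketch below illustrates that idea generically; it is not the profiler proposed in the thesis, and the names and threshold are assumptions.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    // Latency-centric monitoring: wrap a user-event handler, time each invocation,
    // and record only episodes long enough to be perceptible to the user.
    final class LagMonitor {
        static final long PERCEPTIBLE_NANOS = 100_000_000L;   // ~100 ms, an assumed threshold

        record LagEpisode(String event, long nanos) {}

        private final List<LagEpisode> episodes = new ArrayList<>();

        void handle(String eventName, Runnable handler) {
            long start = System.nanoTime();
            handler.run();
            long elapsed = System.nanoTime() - start;
            if (elapsed >= PERCEPTIBLE_NANOS) {
                // Only perceptible lags are kept, so the monitor itself stays cheap
                // and heavier analysis can focus on these episodes.
                episodes.add(new LagEpisode(eventName, elapsed));
            }
        }

        List<LagEpisode> perceptibleLags() { return List.copyOf(episodes); }
    }
    ```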

    A statistical approach to detecting memory leaks in Java applications

    Modern managed runtime environments and programming languages greatly simplify the creation and maintenance of applications. One of the best examples of such a managed runtime environment and language is the Java Virtual Machine and the Java programming language. Despite the built-in garbage collector, the memory leak problem is still relevant in Java and means wasting memory by preventing unused objects from being removed. The problem of memory leaks is especially critical for applications that are expected to run uninterrupted around the clock, as running out of memory is one of the few failures that can terminate the whole Java application. The best indicator of whether an object is in use is the time of its last access; however, the main disadvantage of this metric is the performance overhead it incurs. This thesis studies the memory leak problem and proposes a novel approach for memory leak detection and diagnosis, based on an alternative way of estimating the 'unusedness' of objects. The main hypothesis is that leaked objects can be identified by applying statistical methods to the lifetimes of objects, observing the ages of the population of objects grouped by their allocation points. The proposed solution is much more efficient performance-wise, since information about an object only needs to be recorded once, at the time of its creation.
    The research conducted for the thesis has been incorporated into the memory leak detection tool Plumbr. After the introduction and an overview of the state of the art, the thesis reviews existing solutions and proposes a classification of memory leak detection approaches. Next, the statistical approach for memory leak detection is described, along with the main metric used to distinguish leaking objects from non-leaking ones, followed by an analysis of this single metric. Based on this analysis, additional metrics are designed, and machine learning algorithms are applied to statistical data acquired by the Plumbr tool from real production environments. Finally, case studies of real applications and a comparison with one existing memory leak detection solution are performed in order to evaluate the performance overhead of the tool.
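
    The hypothesis above, grouping live objects by allocation point and looking at the spread of their ages, can be pictured with a toy sketch: a site whose surviving objects cover many distinct age buckets keeps accumulating old objects and looks leak-like, while a site whose live objects are all young does not. This is an illustrative reconstruction only, not Plumbr's actual metric; all names are invented.

    ```java
    import java.util.*;

    // Toy illustration of the statistical idea: track live objects per allocation
    // site and look at how widely their ages are spread. A site whose surviving
    // objects cover many distinct age buckets is a leak suspect.
    final class AllocationSiteStats {
        // allocation site -> allocation timestamps of objects still alive
        private final Map<String, List<Long>> liveAllocations = new HashMap<>();

        void recordAllocation(String site, long nowMillis) {
            liveAllocations.computeIfAbsent(site, s -> new ArrayList<>()).add(nowMillis);
        }

        void recordCollection(String site, long allocatedAtMillis) {
            List<Long> list = liveAllocations.get(site);
            if (list != null) list.remove(allocatedAtMillis);   // object reclaimed: drop it
        }

        // Number of distinct age buckets (here: hours) covered by live objects.
        // Many buckets means objects from long ago are still alive alongside new ones.
        long distinctAgeBuckets(String site, long nowMillis) {
            return liveAllocations.getOrDefault(site, List.of()).stream()
                    .map(t -> (nowMillis - t) / 3_600_000L)
                    .distinct()
                    .count();
        }

        List<String> leakSuspects(long nowMillis, long bucketThreshold) {
            List<String> suspects = new ArrayList<>();
            for (String site : liveAllocations.keySet()) {
                if (distinctAgeBuckets(site, nowMillis) >= bucketThreshold) suspects.add(site);
            }
            return suspects;
        }
    }
    ```

    Note that, as in the thesis's argument, only the allocation event needs to be recorded per object; the age statistics are derived later from what is still alive.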

    A VLSI architecture for enhancing software reliability

    As a solution to the software crisis, we propose an architecture that supports and encourages the use of programming techniques and mechanisms for enhancing software reliability. The proposed architecture provides efficient mechanisms for detecting a wide variety of run-time errors and for supporting data abstraction and module-based programming, and it encourages the use of small protection domains through a highly efficient capability mechanism. It also provides efficient support for user-specified exception handlers and for both event-driven and trace-driven debugging mechanisms. The shortcomings of existing capability-based architectures that were designed with a similar goal in mind are examined critically to identify their problems with regard to capability translation, domain switching, storage management, data abstraction and interprocess communication. Assuming realistic VLSI implementation constraints, an instruction set for the proposed architecture is designed. Performance estimates of the proposed system are then made from the microprograms corresponding to these instructions, based on the observed characteristics of similar systems and language usage. A comparison of the proposed architecture with similar ones, in terms of both functional characteristics and low-level performance, indicates that the proposed design is superior.
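
    As background for the capability mechanism the abstract centres on: a capability bundles a reference to an object or segment with the rights and bounds its holder is permitted, and every access is checked against it, which is what makes small protection domains cheap to enforce. The Java sketch below is a software caricature of such a check, not the proposed instruction set; all names are invented.

    ```java
    // Software caricature of a hardware capability check: a capability bundles an
    // object reference with access rights and bounds, and every access is checked.
    final class Capability {
        static final int READ = 1, WRITE = 2;

        private final int[] segment;     // the protected object/segment
        private final int rights;        // permitted operations
        private final int base, length;  // permitted bounds within the segment

        Capability(int[] segment, int rights, int base, int length) {
            this.segment = segment; this.rights = rights; this.base = base; this.length = length;
        }

        int read(int offset) {
            check(READ, offset);
            return segment[base + offset];
        }

        void write(int offset, int value) {
            check(WRITE, offset);
            segment[base + offset] = value;
        }

        private void check(int right, int offset) {
            // A run-time error is raised as soon as rights or bounds are violated.
            if ((rights & right) == 0) throw new SecurityException("right not granted");
            if (offset < 0 || offset >= length) throw new IndexOutOfBoundsException("out of bounds");
        }

        // Derive a weaker capability (fewer rights, narrower bounds) to hand to a
        // smaller protection domain; a capability can never be amplified this way.
        Capability restrict(int newRights, int newBase, int newLength) {
            if ((newRights & ~rights) != 0 || newBase < 0 || newBase + newLength > length)
                throw new IllegalArgumentException("cannot amplify a capability");
            return new Capability(segment, newRights, base + newBase, newLength);
        }
    }
    ```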