10 research outputs found

    Construction and Evaluation of Coordinated Performance Skeletons

    Performance prediction is particularly challenging for dynamic foreign environments that cannot be modeled well, such as those involving resource sharing or foreign system components. Our approach is based on the concept of a performance skeleton, which is a short-running program whose execution time in any scenario reflects the estimated execution time of the application it represents. The fundamental technical challenge is automatic construction of performance skeletons for parallel MPI programs. The steps are 1) generation of process execution traces and conversion to a single coordinated logical program trace, 2) compression of the logical program trace, and 3) conversion to an executable parallel skeleton program. Results are presented to validate the construction methodology and prediction power of performance skeletons. The execution scenarios analyzed involve network sharing, different architectures and different MPI libraries. The emphasis is on identifying the strengths and limitations of this approach to performanc

    Loop-based Modeling of Parallel Communication Traces

    This paper describes an algorithm that takes a trace of a distributed program and builds a model of all communications of the program. The model is a set of nested loops representing repeated patterns. Loop bodies collect events representing communication actions performed by the various processes, such as sending or receiving messages and participating in collective operations. The model can be used for compact visualization of full executions, for program understanding and debugging, and also for building statistical analyses of various quantitative aspects of the program's behavior. The construction of the communication model is performed in two phases. First, a local model is built for each process, capturing local regularities; this phase is incremental and fast, and can be done on-line, during the execution. The second phase is a reduction process that collects, aligns, and finally merges all local models into a global, system-wide model. This global model is a compact representation of all communications of the original program, capturing patterns across groups of processes. It can be visualized directly and, because it takes the form of a sequence of loop nests, can be used to replay the original program's communication actions. Because the model is based on communication events only, it completely ignores other quantitative aspects such as timestamps or message sizes. Including such data would in most cases break regularities, reducing the usefulness of trace-based modeling. Instead, the paper shows how one can efficiently access quantitative data kept in the original trace(s), by annotating the model and compiling data scanners automatically.
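
The local phase described above (capturing regularities in one process's event stream) can be illustrated with a much simpler stand-in. The sketch below is not the paper's algorithm: it greedily folds immediately repeated blocks into single-level `(count, body)` loops, and the names `fold_repeats`, `expand`, and `max_period` are assumptions made for this example. Events are assumed to be plain strings.

```python
def fold_repeats(events, max_period=8):
    """Greedily fold immediately repeated blocks into (count, body) loops.

    Illustrative only: single-level loops, no nesting, no cross-process merge.
    """
    out = []
    i, n = 0, len(events)
    while i < n:
        best = None  # (covered_length, period, count) of the best repeat at i
        for period in range(1, min(max_period, (n - i) // 2) + 1):
            body = events[i:i + period]
            count = 1
            # Count how many consecutive copies of `body` start at position i.
            while events[i + count * period:i + (count + 1) * period] == body:
                count += 1
            if count > 1 and (best is None or count * period > best[0]):
                best = (count * period, period, count)
        if best is None:
            out.append(events[i])  # no repeat here: keep the event as-is
            i += 1
        else:
            covered, period, count = best
            out.append((count, events[i:i + period]))
            i += covered
    return out


def expand(model):
    """Replay a folded model back into the flat event sequence."""
    flat = []
    for item in model:
        if isinstance(item, tuple):
            count, body = item
            flat.extend(body * count)
        else:
            flat.append(item)
    return flat


trace = ["init", "send", "recv", "send", "recv", "send", "recv", "barrier"]
model = fold_repeats(trace)
```

Because `expand(fold_repeats(trace))` reproduces the input, the folded form can stand in for the flat trace when replaying communication actions, which is the property the abstract attributes to its loop-nest model.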

    Techniques To Facilitate the Understanding of Inter-process Communication Traces

    High Performance Computing (HPC) systems play an important role in today’s heavily digitized world, which is in constant demand for higher speed of calculation and performance. HPC applications are used in multiple domains such as telecommunication, health, scientific research, and more. With the emergence of multi-core and cloud computing platforms, the HPC paradigm is quickly becoming the design of choice of many service providers. HPC systems are also known to be complex to debug and analyze due to the large number of processes they involve and the way these processes communicate with each other to perform specific tasks. As a result, software engineers must spend an extensive amount of time understanding the complex interactions among a system’s processes. This is usually done through the analysis of execution traces generated from running the system at hand. Traces, however, are very difficult to work with because of their overwhelming size. The objective of this research is to present a set of techniques that facilitate the understanding of the behaviour of HPC applications through the analysis of system traces. The first technique consists of building an exchange format called MTF (MPI Trace Format) for representing and exchanging traces generated from HPC applications based on the MPI (Message Passing Interface) standard, which is a de facto standard for inter-process communication in high performance computing systems. The design of MTF is validated against well-known requirements for a standard exchange format. The second technique aims to facilitate the understanding of large traces of inter-process communication by automatically extracting communication patterns that characterize their main behaviour. Two algorithms are presented. The first one permits the recognition of repeating patterns in traces of MPI applications, whereas the second algorithm searches for occurrences of a given communication pattern in a trace. Both algorithms are based on the n-gram extraction technique used in natural language processing. Finally, we developed a technique to abstract MPI traces by detecting the different execution phases in a program based on concepts from information theory. Using this approach, software engineers can examine the trace as a sequence of high-level computational phases instead of a mere flow of low-level events. The techniques presented in this thesis have been tested on traces generated from real HPC programs. The results from several case studies demonstrate the usefulness and effectiveness of our techniques.
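
As a rough illustration of the n-gram idea mentioned above, the sketch below counts fixed-length windows of events to surface repeating patterns, and separately scans for a given pattern. It is a minimal sketch under assumptions, not the thesis's algorithms: the function names `frequent_ngrams` and `find_pattern`, the `min_count` threshold, and the representation of a trace as a list of MPI call names are all chosen for this example.

```python
from collections import Counter


def frequent_ngrams(trace, n, min_count=2):
    """Return every length-n window that occurs at least min_count times."""
    grams = Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))
    return {gram: count for gram, count in grams.items() if count >= min_count}


def find_pattern(trace, pattern):
    """Return the start indices at which `pattern` occurs in `trace`."""
    k = len(pattern)
    target = list(pattern)
    return [i for i in range(len(trace) - k + 1) if list(trace[i:i + k]) == target]


trace = ["MPI_Send", "MPI_Recv", "MPI_Send", "MPI_Recv", "MPI_Barrier"]
repeats = frequent_ngrams(trace, 2)
hits = find_pattern(trace, ["MPI_Send", "MPI_Recv"])
```

Windows are counted with overlap, so a pattern such as `a a a` contributes two occurrences of the bigram `(a, a)`; a real pattern detector would need a policy for that case.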

    Automatic Energy Saving Schemes for Parallel Applications

    Although high-performance computing traditionally focuses on the efficient execution of large-scale applications, both energy and power have become critical concerns when approaching exascale. Drastic increases in the power consumption of supercomputers significantly affect their operating costs and failure rates. In modern microprocessor architectures, equipped with dynamic voltage and frequency scaling (DVFS) and CPU clock modulation (throttling), the power consumption may be controlled in software. Additionally, network interconnects, such as InfiniBand, may be exploited to maximize energy savings, while the application performance loss and frequency switching overheads must be carefully balanced. This work first studies two important collective communication operations, all-to-all and allgather, and proposes energy saving strategies on a per-call basis. Next, it targets point-to-point communications, grouping them into phases and applying frequency scaling to save energy by exploiting architectural and communication stalls. Finally, it proposes an automatic runtime system that combines both collective and point-to-point communications into phases and applies throttling to them in addition to DVFS to maximize energy savings. Experimental results are presented for NAS parallel benchmark problems as well as for realistic parallel electronic structure calculations performed by the widely used quantum chemistry package GAMESS. Close to the maximum energy savings were obtained with a substantially low performance loss on the given platform.

    Cartographies of Copyright: Crisis & Propertization

    Cartographies of Copyright is a cultural history of copyright that maps out various contradictions and tensions that give shape to the crisis of copyright and its relations to US music industries. More specifically, this work charts the radical dissension of copyright in recent history and argues for an understanding of the crisis as an internal transformative process. This formulation shifts the analytic approach from an abstract conceptual-legal perspective to a series of discrete points within a lived history, culture and materiality of copyright thought, audio technologies and neoliberal capitalism. Seen in terms of territorialization, the expansion of copyright via the music industries gives unique insight into the particular ways music and its media formats affect the evolution of copyright culture and law. When framed this way, an investigation into music and copyright leads to recognizing new forms of control, changing modes of administering access and contemporary relations of power. To chart the effects of these transformations I draw upon the tensions and problematics that constitute the Wu-Tang Clan’s 2015 one-of-a-kind release: Once Upon A Time In Shaolin. After accessing contentious foci within traditional copyright paradigms, I argue that classical models lack explanatory power, pose problems for understanding the transformations of copyright throughout history and fail to provide a comprehensive account of the present crisis. Taking inspiration from the reoccupation thesis presented by Hans Blumenberg and recent research in copyright, I propose an alternative model rooted at the dialectic crux of property metaphors, audio technologies and formats, and neocapitalist commodity logic, all of which give shape to an internal transformation within copyright law and culture I term propertization. Cartographies of Copyright is not a legal treatise, per se; rather, the work sees law as text set within a larger social milieu liable to change and evolution. Drawing from legal theory, cultural studies and media studies, the work engages copyright from the perspectives of critical history, rhetoric, media-materiality and speculative economics. The multidisciplinary approach also stakes out new formations to the problematics faced by music historically and draws connections between the music industries and the broader social contexts they are situated in. Cartographies argues for a new lexicon to begin imagining alternative models to account for copyright’s transformations, ones better suited to imagining viable alternatives. It calls attention to the urgent need of public discussion concerning the role of copyright in contemporary music and culture and provides modest suggestions for thinking

    Theology of the Pain of God: An Analysis and Evaluation of Kazoh Kitamori's (1916- ) Work in Japanese Protestantism

    The overall negative appraisal of Kitamori must be considered as a critical reaction to various aspects of his own theology. Perhaps not one or two aspects of his theology alone are responsible for this. The reason for the negative responses is surely of a composite nature. However, it is not the main concern of this study to find answers individually to the questions raised above. The intention of this study is to analyze and assess Kitamori's theology as a whole. But to try to find the answers to the questions is useful for the purpose here: they provide methodological clues.

    Neurath Reconsidered: New Sources and Perspectives


    Logicalization of Communication Traces from Parallel Execution

    Communication traces are integral to performance modeling and analysis of parallel programs. However, execution on a large number of nodes results in a large trace volume that is cumbersome and expensive to analyze. This paper presents an automatic framework to convert all process traces corresponding to the parallel execution of an SPMD MPI program into a single logical trace. First, the application communication matrix is generated from process traces. Next, topology identification is performed based on the underlying communication structure and independent of the way ranks (or numbers) are assigned to processes. Finally, message exchanges between physical processes are converted into logical message exchanges that represent similar message exchanges across all processes, resulting in a trace volume reduction approximately equal to the number of processes executing the application. This logicalization framework has been implemented and the results report on its performance and effectiveness.
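
The first and last steps described above can be sketched for a toy case. This is only an illustration under strong assumptions, not the paper's framework: it assumes each process trace is a list of `(destination_rank, bytes)` send records, skips topology identification entirely by assuming a 1D ring with ranks already in ring order, and uses the hypothetical names `communication_matrix` and `logical_trace`.

```python
def communication_matrix(traces, nprocs):
    """Build matrix[src][dst] = total bytes sent from src to dst."""
    matrix = [[0] * nprocs for _ in range(nprocs)]
    for src, events in traces.items():
        for dst, nbytes in events:
            matrix[src][dst] += nbytes
    return matrix


def logical_trace(traces, nprocs):
    """Rewrite destinations as ring offsets relative to the sender.

    Returns the single shared trace if every rank then looks identical,
    or None if the ranks do not collapse (assumed ring topology only).
    """
    relative = {
        rank: tuple(((dst - rank) % nprocs, nbytes) for dst, nbytes in events)
        for rank, events in traces.items()
    }
    distinct = set(relative.values())
    return list(distinct.pop()) if len(distinct) == 1 else None


# Hypothetical 4-process run: each rank sends 100 bytes to both ring neighbours.
nprocs = 4
traces = {r: [((r + 1) % nprocs, 100), ((r - 1) % nprocs, 100)] for r in range(nprocs)}
matrix = communication_matrix(traces, nprocs)
logical = logical_trace(traces, nprocs)
```

In this toy case four per-process traces collapse into one logical trace, which mirrors the abstract's claim of a reduction roughly equal to the number of processes.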