16 research outputs found

    Automatic Adaption of the Sampling Frequency for Detailed Performance Analysis

    Get PDF
    One of the most urgent challenges in event-based performance analysis is the enormous amount of collected data. Combining event tracing and periodic sampling has been a successful approach: MPI communication is recorded in detail with events, while the remaining application is covered coarsely by periodic samples. In this paper, we present a novel approach to automatically adapt the sampling frequency at runtime to a given amount of buffer space, relieving users of the need to find an appropriate sampling frequency themselves. This way, the entire measurement can be kept within a single memory buffer, which avoids disruptive intermediate memory buffer flushes, excessive data volumes, and measurement delays due to slow file system interaction. We describe our approach to sort and store samples, based on their order of occurrence, in a hierarchical array based on powers of two. Furthermore, we evaluate the feasibility as well as the overhead of the approach with the prototype implementation OTFX, based on the Open Trace Format 2, a state-of-the-art open-source event trace library used by the performance analysis tools Vampir, Scalasca, and TAU. This work is supported by the Spanish Ministry of Economy and Competitiveness under contract TIN2015-65316-P.
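
    As a rough illustration of the power-of-two idea mentioned above (a sketch only, not the OTFX implementation; all names are invented for this example), samples can be bucketed by the number of trailing zero bits of their sequence number, so that discarding the lowest bucket removes every other remaining sample and halves the effective sampling frequency whenever a fixed-size buffer would otherwise overflow:

```python
# Sketch only, not the OTFX implementation. Samples are bucketed by the number
# of trailing zero bits of their sequence number; dropping the lowest bucket
# removes every other remaining sample, i.e. halves the effective frequency.

class PowerOfTwoSampleBuffer:
    def __init__(self, capacity, num_levels=16):
        self.capacity = capacity                       # fixed number of samples kept in memory
        self.levels = [[] for _ in range(num_levels)]  # levels[k]: samples whose seq has k trailing zero bits
        self.min_level = 0                             # levels below this have been discarded
        self.seq = 1                                   # running sample sequence number
        self.count = 0

    def _level_of(self, seq):
        trailing_zeros = (seq & -seq).bit_length() - 1
        return min(trailing_zeros, len(self.levels) - 1)

    def add(self, sample):
        level = self._level_of(self.seq)
        if level >= self.min_level:                    # otherwise the sample is below the current effective frequency
            self.levels[level].append((self.seq, sample))
            self.count += 1
        self.seq += 1
        while self.count > self.capacity and self.min_level < len(self.levels) - 1:
            # buffer would overflow: drop the finest level, halving the sampling frequency
            self.count -= len(self.levels[self.min_level])
            self.levels[self.min_level] = []
            self.min_level += 1

    def samples_in_order(self):
        # merge the surviving levels back into order of occurrence
        return sorted(s for lvl in self.levels[self.min_level:] for s in lvl)
```

    Repeatedly dropping the finest remaining level keeps the measurement inside the fixed buffer while preserving an evenly spaced, coarser subset of samples.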

    Technique utilisation and efficiency in competitive Brazilian Jiu-Jitsu matches at white and blue belts

    Get PDF
    Despite its increasing popularity, little is known about Brazilian Jiu-Jitsu and what makes a successful fighter. This work aims to start answering questions about the most used and most successful techniques, to allow the development of coaching methods for enhancing performance at lower belt levels. One hundred and forty tournament fights were analysed. The most common takedown was the guard pull, with 94% success. Significantly more single-leg takedowns were attempted by blue belts (BBs) than by white belts (WBs) (p = 0.013); however, there was no significant difference in success (p = 0.150). WBs used three main guard passes, with the knee slice, knee pin and bullfighter covering 93%. A greater variety of passes was witnessed at BB, with 71% coming from these three passes. The four most commonly attempted guard sweeps were the scissor sweep, back take, X-guard sweep and sit-up sweep, all experiencing varying levels of success: 55% for the scissor sweep, 60% for the back take, 63% for the X-guard sweep and 38% for the sit-up sweep. Of all the submissions attempted, 34% were arm bars, 21% triangles and 12% cross collar chokes, but there was almost an inverse relationship between use and success, with the least-used submissions having higher success rates, demonstrating that variety in submissions could lead to a greater chance of success.

    Concepts for In-memory Event Tracing: Runtime Event Reduction with Hierarchical Memory Buffers

    Get PDF
    This thesis contributes to the field of performance analysis in High Performance Computing with new concepts for in-memory event tracing. Event tracing records runtime events of an application and stores each with a precise time stamp and further relevant metrics. The high resolution and detailed information allow an in-depth analysis of the dynamic program behavior, interactions in parallel applications, and potential performance issues. For long-running and large-scale parallel applications, event-based tracing faces three as yet unsolved challenges: the number of resulting trace files limits scalability, the huge amounts of collected data overwhelm file systems and analysis capabilities, and measurement bias, in particular due to intermediate memory buffer flushes, prevents a correct analysis. This thesis proposes concepts for an in-memory event tracing workflow. These concepts include new enhanced encoding techniques to increase memory efficiency and novel strategies for runtime event reduction to dynamically adapt the trace size during runtime. An in-memory event tracing workflow based on these concepts meets all three challenges: First, it not only overcomes the scalability limitations due to the number of resulting trace files but eliminates the overhead of file system interaction altogether. Second, the enhanced encoding techniques and event reduction lead to remarkably smaller trace sizes. Finally, an in-memory event tracing workflow completely avoids intermediate memory buffer flushes, which minimizes measurement bias and allows a meaningful performance analysis. The concepts further include the Hierarchical Memory Buffer data structure, which incorporates a multi-dimensional, hierarchical ordering of events by common metrics, such as time stamp, calling context, event class, and function call duration. This hierarchical ordering allows low-overhead event encoding, event reduction and event filtering, as well as new hierarchy-aided analysis requests. An experimental evaluation based on real-life applications and a detailed case study underline the capabilities of the concepts presented in this thesis. The new enhanced encoding techniques reduce memory allocation during runtime by a factor of 3.3 to 7.2, while introducing no additional overhead. Furthermore, the combined concepts, including the enhanced encoding techniques, event reduction, and a new filter based on function duration within the Hierarchical Memory Buffer, reduce the resulting trace size by up to three orders of magnitude and keep an entire measurement within a single fixed-size memory buffer, while still providing a coarse but meaningful analysis of the application. This thesis includes a discussion of the state of the art and related work, a detailed presentation of the enhanced encoding techniques, the event reduction strategies, and the Hierarchical Memory Buffer data structure, as well as an extensive experimental evaluation of all concepts.
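
    The duration-based filtering idea can be made concrete with a small sketch (assumed names and bin granularity; not the thesis' Hierarchical Memory Buffer code): completed function calls are binned by the power of two of their duration, and the bins holding the shortest calls are discarded first when a fixed-size buffer fills up:

```python
# Illustrative sketch (assumed names and granularity, not the thesis' code):
# completed function calls are binned by floor(log2(duration)); when the buffer
# is full, the bins holding the shortest calls are discarded first.

import math

class DurationBinnedBuffer:
    def __init__(self, capacity, num_bins=32):
        self.capacity = capacity
        self.bins = [[] for _ in range(num_bins)]  # bins[k]: calls with 2**k <= duration < 2**(k+1)
        self.min_bin = 0                           # bins below this threshold are filtered out
        self.count = 0

    def record_call(self, name, enter_ts, leave_ts):
        duration = leave_ts - enter_ts
        k = 0 if duration < 1 else min(int(math.log2(duration)), len(self.bins) - 1)
        if k < self.min_bin:
            return                                 # shorter than the current filter threshold
        self.bins[k].append((enter_ts, name, duration))
        self.count += 1
        while self.count > self.capacity and self.min_bin < len(self.bins) - 1:
            self.count -= len(self.bins[self.min_bin])
            self.bins[self.min_bin] = []           # drop the shortest calls first
            self.min_bin += 1

    def events_by_time(self):
        # surviving calls, merged back into chronological order
        return sorted(call for b in self.bins[self.min_bin:] for call in b)
```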

    ZIH-Info

    Get PDF
    - Shutdown/migration of the mailing list server RKS15 - Shutdown of the mail relay server - Commissioning of a data gateway - Data analytics for Industrie 4.0 - ZIH colloquium - Long Night of Sciences 2017 - ZIH at ISC'17 - ZIH publications - Events

    Evidence-enabled verification for the Linux kernel

    Get PDF
    Formal verification of large software has been an elusive target, riddled with problems of low accuracy and high computational complexity. With growing dependence on software in embedded and cyber-physical systems, where vulnerabilities and malware can lead to disasters, efficient and accurate verification has become a crucial need. The verification should be rigorous, computationally efficient, and automated enough to keep the human effort within reasonable limits, but it does not have to be completely automated. The automation should actually enable and simplify human cross-checking, which is especially important when the stakes are high. Unfortunately, formal verification methods work mostly as automated black boxes with very little support for cross-checking. This thesis is about a different way to approach the software verification problem. It is about creating a powerful fusion of automation and human intelligence by incorporating algorithmic innovations to address the major challenges and advance the state of the art for accurate and scalable software verification where complete automation has remained intractable. The key is a mathematically rigorous notion of verification-critical evidence that the machine abstracts from software to empower humans to reason with. The algorithmic innovation is to discover and leverage the patterns the developers have applied to manage complexity. A pattern-based verification is crucial because the problem is intractable otherwise. We call the overall approach Evidence-Enabled Verification (EEV). This thesis presents EEV with two challenging applications: (1) EEV for Lock/Unlock Pairing, to verify the correct pairing of mutex locks and spin locks with their corresponding unlocks on all feasible execution paths, and (2) EEV for Allocation/Deallocation Pairing, to verify the correct pairing of memory allocations with their corresponding deallocations on all feasible execution paths. We applied the EEV approach to verify recent versions of the Linux kernel. The results include a comparison with the state-of-the-art Linux Driver Verification (LDV) tool, the effectiveness of the proposed visual models as verification-critical evidence, representative examples of verification, the discovered bugs, and the limitations of the proposed approach.
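
    The pairing property itself is simple to state. The toy sketch below (not the EEV tool; it omits the path-feasibility reasoning, alias analysis, and loop handling that real kernel code requires) enumerates the paths of a small acyclic control-flow graph and flags double locks, unlocks without a matching lock, and exits reached with the lock still held:

```python
# Toy illustration of the pairing property (not the EEV tool): enumerate the
# paths of an acyclic control-flow graph and flag double locks, unlocks without
# a matching lock, and paths that exit while the lock is still held.

def check_lock_pairing(cfg, entry, exits):
    """cfg: {node: [successor nodes]}; each node is (label, op) with op in
    {'lock', 'unlock', None}. Returns a list of (path, problem) pairs."""
    problems = []

    def walk(node, held, path):
        label, op = node
        path = path + [label]
        if op == 'lock':
            if held:
                problems.append((path, 'double lock'))
                return
            held = True
        elif op == 'unlock':
            if not held:
                problems.append((path, 'unlock without lock'))
                return
            held = False
        if node in exits or not cfg.get(node):
            if held:
                problems.append((path, 'lock held at exit'))
            return
        for succ in cfg[node]:
            walk(succ, held, path)

    walk(entry, False, [])
    return problems

# A tiny example: the error path returns without releasing the mutex.
entry  = ('entry', None)
lock   = ('mutex_lock(&m)', 'lock')
err    = ('error path', None)
unlock = ('mutex_unlock(&m)', 'unlock')
ret    = ('return', None)
cfg = {entry: [lock], lock: [err, unlock], err: [ret], unlock: [ret], ret: []}
print(check_lock_pairing(cfg, entry, {ret}))
# [(['entry', 'mutex_lock(&m)', 'error path', 'return'], 'lock held at exit')]
```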

    Scalable Applications on Heterogeneous System Architectures: A Systematic Performance Analysis Framework

    Get PDF
    The efficient parallel execution of scientific applications is a key challenge in high-performance computing (HPC). With growing parallelism and heterogeneity of compute resources as well as increasingly complex software, performance analysis has become an indispensable tool in the development and optimization of parallel programs. This thesis presents a framework for the systematic performance analysis of scalable, heterogeneous applications. Based on event traces, it automatically detects the critical path and inefficiencies that result in waiting or idle time, e.g. due to load imbalances between parallel execution streams. As a prerequisite for the analysis of heterogeneous programs, this thesis specifies inefficiency patterns for computation offloading. Furthermore, an essential contribution was made to the development of tool interfaces for OpenACC and OpenMP, which enable portable data acquisition and subsequent analysis for programs with offload directives. These interfaces are already part of the latest OpenACC and OpenMP API specifications. The aforementioned work, existing preliminary work, and established analysis methods are combined into a generic analysis process, which can be applied across programming models. Based on the detection of wait and idle states, which can propagate over several levels of parallelism, the analysis identifies wasted computing resources and their root cause, as well as the critical-path share of each program region. Thus, it determines the influence of program regions on the load balancing between execution streams and on the program runtime. The analysis results include a summary of the detected inefficiency patterns and a program trace enhanced with information about wait states, their causes, and the critical path. In addition, a ranking based on the amount of waiting time a program region caused on the critical path highlights program regions that are relevant for program optimization. The scalability of the proposed performance analysis and its implementation is demonstrated using High-Performance Linpack (HPL), while the analysis results are validated with synthetic programs. A scientific application that uses MPI, OpenMP, and CUDA simultaneously is investigated in order to show the applicability of the analysis.
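
    As a simplified illustration of what critical-path detection means here (a toy example, not the framework's trace-based analysis), activities with durations and happens-before dependencies can be modeled as a DAG whose longest chain of dependent activities is the critical path:

```python
# Toy sketch, not the framework's implementation: model activities (compute
# regions, MPI operations, offloaded kernels) with durations and happens-before
# dependencies as a DAG; the critical path is its longest chain of dependent
# activities.

from collections import defaultdict

def critical_path(durations, deps):
    """durations: {activity: seconds}; deps: list of (predecessor, successor) edges."""
    succs, preds, indeg = defaultdict(list), defaultdict(list), defaultdict(int)
    for p, s in deps:
        succs[p].append(s)
        preds[s].append(p)
        indeg[s] += 1
    order = [a for a in durations if indeg[a] == 0]  # Kahn's topological sort;
    for a in order:                                  # the list grows while we iterate
        for s in succs[a]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)
    finish, best_pred = {}, {}
    for a in order:
        start = 0.0
        for p in preds[a]:                           # start when the latest dependency finishes
            if finish[p] > start:
                start, best_pred[a] = finish[p], p
        finish[a] = start + durations[a]
    end = max(finish, key=finish.get)                # trace back from the latest finish
    path = [end]
    while path[-1] in best_pred:
        path.append(best_pred[path[-1]])
    return [a for a in reversed(path)], finish[end]

durs = {'A': 3.0, 'B': 1.0, 'C': 4.0, 'D': 2.0}
deps = [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D')]
print(critical_path(durs, deps))  # (['A', 'C', 'D'], 9.0)
```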

    Implementacion de un algoritmo de marca de agua para la detección de modificaciones en videos sobre un sistema embebido BeagleBoard-xM

    Get PDF
    Graduation project (Licentiate in Electronic Engineering), Instituto Tecnológico de Costa Rica, Escuela de Ingeniería Electrónica, 2013. This document describes the implementation of a system for detecting alterations in videos encoded according to the H.264 standard by means of an embedded digital watermark. The system is implemented on a BeagleBoard-xM embedded platform running a GNU/Linux kernel. The video stream is handled with GStreamer, so the watermarking algorithm is implemented as an element of this API. The element performs the encoding/decoding of the video sequence on the ARM processor, while the insertion/detection of the watermark is executed on the platform's DSP.
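
    For orientation, a watermarking element of this kind would sit in a GStreamer pipeline roughly as sketched below. The element name dspwatermark and its key property are hypothetical placeholders (the abstract does not name the actual element), and the sketch uses today's GStreamer 1.0 Python bindings rather than the 0.10 API available on the BeagleBoard-xM in 2013:

```python
# Hedged sketch: how a watermarking element could be placed in a GStreamer
# pipeline. "dspwatermark" and its "key" property are hypothetical placeholders,
# not the element implemented in the thesis.

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

# Decode an H.264 file, run frames through the (hypothetical) watermarking
# element, then re-encode and mux the result.
pipeline = Gst.parse_launch(
    "filesrc location=input.mp4 ! qtdemux ! h264parse ! avdec_h264 ! "
    "dspwatermark key=1234 ! videoconvert ! x264enc ! mp4mux ! "
    "filesink location=output.mp4"
)
pipeline.set_state(Gst.State.PLAYING)

# Wait until the stream finishes or an error is reported.
bus = pipeline.get_bus()
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                       Gst.MessageType.EOS | Gst.MessageType.ERROR)
pipeline.set_state(Gst.State.NULL)
```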

    Jahresbericht 2017 zur kooperativen DV-Versorgung

    Get PDF
    Preface 13 Overview of advertisers 16
    Part I: TU Dresden bodies for information technology matters: CIO of TU Dresden 21 CIO advisory board 21 IT strategy retreat 23
    Part II: Center for Information Services and High Performance Computing: 1 The institution 27 1.1 Tasks 27 1.2 Facts and figures 27 1.3 Budget 28 1.4 Structure 30 1.5 Locations 31 1.6 Committee work 32 2 IT infrastructure 33 2.1 Communication services and infrastructures 33 2.2 Infrastructure servers 43 2.3 Server virtualization 44 2.4 Housing 44 2.5 Data storage and backup 44 3 High performance computing 51 3.1 HRSK-II – HPC cluster Taurus 51 3.2 Shared-memory system Venus 53 3.3 Application software 54 3.4 Parallel programming tools 54 4 Central service portfolio 57 4.1 IT service management 57 4.2 Ticket system and Service Desk 57 4.3 Identity management 59 4.4 Login service 61 4.5 Microsoft Windows support 61 4.6 Communication and collaboration services 65 4.7 Dresden Science Calendar 67 4.8 Printing/copying 68 4.9 Central software procurement for TU Dresden 69 5 Research 71 5.1 Scientific projects and cooperations 71 5.2 Publications 85 6 Vocational training and internships 89 6.1 IT specialist (Fachinformatiker) training 89 6.2 Internships 89 7 Events 91 7.1 Training and continuing education events 91 7.2 ZIH colloquia 92 7.3 Workshops 92 7.4 Booth presentations/talks/guided tours 92
    Part III: Divisions of TU Dresden: Division of Mathematics and Natural Sciences 97 1 Division-wide IT structure 97 2 Continuing education and information exchange 97 3 Service Desk 98 4 State of IT equipment – general remarks 98 5 Requirements for the ZIH 98 5.1 Services 98 5.2 Networking 99 5.3 Software 99 Faculty of Biology 101 1 IT requirements from teaching and research 101 1.1 Requirements from teaching 101 1.2 Requirements from research 102 2 Current state of IT equipment 102 3 Requirements for the ZIH 102 Faculty of Chemistry and Food Chemistry 103 1 IT requirements from teaching and research 103 1.1 Requirements from teaching 103 1.2 Requirements from research 103 2 State of IT equipment 104 2.1 Directory service and central management 104 2.2 Server systems 104 2.3 PC workstations and measurement computers 105 2.4 Data network 105 3 Services and offerings of the faculty 105 3.1 PC pools 105 3.2 Measurement data and databases 105 3.3 Special-purpose software 106 3.4 IT service teams 106 4 Faculty requirements for the ZIH and administration 106 4.1 Services and software 106 4.2 Staffing 106 Faculty of Mathematics 107 1 IT requirements from teaching and research 107 1.1 Requirements from teaching 107 1.2 Requirements from research 107 2 Current state of IT equipment at the faculty 108 2.1 Hardware and networking 108 2.2 Services and offerings of the faculty's central PC pool 108 3 Faculty requirements for the ZIH 108 3.1 Services 108 3.2 Data communication 109 3.3 Software 109 3.4 Hardware and software service 109 Faculty of Physics 111 1 IT requirements from teaching and research 111 1.1 Requirements from teaching 111 1.2 Requirements from research 112 2 Current state of IT equipment 113 2.1 Hardware 113 2.2 Software 113 2.3 Networking 113 2.4 PC pools 113 2.5 Miscellaneous 113 3 Faculty requirements for the ZIH 114 Faculty of Psychology 115 1 IT requirements from teaching and research 115 1.1 Requirements from teaching 115 1.2 Requirements from research 115 2 Current state of IT equipment at the faculty 115 3 Faculty requirements for the ZIH 116 Division of Humanities and Social Sciences 117 1 Structure and IT responsibilities 117 2 Conclusions and development perspectives 118 Faculty of Education 121 1 IT requirements from teaching and research 121 1.1 Requirements from teaching 121 1.2 Requirements from research 123 2 Current state of IT equipment at the faculty 124 3 Services and offerings of the faculty's ZBT 124 4 Requirements for the ZIH 125 Faculty of Law 127 1 IT requirements from teaching and research 127 1.1 Requirements from teaching 127 1.2 Requirements from research 127 2 State of IT equipment at the faculty 128 3 Requirements for the ZIH and external resources 128 Philosophical Faculty 129 1 IT requirements from teaching and research 129 1.1 Requirements from teaching 129 1.2 Requirements from research 129 2 Current state of IT equipment at the faculty 130 3 Requirements for the ZIH 130 Faculty of Linguistics, Literature and Cultural Studies 133 1 IT requirements from teaching and research 133 1.1 Requirements from teaching 133 1.2 Requirements from research 133 2 Current state of IT equipment at the faculty 134 3 Requirements for the ZIH 134 4 E-learning strategy 134 Division of Civil and Environmental Engineering 137 1 Structure and IT responsibilities 137 2 Competencies, offered services and possible synergies 139 3 Conclusions and outlook 141 Faculty of Architecture 143 1 IT requirements from teaching and research 143 1.1 Requirements from teaching 143 1.2 Requirements from research 144 2 Current state of IT equipment at the faculty 144 3 Services and offerings of the Faculty of Architecture 145 4 Requirements for the ZIH and external resources 145 4.1 Services 145 4.2 Data communication 145 4.3 Software 146 4.4 Hardware and software service 146 Faculty of Civil Engineering 147 1 IT requirements from teaching and research 147 1.1 Requirements from teaching 147 1.2 Requirements from research 148 1.3 Current state of IT equipment at the faculty 150 2 Services and offerings of the central faculty computing center 157 3 Requirements for the ZIH and external resources 157 3.2 Data communication 158 3.3 Software 158 3.4 Hardware and software service 158 Faculty of Environmental Sciences 159 Department of Forest Sciences 159 1 IT requirements from teaching and research 159 1.1 Requirements from teaching 159 1.2 Requirements from research (selected examples) 159 2 Current state of IT equipment at the department 160 3 Services and offerings of the department's computing station 161 4 Requirements for the ZIH and external resources 161 4.1 Services 161 4.2 Data communication 161 4.3 Software 161 4.4 Hardware and software service 161 Department of Geosciences 163 1 IT requirements from teaching and research 163 1.1 Requirements from teaching 163 1.2 Requirements from research 163 2 Requirements for the ZIH 165 2.1 Services 165 2.2 Data communication 165 2.3 Software 165 2.4 Hardware and software service 167 3 Requirements for the Tharandt computing station 167 Faculty of Transport and Traffic Sciences „Friedrich List“ 169 1 IT requirements from teaching and research 169 1.1 Requirements from teaching 169 1.2 Requirements from research 171 2 Requirements for the ZIH 175 Faculty of Business and Economics 177 1 IT requirements from teaching and research 177 1.1 Requirements from teaching 177 1.2 Requirements from research 179 2 Current state of IT equipment at the faculty 180 3 Services of the faculty's computer lab 182 4 Requirements for the ZIH and external resources 184 4.1 Services 184 4.2 Data communication 184 4.3 Software 185 4.4 Hardware and software service 185 Division of Medicine 187 Faculty of Medicine Carl Gustav Carus 187 1 IT requirements from teaching and research 187 1.1 Requirements from teaching 187 1.2 Requirements from research 188 2 Current state of IT services 188 3 Faculty requirements for the ZIH / MZ / SLUB 19

    Evaluating Extensible 3D (X3D) Graphics For Use in Software Visualisation

    No full text
    3D web software visualisation has always been expensive, special-purpose, and hard to program. Most of the technologies used require large amounts of scripting, are not reliable on all platforms, are binary formats, or are no longer maintained. We can make end-user web software visualisation of object-oriented programs cheap, portable, and easy by using Extensible 3D (X3D) Graphics, a new open standard. In this thesis we outline our experience with X3D and discuss the suitability of X3D as an output format for software visualisation.

    Intelligent instrumentation techniques to improve the traces information-volume ratio

    Get PDF
    With ever more powerful machines being constantly deployed, it is crucial to manage the computational resources efficiently. This is important both from the point of view of the individual user, who expects fast results, and from that of the supercomputing center hosting the whole infrastructure, which is interested in maximizing its overall productivity. Nevertheless, the real sustained performance achieved by applications can be significantly lower than the theoretical peak performance of the machines. A key factor in bridging this performance gap is to understand how parallel computers behave. Performance analysis tools are essential not only to understand the behavior of parallel applications, but also to identify why performance expectations might not have been met, serving as guidelines to improve the inefficiencies that caused poor performance and driving both software and hardware optimizations. However, a detailed analysis of the behavior of a parallel application requires processing a large amount of data that also grows extremely fast. Current large-scale systems already comprise hundreds of thousands of cores, and upcoming exascale systems are expected to assemble more than a million processing elements. With such a number of hardware components, the traditional analysis methodologies, which consist of blindly collecting as much data as possible and then performing exhaustive lookups, are no longer applicable, because the volume of performance data generated becomes absolutely unmanageable to store, process and analyze. The evolution of the tools suggests that more complex approaches are needed, incorporating intelligence to perform the challenging and important task of detailed analysis competently. In this thesis, we address the problem of scalability of performance analysis tools in large-scale systems. In such scenarios, an in-depth understanding of the interactions between all the system components is more compelling than ever for an effective use of the parallel resources. To this end, our work includes a thorough review of techniques that have been successfully applied to aid in the task of Big Data analytics in fields like machine learning, data mining, signal processing and computer vision. We have leveraged these techniques to improve the analysis of large-scale parallel applications by automatically uncovering repetitive patterns, finding data correlations, detecting performance trends, and extracting further useful analysis information. By combining their use, we have minimized the volume of performance data captured from an execution while maximizing the benefit and insight gained from this data, and we have proposed new and more effective methodologies for single- and multi-experiment performance analysis.
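
    One family of techniques alluded to above can be sketched as follows (a hedged illustration, not the thesis' actual tool chain): computation bursts from a trace are clustered by a few metrics, here duration and instruction count chosen purely for illustration, so that only a representative per cluster needs to be stored or analysed in detail. The sketch uses scikit-learn's DBSCAN:

```python
# Hedged illustration, not the thesis' tool chain: cluster computation bursts by
# two example metrics so that one representative per cluster stands in for the
# rest. The metric choice (duration, instructions) is an assumption of this sketch.

import numpy as np
from sklearn.cluster import DBSCAN

def cluster_bursts(bursts, eps=0.3, min_samples=10):
    """bursts: array-like of shape (n, 2) holding [duration, instructions] per burst."""
    X = np.asarray(bursts, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    X = (X - X.mean(axis=0)) / std                    # normalize the metrics
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    representatives = {}
    for label in set(labels):
        if label == -1:
            continue                                  # noise points keep their full detail
        members = np.where(labels == label)[0]
        representatives[label] = int(members[0])      # keep one exemplar per cluster
    return labels, representatives
```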