6 research outputs found

    ZIH-Info

    - Microsoft framework agreement - Sale of manuals at the Service Desk - VoIP installation on campus - New Hitachi storage system - Outages in the central disk system - 3D visualization in cooperation with cartographers - ZIH colloquia - New ZIH publications - Events

    Runtime MPI Correctness Checking with a Scalable Tools Infrastructure

    Increasing computational demand of simulations motivates the use of parallel computing systems. At the same time, this parallelism poses challenges to application developers. The Message Passing Interface (MPI) is a de facto standard for distributed memory programming in high performance computing. However, its use also enables complex parallel programming errors such as races, communication errors, and deadlocks. Automatic tools can assist application developers in the detection and removal of such errors. This thesis considers tools that detect such errors during an application run and advances them towards a combination of both precise checks (neither false positives nor false negatives) and scalability. This includes novel hierarchical checks that provide scalability, as well as a formal basis for a distributed deadlock detection approach. At the same time, the development of parallel runtime tools is challenging and time consuming, especially if scalability and portability are key design goals. Current tool development projects often create similar tool components, while component reuse remains low. To provide a perspective towards more efficient tool development that simplifies scalable implementations, component reuse, and tool integration, this thesis proposes an abstraction for a parallel tools infrastructure along with a prototype implementation. This abstraction overcomes the use of multiple interfaces for different types of tool functionality, which limits flexible component reuse. Thus, this thesis advances runtime error detection tools and uses their redesign and their increased scalability requirements to apply and evaluate a novel tool infrastructure abstraction. The new abstraction ultimately allows developers to focus on their tool functionality rather than on developing or integrating common tool components. The use of such an abstraction in a wide range of parallel runtime tool development projects could greatly increase component reuse, thus decreasing tool development time and cost. An application study with up to 16,384 application processes demonstrates the applicability of both the proposed runtime correctness concepts and the proposed tools infrastructure.
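    The error classes named above are easy to picture with a small sketch. The following hypothetical two-rank program (buffer size and tag chosen arbitrarily for illustration) contains the kind of deadlock such runtime checkers flag: with a rendezvous send protocol, both blocking sends wait for the matching receive, forming a cyclic wait-for dependency across the ranks.

    /* Minimal sketch of a send-send deadlock that runtime MPI correctness
       tools detect; sizes and tags are arbitrary choices for illustration. */
    #include <mpi.h>
    #include <stdlib.h>

    #define N (1 << 20)  /* large enough that MPI_Send blocks (rendezvous) */

    int main(int argc, char **argv)
    {
        int rank, peer;
        double *sbuf = malloc(N * sizeof(double));
        double *rbuf = malloc(N * sizeof(double));

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        peer = 1 - rank;  /* assumes exactly two ranks */

        /* Both ranks send first: each waits for the other's receive to be
           posted, so the wait-for graph contains a cycle (rank 0 <-> rank 1). */
        MPI_Send(sbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
        MPI_Recv(rbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        MPI_Finalize();
        free(sbuf);
        free(rbuf);
        return 0;
    }

    Reordering the calls on one rank, or replacing the pair with MPI_Sendrecv or a nonblocking MPI_Isend, breaks the cycle; a distributed deadlock detector reports the cyclic dependency instead of letting the run hang.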

    Enhanced Encoding Techniques for the Open Trace Format 2

    Highly efficient encoding of event trace data is a quality feature of any event trace format. It not only enables measurements of long-running applications but also reduces the bias caused by intermediate memory buffer flushes. In this paper we present encoding techniques that markedly increase memory efficiency without introducing compression overhead. We applied these techniques to the Open Trace Format 2, a state-of-the-art open-source event trace data format and library used by the performance analysis tools Vampir, Scalasca, and TAU. In addition, we show that these encoding techniques are a basic step towards a complete in-memory event trace workflow.
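    As a rough illustration of the kind of encoding involved (a sketch under simple assumptions, not the actual OTF2 implementation): event timestamps grow monotonically, so storing each timestamp as a variable-length delta to its predecessor takes one or two bytes in the common case instead of a fixed eight, at negligible encoding cost.

    /* Sketch of delta plus variable-length (LEB128-style) timestamp encoding,
       the general kind of technique trace formats use to shrink event records.
       This is an illustration, not the actual OTF2 encoding. */
    #include <stdint.h>
    #include <stddef.h>

    /* Encode an unsigned value 7 bits at a time; the high bit of each byte
       marks "more bytes follow". Returns the number of bytes written. */
    static size_t varint_encode(uint64_t v, uint8_t *out)
    {
        size_t n = 0;
        do {
            uint8_t byte = v & 0x7F;
            v >>= 7;
            out[n++] = byte | (v ? 0x80 : 0x00);
        } while (v);
        return n;
    }

    /* Encode monotonically increasing timestamps as varint deltas. Small
       deltas, the common case in traces, take one or two bytes each instead
       of eight. Returns the total number of bytes written into 'out'. */
    size_t encode_timestamps(const uint64_t *ts, size_t count, uint8_t *out)
    {
        size_t used = 0;
        uint64_t prev = 0;
        for (size_t i = 0; i < count; i++) {
            used += varint_encode(ts[i] - prev, out + used);
            prev = ts[i];
        }
        return used;
    }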

    Jahresbericht 2012 zur kooperativen DV-Versorgung

    FOREWORD. Overview of advertisers.
    PART I: The work of the DV Commission and its members; the work of the IT Steering Committee; the work of the ZIH Scientific Advisory Board.
    PART II:
    1 The Center for Information Services and High Performance Computing (ZIH): tasks; facts and figures (representative selection); budget; structure and staff; location; committee work.
    2 Communication infrastructure: usage overview of network services (WiN IP traffic); network infrastructure (general supply structure; network levels; backbone and local networking; print/copier network; wireless local area network (WLAN); data network between the university sites and external connectivity; the contract "Kommunikationsverbindungen der Sächsischen Hochschulen"; data network to the dormitory sites); communication and information services (e-mail, including uniform e-mail addresses at TU Dresden, structure- and function-related e-mail addresses, ZIH-managed user mailboxes, web mail, and the mailing-list server; groupware; authentication and authorization infrastructure (AAI) for the Bildungsportal Sachsen and the DFN PKI; dial-up access; ISDN and VoIP voice services; communication routes and the clock network; time service).
    3 Central services and servers: user support; trouble-ticket system (OTRS); user management; login service; provision of virtual servers; storage management (backup service; file service and storage systems); license service; peripherals service; PC pools; security (information security; early-warning system (FWS) in the TU Dresden data network; VPN; concept for centrally provided virtual firewalls; network concept for workstations with dynamic port assignment per IEEE 802.1X (DyPort)); Dresden Science Calendar.
    4 Services for decentralized DV systems: general; PC support (investment consulting; implementation; maintenance); Microsoft Windows support (central Windows domain; Sophos antivirus); central software procurement for TU Dresden (procurement strategy; working-group activity; software procurement; user consulting; software presentations).
    5 High performance computing: high-performance computer/storage complex (HRSK), with core router, SGI Altix 4700, petabyte tape archive, Linux Networx PC farm, data-analysis component Atlas, and global home file systems; usage overview of the HPC servers; special resources (Microsoft HPC system; user cluster Triton; GPU cluster); grid resources; application software; visualization; parallel programming tools.
    6 Scientific projects and cooperations: Competence Center for Videoconferencing Services (VCCIV), covering an overview, the videoconferencing rooms, tasks and development work, further activities, the "DFNVideoConference" multipoint conferencing service in the X-WiN, and trends and outlook; D-Grid, comprising D-Grid Scheduler Interoperability (DGSI), EMI (European Middleware Initiative), MoSGrid (Molecular Simulation Grid), WisNetGrid (knowledge networks in the grid), GeneCloud (cloud computing in drug development for small and medium-sized enterprises), and FutureGrid (an experimental high-performance grid testbed); biology, comprising the development and analysis of stochastic interacting many-particle models of biological cell interaction, SpaceSys (spatio-temporal dynamics in systems biology), ZebraSim (modeling and simulation of muscle-tissue formation in zebrafish), SFB Transregio 79 (materials development for hard-tissue regeneration in healthy and systemically diseased bone), Virtual Liver (spatio-temporal mathematical models of hepatocyte polarity and its role in liver-tissue development), GrowReg (growth regulation and pattern formation in regeneration), and GlioMath Dresden; performance evaluation, comprising SFB 609 (electromagnetic flow control in metallurgy, crystal growth, and electrochemistry; subproject A1: numerical modeling of turbulent MHD flows), SFB 912 (Highly Adaptive Energy-Efficient Computing, HAEC; subproject A04: application analysis on low-energy HPC systems), BenchIT (performance measurement for scientific applications), Cool Computing and Cool Computing 2 (technologies for energy-efficient computing platforms, BMBF Spitzencluster Cool Silicon), ECCOUS (an efficient and open compiler environment for semantically annotated parallel simulations), eeClust (energy-efficient cluster computing), GASPI (Global Address Space Programming), LMAC (performance dynamics of massively parallel codes), H4H (optimizing HPC applications on heterogeneous architectures), HOPSA (HOlistic Performance System Analysis), and CRESTA (Collaborative Research into Exascale Systemware, Tools and Applications); data-intensive computing, comprising long-term archiving of the SLUB's digital documents, LSDMA (Large Scale Data Management and Analysis), Radieschen (framework conditions for a cross-disciplinary research-data infrastructure), SIOX (Scalable I/O for Extreme Performance), HPC-FLiS (an HPC framework for solving inverse scattering problems on structured grids with many-core systems, applied to 3D imaging methods), and NGSgoesHPC (scalable HPC solutions for efficient genome analysis); cooperations, namely the 100 Gigabit Testbed Dresden/Freiberg (overview; motivation and measures; technical implementation; planned work packages) and the Center of Excellence of TU Dresden and TU Bergakademie Freiberg.
    7 Vocational training and internships: training as an IT specialist (application development); internships.
    8 Training and continuing-education courses.
    9 Events.
    10 Publications.
    PART III, reports: Biotechnology Center (BIOTEC); Center for Regenerative Therapies Dresden (CRTD); Center for Innovation Competence (CUBE); Botanical Garden; Language and Culture Center (LSK); Media Center (MZ); University Archive (UA); University Sports Center (USZ); Medical Computing Center of the University Hospital Carl Gustav Carus (MRZ); Central University Administration (ZUV); Saxon State and University Library Dresden (SLUB).

    Application of clustering analysis and sequence analysis on the performance analysis of parallel applications

    High Performance Computing and supercomputing is the high-end area of computing science that studies and develops the most powerful computers available. Current supercomputers are extremely complex, and so are the applications that run on them. To take advantage of the huge amount of computing power available, it is strictly necessary to maximize the knowledge we have about how these applications behave and perform. This is the mission of (parallel) performance analysis. In general, performance analysis toolkits offer only very simplistic manipulations of the performance data. First-order statistics such as the average or standard deviation are used to summarize the values of a given performance metric, in some cases hiding interesting facts present in the raw performance data. For this reason, we require Performance Analytics, i.e. the application of Data Analytics techniques in the performance analysis area. This thesis contributes two new techniques to the Performance Analytics field. The first contribution is the application of cluster analysis to detect the computation structure of a parallel application. Cluster analysis is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). In this thesis we use cluster analysis to group the CPU bursts of a parallel application, the regions of each process in between communication calls or calls to the parallel runtime. The resulting clusters are the different computational trends or phases that appear in the application. These clusters are useful for understanding the behaviour of the computation part of the application and for focusing the analyses on those parts that present performance issues. We demonstrate that our approach requires clustering algorithms different from those previously used in the area. The second contribution of the thesis is the application of multiple sequence alignment algorithms to evaluate the computation structure detected. Multiple sequence alignment (MSA) is a technique commonly used in bioinformatics to determine the similarities across two or more biological sequences: DNA or proteins. The Cluster Sequence Score we introduce applies an MSA algorithm to evaluate the SPMDiness of an application, i.e. how well its computation structure reflects the structure of the Single Program Multiple Data (SPMD) paradigm. We also use this score in Aggregative Cluster Refinement, a new clustering algorithm we designed that detects the SPMD phases of an application at fine grain, surpassing the clustering algorithms we used initially. We demonstrate the usefulness of these techniques with three practical uses. The first is an extrapolation methodology able to maximize the performance metrics that characterize the detected application phases using a single application execution. The second is the use of the detected computation structure to speed up a multi-level simulation infrastructure. Finally, we analyse four production-class applications, using the computation characterization to study the impact of possible application improvements and of porting the applications to different hardware configurations. In summary, this thesis proposes the use of cluster analysis and sequence analysis to automatically detect and characterize the different computation trends of a parallel application. These techniques provide the developer or analyst with useful insight into the application's performance and ease the understanding of the application's behaviour.
    The contributions of the thesis are not limited to the proposals and publications of the techniques themselves, but also include practical uses that demonstrate their usefulness in the analysis task. In addition, the research carried out during these years has produced a production tool for analysing applications' structure, part of the BSC Tools suite.
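    A toy version of the clustering step in the first contribution is sketched below, assuming synthetic burst features and a deliberately simplified density-based grouping pass (the thesis's actual feature sets and algorithms differ): each CPU burst is summarized by a few metrics, and bursts that lie close together in that feature space form one computational trend, while isolated bursts are left as noise.

    /* Toy density-based clustering of CPU bursts: a sketch in the spirit of
       the structure detection described above, not the thesis's algorithm.
       Each burst carries two assumed features: duration and instructions. */
    #include <stdio.h>
    #include <math.h>

    #define NB 8        /* number of bursts in this synthetic example */
    #define EPS 0.6     /* neighborhood radius (assumed) */
    #define MINPTS 2    /* minimum neighbors to seed a cluster (assumed) */

    typedef struct { double duration, instructions; int cluster; } Burst;

    static double dist(const Burst *a, const Burst *b)
    {
        double dd = a->duration - b->duration;
        double di = a->instructions - b->instructions;
        return sqrt(dd * dd + di * di);
    }

    /* Simplified DBSCAN-style pass: grow a cluster from each unassigned
       dense point; points that never join a cluster remain noise (-1). */
    static void cluster_bursts(Burst *b, int n)
    {
        int next_id = 0;
        for (int i = 0; i < n; i++) {
            if (b[i].cluster != -1) continue;
            int neigh = 0;
            for (int j = 0; j < n; j++)
                if (j != i && dist(&b[i], &b[j]) <= EPS) neigh++;
            if (neigh < MINPTS) continue;  /* not dense enough to seed */
            b[i].cluster = next_id;
            /* expansion: absorb every unassigned point near a member */
            for (int changed = 1; changed; ) {
                changed = 0;
                for (int j = 0; j < n; j++) {
                    if (b[j].cluster != -1) continue;
                    for (int k = 0; k < n; k++)
                        if (b[k].cluster == next_id && dist(&b[j], &b[k]) <= EPS) {
                            b[j].cluster = next_id; changed = 1; break;
                        }
                }
            }
            next_id++;
        }
    }

    int main(void)
    {
        /* synthetic bursts: two computational trends plus one outlier */
        Burst b[NB] = {
            {1.0, 1.1, -1}, {1.1, 1.0, -1}, {0.9, 1.2, -1},   /* trend A */
            {4.0, 3.9, -1}, {4.2, 4.1, -1}, {3.9, 4.0, -1},   /* trend B */
            {9.0, 0.5, -1},                                    /* outlier */
            {1.05, 1.05, -1}
        };
        cluster_bursts(b, NB);
        for (int i = 0; i < NB; i++)
            printf("burst %d -> cluster %d\n", i, b[i].cluster);
        return 0;
    }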

    Intelligent instrumentation techniques to improve the traces information-volume ratio

    With ever more powerful machines constantly being deployed, it is crucial to manage computational resources efficiently. This matters both to the individual user, who expects fast results, and to the supercomputing center hosting the infrastructure, which is interested in maximizing its overall productivity. Nevertheless, the real sustained performance achieved by applications can be significantly lower than the theoretical peak performance of the machines. A key factor in bridging this performance gap is understanding how parallel computers behave. Performance analysis tools are essential not only to understand the behavior of parallel applications, but also to identify why performance expectations might not have been met, serving as guides for fixing the inefficiencies that caused poor performance and driving both software and hardware optimizations. However, detailed analysis of the behavior of a parallel application requires processing a large amount of data that also grows extremely fast. Current large-scale systems already comprise hundreds of thousands of cores, and upcoming exascale systems are expected to assemble more than a million processing elements. With such a number of hardware components, the traditional analysis methodology of blindly collecting as much data as possible and then performing exhaustive lookups is no longer applicable, because the volume of performance data generated becomes unmanageable to store, process, and analyze. The evolution of the tools suggests that more sophisticated approaches are needed, incorporating intelligence to perform the challenging and important task of detailed analysis competently. In this thesis, we address the problem of the scalability of performance analysis tools in large-scale systems. In such scenarios, an in-depth understanding of the interactions between all the system components is more compelling than ever for an effective use of the parallel resources. To this end, our work includes a thorough review of techniques that have been successfully applied to Big Data analytics in fields such as machine learning, data mining, signal processing, and computer vision. We have leveraged these techniques to improve the analysis of large-scale parallel applications by automatically uncovering repetitive patterns, finding data correlations, detecting performance trends, and extracting further useful analysis information. Combining their use, we have minimized the volume of performance data captured from an execution while maximizing the benefit and insight gained from this data, and we have proposed new and more effective methodologies for single- and multi-experiment performance analysis.
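    One of the signal-processing ideas mentioned above can be sketched in a few lines, under the assumption of a synthetic per-time-bin activity signal (the thesis's actual methods are more elaborate): the lag that maximizes the autocorrelation of a repetitive performance signal gives the iteration period, so detailed data need only be kept for one representative period instead of the whole run.

    /* Sketch: detect the dominant period of a repetitive performance signal
       (e.g., MPI calls per time bin) via autocorrelation. A tracer could then
       keep detailed data for a single period only. The signal below is a
       synthetic assumption for illustration. */
    #include <stdio.h>

    #define N 64

    /* Return the lag in [1, N/2] with the highest autocorrelation. */
    static int dominant_period(const double *x, int n)
    {
        double mean = 0.0;
        for (int i = 0; i < n; i++) mean += x[i];
        mean /= n;

        int best_lag = 1;
        double best = -1e300;
        for (int lag = 1; lag <= n / 2; lag++) {
            double acc = 0.0;
            for (int i = 0; i + lag < n; i++)
                acc += (x[i] - mean) * (x[i + lag] - mean);
            acc /= (n - lag);  /* normalize by the overlap length */
            if (acc > best) { best = acc; best_lag = lag; }
        }
        return best_lag;
    }

    int main(void)
    {
        double signal[N];
        /* synthetic iterative behavior: a spike every 8 bins on a baseline */
        for (int i = 0; i < N; i++)
            signal[i] = (i % 8 == 0) ? 10.0 : 1.0;
        printf("estimated iteration period: %d bins\n",
               dominant_period(signal, N));
        return 0;
    }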