8 research outputs found

    Exascale Message Passing Interface based Program Deadlock Detection

    Deadlock detection is one of the main issues of software testing in High Performance Computing (HPC), and it will soon be equally pressing in exascale computing. Developing and testing programs for machines with millions of cores is not an easy task. An HPC program consists of thousands (or millions) of parallel processes that need to communicate with each other at runtime. The Message Passing Interface (MPI) is a standard library that provides this communication capability, and it is widely used in HPC; exascale programs are expected to be developed with it as well. For parallel programs, deadlock is one of the expected problems. In this paper, we discuss deadlock detection for exascale MPI-based programs, where scalability and efficiency are critical issues. The proposed method detects and flags, in a scalable and efficient manner, the processes and communication operations that could potentially cause deadlocks. MPI benchmark programs were used to test the proposed method.
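
    The failure mode at issue is easy to reproduce. Below is a minimal C sketch (illustrative only, not the paper's detection method): two ranks each call a blocking MPI_Send before posting a receive, forming the send-send cycle that a detector of this kind would flag even on runs where the MPI library's internal buffering happens to mask the hang.

        /* Classic send-send deadlock: both ranks block in MPI_Send before
         * either reaches MPI_Recv. Whether it actually hangs depends on
         * whether the library buffers the message (eager vs. rendezvous
         * protocol), which is why such bugs often escape testing.
         * Build/run (2 ranks): mpicc deadlock.c -o deadlock && mpirun -np 2 ./deadlock */
        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv) {
            int rank, peer, out = 42, in;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            peer = 1 - rank;                  /* assumes exactly 2 ranks */
            MPI_Send(&out, 1, MPI_INT, peer, 0, MPI_COMM_WORLD); /* both ranks can block here */
            MPI_Recv(&in, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank %d received %d\n", rank, in);
            MPI_Finalize();
            return 0;
        }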

    ZIH-Info

    - HRSK maintenance work - Encryption of e-mails - Expansion of the video conferencing service - Knowledge networks in the grid - Call for tenders for an SLM and ERP system - Autumn meeting of the ZKI working group on directory services - Visualization of geodata - AUTOMATA 2009 - New ZIH publications - Events

    Petascale Computing Enabling Technologies Project Final Report


    UPC-CHECK: A scalable tool for detecting run-time errors in Unified Parallel C

    Unified Parallel C (UPC) is a language used to write parallel programs for shared and distributed memory parallel computers. UPC-CHECK is a scalable tool developed to automatically detect argument errors in UPC functions and deadlocks in UPC programs at run-time, and to issue high-quality error messages that help programmers quickly fix those errors. The tool is easy to use and involves merely replacing the compiler command with upc-check. The tool uses a novel distributed algorithm for detecting argument and deadlock errors in collective operations; the run-time complexity of this algorithm has been proven to be O(1). The algorithm has been extended to detect deadlocks involving locks, with a run-time complexity of O(T), where T is the number of threads waiting to acquire a lock. Error messages issued by UPC-CHECK were evaluated using the UPC RTED test suite for argument errors in UPC functions and deadlocks; the results show that the error messages issued for these tests are excellent. The scalability of all the algorithms used was demonstrated using performance-evaluation test programs and the UPC NAS Parallel Benchmarks.
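
    Since UPC extends C, the error classes UPC-CHECK targets fit in a few lines. A minimal sketch (illustrative, not taken from the paper) of the collective-operation deadlock that the O(1) distributed check is designed to report:

        /* upc_barrier is a collective statement, but thread 0 skips it,
         * so threads 1..THREADS-1 wait forever: the kind of mismatch a
         * run-time checker such as UPC-CHECK flags with an error message.
         * Build (e.g., Berkeley UPC): upcc -T 4 barrier_bug.upc */
        #include <upc.h>
        #include <stdio.h>

        int main(void) {
            if (MYTHREAD != 0)   /* collective not reached by all threads */
                upc_barrier;
            printf("thread %d finished\n", MYTHREAD);
            return 0;
        }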

    Doctor of Philosophy

    Almost all high performance computing applications are written in MPI, which will continue to be the case for at least the next several years. Given the huge and growing importance of MPI, and the size and sophistication of MPI codes, scalable and incisive MPI debugging tools are essential. Existing MPI debugging tools have, despite their strengths, many glaring deficiencies, especially when it comes to debugging in the presence of nondeterminism-related bugs, which are bugs that do not always show up during testing. These bugs usually manifest when the systems are ported to different platforms for production runs. This dissertation focuses on the problem of developing scalable dynamic verification tools for MPI programs that can provide a coverage guarantee over the space of MPI nondeterminism. That is, the tools should be able to detect different outcomes of nondeterministic events in an MPI program and enforce all those different outcomes through repeated executions of the program with the same test harness. We propose to achieve the coverage guarantee by introducing efficient distributed causality tracking protocols that are based on the matches-before order. The matches-before order is introduced to address the shortcomings of the Lamport happens-before order [40], which is not sufficient to capture causality for MPI program executions due to the complexity of the MPI semantics. The two protocols we propose are the Lazy Lamport Clocks Protocol (LLCP) and the Lazy Vector Clocks Protocol (LVCP). LLCP provides good scalability with a small possibility of missing potential outcomes of nondeterministic events, while LVCP provides a full coverage guarantee at a scalability tradeoff. In practice, we show through our experiments that LLCP provides the same coverage as LVCP. This thesis makes the following contributions:
    • The MPI matches-before order, which captures the causality between MPI events in an MPI execution.
    • Two distributed causality tracking protocols for MPI programs that rely on the matches-before order.
    • A Distributed Analyzer for MPI programs (DAMPI), which implements the two aforementioned protocols to provide scalable and modular dynamic verification for MPI programs.
    • Scalability enhancement through algorithmic improvements for ISP, a dynamic verifier for MPI programs.
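
    The nondeterminism such tools must cover comes chiefly from wildcard receives. In the minimal C sketch below (illustrative, not from the dissertation), rank 0 posts MPI_ANY_SOURCE receives, so which sender matches first varies from run to run; a tool with a coverage guarantee must replay the program enforcing every feasible match.

        /* Wildcard-receive nondeterminism: with 3 ranks, the message from
         * rank 1 or rank 2 may match rank 0's first receive depending on
         * timing. A coverage-guaranteeing verifier explores both matches. */
        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv) {
            int rank, msg;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            if (rank == 0) {                  /* assumes exactly 3 ranks */
                MPI_Status st;
                MPI_Recv(&msg, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
                printf("first match came from rank %d\n", st.MPI_SOURCE);
                MPI_Recv(&msg, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &st);
            } else if (rank <= 2) {
                MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
            }
            MPI_Finalize();
            return 0;
        }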

    Runtime MPI Correctness Checking with a Scalable Tools Infrastructure

    Increasing computational demand of simulations motivates the use of parallel computing systems. At the same time, this parallelism poses challenges to application developers. The Message Passing Interface (MPI) is a de-facto standard for distributed memory programming in high performance computing. However, its use also enables complex parallel programming errors such as races, communication errors, and deadlocks. Automatic tools can assist application developers in the detection and removal of such errors. This thesis considers tools that detect such errors during an application run and advances them towards a combination of precise checks (neither false positives nor false negatives) and scalability. This includes novel hierarchical checks that provide scalability, as well as a formal basis for a distributed deadlock detection approach. At the same time, the development of parallel runtime tools is challenging and time consuming, especially if scalability and portability are key design goals. Current tool development projects often create similar tool components, while component reuse remains low. To provide a perspective towards more efficient tool development that simplifies scalable implementations, component reuse, and tool integration, this thesis proposes an abstraction for a parallel tools infrastructure, along with a prototype implementation. This abstraction overcomes the use of multiple interfaces for different types of tool functionality, which limits flexible component reuse. Thus, this thesis advances runtime error detection tools and uses their redesign and their increased scalability requirements to apply and evaluate a novel tool infrastructure abstraction. The new abstraction ultimately allows developers to focus on their tool functionality rather than on developing or integrating common tool components. The use of such an abstraction in a wide range of parallel runtime tool development projects could greatly increase component reuse, thus decreasing tool development time and cost. An application study with up to 16,384 application processes demonstrates the applicability of both the proposed runtime correctness concepts and the proposed tools infrastructure.
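
    Deadlocks aside, a typical communication error that precise runtime checks must catch without false positives is a type-signature mismatch. In the minimal C sketch below (illustrative, not from the thesis), the byte counts happen to agree on common platforms, so the program appears to work, yet it violates the MPI standard and a precise checker reports it.

        /* Type mismatch: rank 0 sends 4 MPI_INT, rank 1 receives 4
         * MPI_FLOAT. Same byte count on most platforms, so no crash, but
         * the type signatures differ: a silent error of the kind a
         * runtime correctness tool flags. Run with 2 ranks. */
        #include <mpi.h>

        int main(int argc, char **argv) {
            int   rank;
            int   idata[4] = {1, 2, 3, 4};
            float fdata[4];
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            if (rank == 0)
                MPI_Send(idata, 4, MPI_INT, 1, 0, MPI_COMM_WORLD);
            else if (rank == 1)
                MPI_Recv(fdata, 4, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Finalize();
            return 0;
        }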

    Jahresbericht 2009 zur kooperativen DV-Versorgung (Annual Report 2009 on Cooperative IT Services)

    Contents: Foreword; overview of advertisers.
    Part I: The work of the DV Commission and its members; the work of the IT Steering Committee; the work of the Scientific Advisory Board of the ZIH.
    Part II:
    1. The Center for Information Services and High Performance Computing (ZIH): tasks; facts and figures (representative selection); budget; structure and staff; location; committee work.
    2. Communication infrastructure: usage overview of network services (WiN IP traffic); network infrastructure (general supply structure, network layers, backbone and local networking, printer/copier network, WLAN, data network between the university sites and external connectivity, the contract "Communication Links of the Saxon Universities", data networks to the dormitory sites and the Faculty of Computer Science); communication and information services (electronic mail, including uniform and function-related e-mail addresses at TU Dresden, ZIH-managed user mailboxes, web mail, and a new mailing-list server; authentication and authorization infrastructure (AAI) with Shibboleth and DFN PKI; dial-up access; time service; Voice over Internet Protocol (VoIP)).
    3. Central services and servers: user consulting (BB); trouble-ticket system (OTRS); user management; login service; provision of virtual servers; storage management (backup service, file service and storage systems); license service; peripherals service; PC pools; security (information security, early-warning system (FWS) in the TU Dresden data network, VPN, and a concept for centrally provided virtual firewalls).
    4. Services for decentralized DV systems: general; PC support (investment consulting, implementation, maintenance); Microsoft Windows support; central software procurement for TU Dresden (the ZKI software working group, the software-deployment strategy at TU Dresden, software procurement).
    5. High performance computing: the high-performance computer/storage complex (HRSK) with core router, SGI Altix 4700, petabyte tape archive, Linux Networx PC farm, and Linux Networx PC cluster (HRSK stage 1a); usage overview of the HPC servers; special resources (SGI Origin 3800, NEC SX-6, Microsoft HPC system, user clusters); grid resources; application software; visualization; parallel programming tools.
    6. Scientific projects and cooperations: the Competence Center for Video Conferencing Services (VCCIV), covering video conferencing rooms, tasks and development work, further activities, the "DFNVideoConference" multipoint conferencing service in the G-WiN, and an outlook; D-Grid projects (High Energy Physics Community Grid (HEP CG), developing applications and components for data analysis in high energy physics in a national e-science environment; D-Grid Integration Project 2; Chemomentum; D-Grid Scheduler Interoperability (DGSI); MoSGrid (Molecular Simulation Grid); WisNetGrid (knowledge networks in the grid)); biology projects (an SME-friendly breeding program for corals; stochastic interacting many-particle models of biological cell interaction; EndoSys, modeling the role of Rab domains in endocytosis and signal processing in hepatocytes; SpaceSys, spatio-temporal dynamics in systems biology; Biologistik, from bio-inspired logistics to logistics-inspired bio-nano engineering; ZebraSim, modeling and simulation of muscle-tissue formation in zebrafish); performance evaluation (SFB 609 subproject A1, numerical modeling of turbulent MFD flows in metallurgy, crystal growth, and electrochemistry; BenchIT, performance measurement for scientific applications; ParMA, parallel programming for multi-core architectures; the Virtual Institute for High Productivity Supercomputing (VI-HPS); a parallel coupling framework and modern time-integration methods for detailed cloud processes in atmospheric models; VEKTRA, virtual development of ceramic and composite materials with tailored transport properties; Cool Computing, technologies for energy-efficient computing platforms (BMBF leading-edge cluster Cool Silicon); eeClust, energy-efficient cluster computing; HI/CFD, highly efficient implementation of CFD codes for HPC many-core architectures; SILC, a scalable infrastructure for automatic performance analysis of parallel codes; TIMaCS, tools for intelligent system management of very large computing systems); further cooperations.
    7. DoIT, integrated information management: the TU Dresden vision; project goals (analysis of the existing IT support of the organization and its processes, improvement proposals, strategic decisions, planning and execution of subprojects, market and vendor analysis, exchange with other universities); project organization; identity management; electronic cost-center access (ElKo).
    8. Vocational training and internships: training as an IT specialist specializing in application development; internships.
    9. Education and training events. 10. Events. 11. Publications.
    Part III, reports of the faculties: Mathematics and Natural Sciences (mathematics, physics, chemistry and food chemistry, psychology, biology); Philosophical Faculty; Linguistics, Literature and Cultural Studies; Education; Law; Business and Economics; Computer Science; Electrical and Information Engineering; Mechanical Engineering; Civil Engineering; Architecture; Transportation Sciences "Friedrich List"; Forest, Geo and Hydro Sciences (forestry, geosciences, water sciences); Medicine Carl Gustav Carus.