8 research outputs found

Arquitecturas multiprocesador en cómputo de altas prestaciones: software de base, métricas y aplicaciones

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura Cristina
Denham Mónica Malén
Frati Fernando Emmanuel
Iglesias Luciano
Montezanti Diego Miguel
Méndez Mariano
Naiouf Marcelo
Pousa Adrián
Rodriguez Eguren Sebastián
Rodriguez Ismael Pablo
Tinetti Fernando Gustavo
Villagarcía Wanza Horacio A.
Publication date: 18/11/2014

To characterize distributed multiprocessor architectures, focusing especially on cluster and cloud computing and emphasizing those that use multi-core processors (multicores and GPUs), with the aim of modeling them, studying their scalability, analyzing and predicting the performance of parallel applications, and developing fault-tolerance schemes for them. To deepen the study of GPU-based architectures and their comparison with multicore clusters, as well as the combined use of GPUs and multicores in high-performance computers; in particular, to study performance on “hybrid” clusters. To analyze energy efficiency in these parallel architectures, considering the impact of the architecture, the operating system, the programming model, and the specific algorithm. To analyze and develop system software for clusters of multicores and GPUs, seeking to optimize performance.
In 2013, new lines of interest were added:
- The development of applications on the cloud, in particular Big Data applications in the cloud.
- The use of processor hardware registers to make various decisions at run time.
- The development of tools for transforming legacy code, seeking to optimize it for parallel architectures.
It should be noted that this project is coordinated with other ongoing projects at III-LIDI related to Parallel Algorithms, Distributed Systems, and Real-Time Systems.
Eje: Procesamiento Distribuido y Paralelo. Red de Universidades con Carreras en Informática (RedUNCI

Arquitecturas multiprocesador en cómputo de altas prestaciones: software de base, métricas y aplicaciones

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura Cristina
Denham Mónica Malén
Frati Fernando Emmanuel
Iglesias Luciano
Montezanti Diego Miguel
Méndez Mariano
Naiouf Marcelo
Pousa Adrián
Rodriguez Eguren Sebastián
Rodriguez Ismael Pablo
Tinetti Fernando Gustavo
Villagarcía Wanza Horacio A.
Publication date: 01/05/2014

Centro de Servicios en Gestión de Información

Servicio de Difusión de la Creación Intelectual

Arquitecturas multiprocesador en cómputo de altas prestaciones: software de base, métricas y aplicaciones

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura
Denham Mónica
Frati Fernando Emmanuel
Iglesias Luciano
Montezanti Diego
Méndez Mariano
Naiouf Marcelo
Pousa Adrián
Rodriguez Eguren Sebastián
Rodríguez Ismael Pablo
Tinetti Fernando Gustavo
Villagarcía Wanza Horacio Alfredo
Publication date: 01/05/2014

To characterize distributed multiprocessor architectures, focusing especially on cluster and cloud computing and emphasizing those that use multi-core processors (multicores and GPUs), with the aim of modeling them, studying their scalability, analyzing and predicting the performance of parallel applications, and developing fault-tolerance schemes for them.
To deepen the study of GPU-based architectures and their comparison with multicore clusters, as well as the combined use of GPUs and multicores in high-performance computers; in particular, to study performance on “hybrid” clusters.
To analyze energy efficiency in these parallel architectures, considering the impact of the architecture, the operating system, the programming model, and the specific algorithm. To analyze and develop system software for clusters of multicores and GPUs, seeking to optimize performance.
In 2013, new lines of interest were added:
- The development of applications on the cloud, in particular Big Data applications in the cloud.
- The use of processor hardware registers to make various decisions at run time.
- The development of tools for transforming legacy code, seeking to optimize it for parallel architectures.
It should be noted that this project is coordinated with other ongoing projects at III-LIDI related to Parallel Algorithms, Distributed Systems, and Real-Time Systems.
Eje: Procesamiento Distribuido y Paralel

Centro de Servicios en Gestión de Información

Arquitecturas multiprocesador en cómputo de altas prestaciones: software de base, métricas y aplicaciones

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura Cristina
Denham Mónica Malén
Frati Fernando Emmanuel
Iglesias Luciano
Montezanti Diego Miguel
Méndez Mariano
Naiouf Marcelo
Pousa Adrián
Rodriguez Eguren Sebastián
Rodriguez Ismael Pablo
Tinetti Fernando Gustavo
Villagarcía Wanza Horacio A.
Publication date: 01/05/2014

Providing Insight into the Performance of Distributed Applications Through Low-Level Metrics

Author: Eberius David
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/08/2020

The field of high-performance computing (HPC) has always dealt with the bleeding edge of computational hardware and software to achieve the maximum possible performance for a wide variety of workloads. When dealing with brand new technologies, it can be difficult to understand how these technologies work and why they work the way they do. One of the more prevalent approaches to providing insight into modern hardware and software is to provide tools that allow developers to access low-level metrics about their performance. The modern HPC ecosystem supports a wide array of technologies, but in this work, I will be focusing on two particularly influential technologies: The Message Passing Interface (MPI), and Graphical Processing Units (GPUs).
For many years, MPI has been the dominant programming paradigm in HPC. Indeed, over 90% of applications that are a part of the U.S. Exascale Computing Project plan to use MPI in some fashion. The MPI Standard provides programmers with a wide variety of methods to communicate between processes, along with several other capabilities. The high-level MPI Profiling Interface has been the primary method for profiling MPI applications since the inception of the MPI Standard, and more recently the low-level MPI Tool Information Interface was introduced.
Accelerators like GPUs have been increasingly adopted as the primary computational workhorse for modern supercomputers. GPUs provide more parallelism than traditional CPUs through a hierarchical grid of lightweight processing cores. NVIDIA provides profiling tools for their GPUs that give access to low-level hardware metrics.
In this work, I propose research in applying low-level metrics to both the MPI and GPU paradigms in the form of an implementation of low-level metrics for MPI, and a new method for analyzing GPU load imbalance with a synthetic efficiency metric. I introduce Software-based Performance Counters (SPCs) to expose internal metrics of the Open MPI implementation along with a new interface for exposing these counters to users and tool developers. I also analyze a modified load imbalance formula for GPU-based applications that uses low-level hardware metrics provided through nvprof in a hierarchical approach to take the internal load imbalance of the GPU into account.

University of Tennessee, Knoxville: Trace

Trace-based Performance Analysis for Hardware Accelerators

Author: Juckeland Guido
Publication date: 05/02/2013

This thesis presents how performance data from hardware accelerators can be included in event logs. It extends the capabilities of trace-based performance analysis to also monitor and record data from this novel parallelization layer. The increasing awareness of the power consumption of computing devices has led to an interest in hybrid computing architectures as well. High-end computers, workstations, and mobile devices are starting to employ hardware accelerators to offload computationally intense and parallel tasks, while at the same time retaining a highly efficient scalar compute unit for non-parallel tasks. This execution pattern is typically asynchronous so that the scalar unit can resume other work while the hardware accelerator is busy. Performance analysis tools provided by the hardware accelerator vendors cover the situation of one host using one device very well. Yet, they do not address the needs of the high performance computing community. This thesis investigates ways to extend existing methods for recording events from highly parallel applications to also cover scenarios in which hardware accelerators aid these applications. After introducing a generic approach that is suitable for any API-based acceleration paradigm, the thesis derives a suggestion for a generic performance API for hardware accelerators and its implementation with NVIDIA CUPTI. In a next step, the visualization of event logs containing data from execution streams on different levels of parallelism is discussed. In order to overcome the limitations of classic performance profiles and timeline displays, a graph-based visualization using Parallel Performance Flow Graphs (PPFGs) is introduced. This novel technical approach uses program states to display similarities and differences between the potentially very large number of event streams and thus enables a fast way to spot load imbalances. The thesis concludes with an in-depth case study of PIConGPU, a highly parallel, multi-hybrid plasma physics simulation that benefited greatly from the developed performance analysis methods.
This dissertation shows how the execution of application parts that have been offloaded to hardware accelerators can be recorded as part of a program trace. The established technique of analyzing application performance by means of program traces is thereby extended so that this new level of parallelism is captured as well. Constraints on the electrical power consumption of computer systems have led to a growing number of hybrid computer architectures. High-performance computers as well as workstations and mobile devices now use hardware accelerators to offload compute-intensive, parallel program parts, relieving the scalar main processor so that it is used only for non-parallel program parts. This execution scheme is typically asynchronous: the scalar processor can continue working while the hardware accelerator computes. The performance analysis tools from the hardware accelerator vendors cover the standard case (one host system with one hardware accelerator) very well, but fail to support highly parallel computer systems. This dissertation investigates to what extent multi-hybrid applications can also record the activity of hardware accelerators. To this end, the existing method for generating program traces for highly parallel applications is extended accordingly. The investigation first develops a general methodology with which a program trace can be created for any API-based hardware acceleration. Building on this, a dedicated programming interface is developed that makes it possible to record further performance-relevant data. The implementation of this interface is presented using NVIDIA CUPTI as an example. Another part of the work deals with the visualization of program traces that contain recordings from the different levels of parallelism. To overcome the limitations of classic performance profiles and timeline displays, a new graph-based form of visualization is introduced with Parallel Performance Flow Graphs (PPFGs). This novel approach shows a program trace as a sequence of program states with shared and differing paths, so that diverging program behavior and load imbalances can be located much more easily. The work concludes with a detailed analysis of PIConGPU, a multi-hybrid plasma physics simulation, which benefited greatly from the analysis capabilities developed in this work.

Technische Universität Dresden: Qucosa

Jahresbericht 2011 zur kooperativen DV-Versorgung

Publication venue: Technische Universität Dresden
Publication date: 01/07/2013

Foreword
Overview of Advertisers
Part I
  The Work of the DV Commission
  Members of the DV Commission
  The Work of the IT Steering Committee
  The Work of the Scientific Advisory Board of the ZIH
Part II
1 The Center for Information Services and High Performance Computing (ZIH)
  1.1 Tasks
  1.2 Facts and Figures (Representative Selection)
  1.3 Budget
  1.4 Structure / Staff
  1.5 Location
  1.6 Committee Work
2 Communication Infrastructure
  2.1 Usage Overview of Network Services
    2.1.1 WiN IP Traffic
  2.2 Network Infrastructure
    2.2.1 General Supply Structure
    2.2.2 Network Levels
    2.2.3 Backbone and Local Networking
    2.2.4 Printer-Copier Network
    2.2.5 Wireless Local Area Network (WLAN)
    2.2.6 Data Network between the University Sites and External Connectivity
    2.2.7 Contract "Communication Links of the Saxon Universities"
    2.2.8 Data Network to the Student Residence Sites
  2.3 Communication and Information Services
    2.3.1 Electronic Mail
    2.3.2 Groupware
    2.3.3 Authentication and Authorization Infrastructure (AAI)
    2.3.4 Dial-in Access
    2.3.5 Voice Services: ISDN and VoIP
    2.3.6 Communication Cable Routes and Clock Network
    2.3.7 Time Service
3 Central Services and Servers
  3.1 User Help Desk (BB)
  3.2 Trouble Ticket System (OTRS)
  3.3 User Management
  3.4 Login Service
  3.5 Provision of Virtual Servers
  3.6 Storage Management
    3.6.1 Backup Service
    3.6.2 File Service and Storage Systems
  3.7 License Service
  3.8 Peripherals Service
  3.9 PC Pools
  3.10 Security
    3.10.1 Information Security
    3.10.2 Early Warning System (FWS) in the TU Dresden Data Network
    3.10.3 VPN
    3.10.4 Concept of Centrally Provided Virtual Firewalls
    3.10.5 Network Concept for Workstations with Dynamic Port Assignment per IEEE 802.1x (DyPort)
4 Services for Decentralized IT Systems
  4.1 General
  4.2 PC Support
    4.2.1 Investment Consulting
    4.2.2 Implementation
    4.2.3 Maintenance
  4.3 Microsoft Windows Support
  4.4 Central Software Procurement for TU Dresden
    4.4.1 Software Procurement Strategy
    4.4.2 Working Group Activities
    4.4.3 Software Procurement
    4.4.4 User Consultations
    4.4.5 Software Presentations
5 High Performance Computing
  5.1 High Performance Computer/Storage Complex (HRSK)
    5.1.1 HRSK Core Router
    5.1.2 HRSK SGI Altix 4700
    5.1.3 HRSK Petabyte Tape Archive
    5.1.4 HRSK Linux Networx PC Farm
    5.1.5 Global Home File Systems for HRSK
  5.2 Usage Overview of the HPC Servers
  5.3 Special Resources
    5.3.1 NEC SX-6
    5.3.2 Microsoft HPC System
    5.3.3 User Cluster Triton
    5.3.4 GPU Cluster
  5.4 Grid Resources
  5.5 Application Software
  5.6 Visualization
  5.7 Parallel Programming Tools
6 Scientific Projects, Cooperations
  6.1 "Competence Center for Video Conferencing Services" (VCCIV)
    6.1.1 Overview
    6.1.2 Video Conference Rooms
    6.1.3 Tasks and Development Work
    6.1.4 Further Activities
    6.1.5 The "DFNVideoConference" Service: Multipoint Conferences in the G-WiN
    6.1.6 Trends and Outlook
  6.2 D-Grid
    6.2.1 D-Grid Scheduler Interoperability (DGSI)
    6.2.2 EMI: European Middleware Initiative
    6.2.3 MoSGrid: Molecular Simulation Grid
    6.2.4 WisNetGrid: Knowledge Networks in the Grid
    6.2.5 GeneCloud: Cloud Computing in Drug Development for Small and Medium-Sized Enterprises
    6.2.6 FutureGrid: An Experimental High-Performance Grid Testbed
  6.3 Biology
    6.3.1 Development and Analysis of Stochastic Interacting Many-Particle Models for Biological Cell Interaction
    6.3.2 SpaceSys: Spatio-Temporal Dynamics in Systems Biology
    6.3.3 Biologistik: From Bio-Inspired Logistics to Logistics-Inspired Bio-Nano-Engineering
    6.3.4 ZebraSim: Modeling and Simulation of Muscle Tissue Formation in Zebrafish
    6.3.5 SFB Transregio 79: Materials Development for Hard Tissue Regeneration in Healthy and Systemically Diseased Bone
    6.3.6 Virtuelle Leber (Virtual Liver): Spatio-Temporal Mathematical Models for Investigating Hepatocyte Polarity and Its Role in Liver Tissue Development
    6.3.7 GrowReg: Growth Regulation and Pattern Formation in Regeneration
  6.4 Performance Evaluation
    6.4.1 SFB 609: Electromagnetic Flow Control in Metallurgy, Crystal Growth and Electrochemistry, Subproject A1: Numerical Modeling of Turbulent MFD Flows
    6.4.2 SFB 912: Highly Adaptive Energy-Efficient Computing (HAEC), Subproject A04: Application Analysis on Low-Energy HPC Systems - Low Energy Computer
    6.4.3 BenchIT: Performance Measurement for Scientific Applications
    6.4.4 VI-HPS: Virtual Institute - HPS
    6.4.5 Cool Computing: Technologies for Energy-Efficient Computing Platforms (BMBF Leading-Edge Cluster Cool Silicon)
    6.4.6 eeClust: Energy-Efficient Cluster Computing
    6.4.7 GASPI: Global Address Space Programming
    6.4.8 HI-CFD: Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures
    6.4.9 SILC: Scalable Infrastructure for the Automatic Performance Analysis of Parallel Codes
    6.4.10 LMAC: Performance Dynamics of Massively Parallel Codes
    6.4.11 TIMaCS: Tools for Intelligent System Management of Very Large Computing Systems
    6.4.12 H4H: Optimise HPC Applications on Heterogeneous Architectures
    6.4.13 HOPSA: HOlistic Performance System Analysis
    6.4.14 CRESTA: Collaborative Research into Exascale Systemware, Tools and Application
  6.5 Data-Intensive Computing
    6.5.1 Radieschen: Framework Conditions for a Cross-Disciplinary Research Data Infrastructure
    6.5.2 SIOX: Scalable I/O for Extreme Performance
    6.5.3 HPC-FLiS: HPC Framework for Solving Inverse Scattering Problems on Structured Grids Using Many-Core Systems, with Application to 3D Imaging Methods
    6.5.4 NGSgoesHPC: Scalable HPC Solutions for Efficient Genome Analysis
  6.6 Cooperations
    6.6.1 100-Gigabit Testbed Dresden/Freiberg
    6.6.2 Center of Excellence of TU Dresden and TU Bergakademie Freiberg
7 DOIT: Integrated Information Management
  7.1 Identity Management
  7.2 Cooperation with Leipzig University
  7.3 Procurement Procedure
  7.4 Introduction Project
  7.5 Interim Solution: Directory Service
  7.5 Contact
8 TUDO: Optimizing TU Dresden
  8.1 Structure of the TUDO Project
  8.2 Schedule of the TUDO Project
  8.3 Key Results of the TUDO Project
9 Vocational Training and Internships
  9.1 Training as IT Specialist (Fachinformatiker), Specialization in Application Development
  9.2 Internships
10 Training and Continuing Education Events
11 Events
12 Publications
Part III
Faculty of Mathematics and Natural Sciences
  Department of Mathematics
  Department of Physics
  Department of Chemistry and Food Chemistry
  Department of Psychology
  Department of Biology
Faculty of Philosophy
Faculty of Linguistics, Literature and Cultural Studies
Faculty of Education
Faculty of Law
Faculty of Business and Economics
Faculty of Computer Science
Faculty of Civil Engineering
Faculty of Architecture
"Friedrich List" Faculty of Transportation and Traffic Sciences
Faculty of Forest, Geo and Hydro Sciences
  Department of Forest Sciences
  Department of Geosciences
Carl Gustav Carus Faculty of Medicine
Botanical Garden

Technische Universität Dresden: Qucosa