401 research outputs found

    Proactive Interference-aware Resource Management in Deep Learning Training Cluster

    Deep Learning (DL) applications are growing at an unprecedented rate across many domains, ranging from weather prediction and map navigation to medical imaging. However, training these deep learning models in large-scale compute clusters faces substantial challenges in terms of low cluster resource utilisation and high job waiting time. State-of-the-art DL cluster resource managers are needed to increase GPU utilisation and maximise throughput. While co-locating DL jobs within the same GPU has been shown to be an effective means towards achieving this, co-location incurs performance interference that slows jobs down. We argue that effective workload placement can minimise DL cluster interference at scheduling runtime by understanding the DL workload characteristics and their respective hardware resource consumption. However, existing DL cluster resource managers reserve isolated GPUs to perform online profiling, directly measuring GPU utilisation and kernel patterns for each unique submitted job. Such a feedback-based reactive approach results in additional waiting times as well as reduced cluster resource efficiency and availability. In this thesis, we propose Horus: an interference-aware and prediction-based DL cluster resource manager. By empirically studying a series of microbenchmarks and DL workload co-location combinations across heterogeneous GPU hardware, we demonstrate the negative effects of performance interference when co-locating DL workloads, and identify GPU utilisation as a general proxy metric for making good placement decisions. From these findings, we design Horus, which, in contrast to existing approaches, proactively predicts the GPU utilisation of heterogeneous DL workloads, extrapolated from the DL model computation graph features, when performing placement decisions, removing the need for online profiling and isolated reserved GPUs. Through empirical experimentation within a medium-scale DL cluster as well as a large-scale trace-driven simulation of a production system, we demonstrate that Horus improves cluster GPU utilisation, reduces cluster makespan and waiting time, and scales to operate across hundreds of machines
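    The placement idea can be made concrete with a short sketch. The following is a minimal Python illustration of utilisation-prediction-based co-location; the `predict_gpu_utilisation` stand-in, the job/node structures, and the saturation cap are assumptions for illustration, not the thesis implementation (Horus derives its predictions from computation-graph features such as FLOPs and layer types):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    name: str
    jobs: List[float] = field(default_factory=list)  # predicted utilisation of each co-located job

def predict_gpu_utilisation(job: dict) -> float:
    # Hypothetical stand-in for Horus's regressor over computation-graph
    # features; here we simply read a pre-computed estimate from the job.
    return job["predicted_util"]

def place(job: dict, nodes: List[Node], cap: float = 1.0) -> Optional[Node]:
    """Interference-aware placement: co-locate on the GPU with the lowest
    predicted total utilisation, refusing placements that exceed `cap`."""
    util = predict_gpu_utilisation(job)
    feasible = [n for n in nodes if sum(n.jobs) + util <= cap]
    if not feasible:
        return None  # queue the job rather than over-commit a GPU
    best = min(feasible, key=lambda n: sum(n.jobs))
    best.jobs.append(util)
    return best

nodes = [Node("gpu-0"), Node("gpu-1")]
print(place({"predicted_util": 0.6}, nodes).name)  # gpu-0
print(place({"predicted_util": 0.5}, nodes).name)  # gpu-1 (gpu-0 would exceed the cap)
```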

    Towards Cooperative MARL in Industrial Domains


    Exploration of Villarrica Geothermal System using Geophysical and Geochemical Techniques [Final Version]

    For the future global energy mix, the International Energy Agency (IEA) forecasts a substantial contribution from geothermal energy: a base-load-capable, decentralised, and permanently available energy source that is expected to help replace fossil fuels. At present, the development of geothermal resources concentrates mainly on conventional high-enthalpy resources, which often occur in connection with volcanism or magmatism at active continental margins or with rifting processes. The active continental margins surrounding the Pacific (the so-called Pacific Ring of Fire) are exploited geothermally by many of the bordering countries; only in the Andean region have no notable geothermal resources been developed so far. After commissioning its first geothermal power plant, Chile has begun to develop its geothermal potential systematically. To ensure a sustainable energy supply, low/medium-enthalpy reservoirs are to be exploited alongside high-enthalpy deposits. Globally, low/medium-enthalpy reservoirs are often bound to major fault systems or to geothermally suitable rock formations. Locating and characterising such reservoirs requires an adapted exploration strategy, since classical geothermal exploration is designed for high-enthalpy resources. This doctoral thesis develops an exploration strategy for low/medium-enthalpy geothermal reservoirs in Chile. The geothermal system at Villarrica volcano was chosen as the study site because the exploration strategy can be tested on the characterisation of a complex fault-zone system and a striking lithological contrast. To quantify both the reservoir geometry and the reservoir processes, an interdisciplinary approach was chosen that couples geochemical and geophysical methods.
    Fault-zone systems are of paramount importance for the development of the geothermal circulation system and thus for the formation of the reservoir. The study site is characterised by the intersection of two supra-regional fault zones, the Liquiñe-Ofqui Fault System (LOFS) and the Mocha-Villarrica Fault Zone (MVFZ), which are investigated with geophysical methods. High-resolution magnetotelluric measurements identify both fault zones through reduced electrical resistivities, a reduction caused by conductive deep geothermal waters and/or hydrothermal alteration products. For the MVFZ, the surveys reveal a northward-dipping fault zone that connects to a zone of elevated electrical conductivity in the middle crust; the outcrop of the fault zone coincides with the position of the Villarrica-Quetrupillán-Lanín volcanic chain. The LOFS appears as a vertical zone of elevated conductivity extending from the surface down to the brittle-ductile transition. A possible penetration into the ductile regime, with a potential connection to an existing zone of elevated conductivity in the middle crust, is masked by the elevated conductivities of the ductile crust. Gravimetric measurements were carried out parallel to the MT profiles; the LOFS is marked by a pronounced negative Bouguer anomaly that coincides spatially with the elevated conductivities. Applying Butterworth filters in combination with gravimetric modelling allows the fault-zone geometry to be determined and the density contrast to be quantified. In a joint interpretation of the magnetotelluric and gravimetric data, the properties of the LOFS in terms of clay-mineral content and porosity can be calculated in order to estimate the permeability of the fault zone.
    Geochemical methods are used to characterise the reservoir processes. To this end, the thermal springs are used as windows into the subsurface to determine the origin and genesis of the thermal waters. The thermal waters are shown to be of meteoric origin and to evolve through intensive reaction with crystalline rock; despite the proximity to active volcanoes, no substantial influence of magmatic fluids or gases can be detected. The water-rock interactions are then investigated in a comparative study of the thermal waters and potential reservoir rocks. The striking lithological contrast between the plutonic rocks of the North Patagonian Batholith (NPB) and the volcano-clastic rocks of the Cura-Mallín Formation is traced by strontium isotope analysis. Analysis of CFC species and of oxygen isotopes in the SO4-H2O system shows that different fluid circulation systems operate in the two formations: in the NPB, thermal water circulation is concentrated along major fault zones, whereas the Cura-Mallín Formation hosts a more branched fluid circulation. The analysis of the different CFC species quantifies the mixing processes in the subsurface and can thus be used to determine the in-situ composition of the thermal waters, which in turn permits an accurate determination of the reservoir conditions and the geothermal potential. The Villarrica geothermal system has an elevated geothermal potential: thermal water ascending along the major fault zones forms reservoirs at exploitable depths, and the Cura-Mallín Formation, with its branched flow field, is particularly suitable as a subsurface heat exchanger. Maximum reservoir temperatures of 140-180 °C would be suitable, for example, for supplying heat to the town of Pucón via a district heating system
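    The CFC-based mixing correction works because the deep thermal end-member is CFC-free, while the admixed shallow groundwater carries the modern atmospheric CFC signal, so the measured CFC level fixes the dilution factor. A toy Python sketch of that two-component mass balance (the function name and example values are illustrative assumptions, not taken from the thesis):

```python
def unmix(measured_cfc, shallow_cfc, measured_conc, shallow_conc):
    """Recover the in-situ concentration of a conservative species in the
    thermal end-member, assuming the deep water is CFC-free and the sample
    is a two-component mixture of deep thermal and shallow modern water."""
    f_shallow = measured_cfc / shallow_cfc  # fraction of shallow water in the sample
    f_deep = 1.0 - f_shallow
    return (measured_conc - f_shallow * shallow_conc) / f_deep

# e.g. chloride in mg/L: sampled spring vs. local shallow groundwater
print(unmix(measured_cfc=0.8, shallow_cfc=2.0, measured_conc=150.0, shallow_conc=10.0))
```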

    Novel parallel approaches to efficiently solve spatial problems on heterogeneous CPU-GPU systems

    In recent years, approaches that seek to extract valuable information from large datasets have become particularly relevant in today's society. In this category, we can highlight those problems that comprise data analysis distributed across two-dimensional scenarios, called spatial problems. These usually involve processing (i) a series of features distributed across a given plane or (ii) a matrix of values where each cell corresponds to a point on the plane. Spatial problems are therefore open-ended and complex, which also leaves room for imagination in the search for new solutions. One of their main complications is that they are very computationally intensive, typically taking a long time to produce the desired result. This drawback is also an opportunity to use heterogeneous systems to address spatial problems more efficiently. Heterogeneous systems give the developer greater freedom to speed up suitable algorithms by increasing the parallel programming options available, making it possible for different parts of a program to run on the dedicated hardware that suits them best. Several spatial problems that have not yet been optimised for heterogeneous systems cover very diverse areas that seem vastly different at first sight; however, they are closely related by common data processing requirements, which makes them suitable for dedicated hardware. In particular, this thesis provides new parallel approaches to tackle three crucial spatial problems: latent fingerprint identification, total viewshed computation, and path planning based on maximising visibility in large regions.
    Latent fingerprint identification is one of the essential identification procedures in criminal investigations. Addressing this task is difficult because (i) it requires analysing large databases in a short time, and (ii) it is commonly addressed by combining different methods with complex data dependencies, making it challenging to exploit parallelism on heterogeneous CPU-GPU systems. Moreover, most efforts in this context focus on improving the accuracy of the approaches and neglect the processing time; indeed, the most accurate algorithm was designed to process fingerprints using a single thread. We developed a new methodology to address the latent fingerprint identification problem, called “Asynchronous processing for Latent Fingerprint Identification” (ALFI), that speeds up processing while maintaining high accuracy. ALFI exploits all the resources of CPU-GPU systems, using asynchronous processing and fine- and coarse-grained parallelism to analyse massive fingerprint databases. We assessed the performance of ALFI on Linux and Windows operating systems using the well-known NIST/FVC databases. Experimental results revealed that ALFI is on average 22x faster than the state-of-the-art identification algorithm, reaching a speed-up of 44.7x in the best-studied case.
    In terrain analysis, Digital Elevation Models (DEMs) are relevant datasets used as input to algorithms that typically sweep the terrain to analyse its main topological features, such as visibility, elevation, and slope. The most challenging computation in this area is the total viewshed problem, which involves computing the viewshed (the visible area of the terrain) for each of the points in the DEM. The algorithms intended to solve this problem require many memory accesses to 2D arrays which, despite being regular, lead to poor data locality in memory. We proposed a methodology called “skewed Digital Elevation Model” (sDEM), which substantially improves the locality of memory accesses and exploits the inherent parallelism of rotational sweep-based algorithms. In particular, sDEM applies a data relocation technique before accessing memory and computing the viewshed, thus significantly reducing the execution time. Different implementations are provided for single-core, multi-core, single-GPU, and multi-GPU platforms. We carried out two experiments to compare sDEM with (i) the most widely used geographic information system (GIS) software and (ii) the state-of-the-art algorithm for solving the total viewshed problem. In the first experiment, sDEM is on average 8.8x faster than current GIS software, despite considering only a few points because of the limitations of the GIS software. In the second experiment, sDEM is 827.3x faster than the state-of-the-art algorithm in the best case.
    The use of Unmanned Aerial Vehicles (UAVs) with multiple onboard sensors has grown enormously in tasks involving terrain coverage, such as environmental and civil monitoring, disaster management, and forest fire fighting. Many of these tasks require a quick and early response, which makes maximising the land covered from the flight path an essential goal, especially when the area to be monitored is irregular, large, and includes many blind spots. In this regard, state-of-the-art total viewshed algorithms can help analyse large areas and find new paths providing all-round visibility. We designed a new heuristic called “Visibility-based Path Planning” (VPP) to solve the path planning problem in large areas based on a thorough visibility analysis. VPP generates flyable paths that provide high visual coverage for monitoring forest regions using the onboard camera of a single UAV. For this purpose, the hidden areas of the target territory are identified and considered when generating the path. Simulation results showed that VPP covers up to 98.7% of the Montes de Malaga Natural Park and 94.5% of the Sierra de las Nieves National Park, both located in the province of Malaga (Spain). In addition, a real flight test confirmed the high visibility achieved using VPP. Our methodology and analysis can be easily applied to enhance monitoring in other large outdoor areas
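    The asynchronous CPU-GPU overlap at the heart of ALFI can be sketched schematically. The toy Python below shows only the pipelining pattern (launch the next batch while consolidating the previous one); the similarity function, batch sizes, and data layout are illustrative assumptions, and a real implementation would issue asynchronous CUDA work rather than threads:

```python
from concurrent.futures import ThreadPoolExecutor

def gpu_match(latent, batch):
    # Stand-in for the device-side minutiae-matching kernel.
    # Hypothetical similarity: size of the feature overlap.
    return [(tid, len(latent & feats)) for tid, feats in batch]

def identify(latent, database, batch_size=2):
    """Overlap 'GPU' matching of batch i+1 with host-side consolidation of batch i."""
    ranking = []
    batches = [database[i:i + batch_size] for i in range(0, len(database), batch_size)]
    with ThreadPoolExecutor(max_workers=1) as gpu:
        future = gpu.submit(gpu_match, latent, batches[0])
        for nxt in batches[1:]:
            scores = future.result()                     # wait for batch i
            future = gpu.submit(gpu_match, latent, nxt)  # launch batch i+1
            ranking.extend(scores)                       # CPU merges meanwhile
        ranking.extend(future.result())
    return sorted(ranking, key=lambda s: -s[1])

db = [("id0", {1, 2, 3}), ("id1", {2, 3, 4}), ("id2", {9}), ("id3", {1, 2})]
print(identify({1, 2, 3}, db))  # id0 ranks first
```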
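    Likewise, the data relocation behind sDEM can be illustrated as a skewing step that aligns each rotational sweep direction with matrix rows, so line-of-sight scans walk contiguous memory. A minimal NumPy sketch, assuming a simple per-row shear (the shift rule and names are illustrative, not the thesis code):

```python
import numpy as np

def skew_dem(dem: np.ndarray, angle_deg: float) -> np.ndarray:
    """Shift each row so that lines of sight at `angle_deg` become
    (approximately) aligned with matrix rows, improving locality."""
    rows, cols = dem.shape
    shear = np.tan(np.radians(angle_deg))
    skewed = np.zeros((rows, cols + int(abs(shear) * rows) + 1), dtype=dem.dtype)
    for r in range(rows):
        # Offset grows along the sweep direction; mirror it for negative angles.
        offset = int(round(r * shear)) if shear >= 0 else int(round((rows - 1 - r) * -shear))
        skewed[r, offset:offset + cols] = dem[r]
    return skewed

# After skewing, each line-of-sight test along the viewing direction
# becomes a cache-friendly scan over one contiguous row.
dem = np.arange(16, dtype=np.float32).reshape(4, 4)
print(skew_dem(dem, 30.0))
```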

    40th Rocky Mountain Conference on Analytical Chemistry

    Final program, abstracts, and information about the 40th annual meeting of the Rocky Mountain Conference on Analytical Chemistry, co-sponsored by the Colorado Section of the American Chemical Society and the Rocky Mountain Section of the Society for Applied Spectroscopy. Held in Denver, Colorado, July 25 - August 1, 1998

    Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) Krakow, Poland

    Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015

    Multi-Fidelity Gaussian Process Emulation And Its Application In The Study Of Tsunami Risk Modelling

    Investigating uncertainties in computer simulations can be prohibitive in terms of computational cost, since the simulator needs to be run over a large number of input values. Building a statistical surrogate model of the simulator, using a small design of experiments, greatly alleviates the computational burden of such investigations. Nevertheless, this can still exceed the computational budget of many studies. We present a novel method that combines both approaches: the multilevel adaptive sequential design of computer experiments (MLASCE) in the framework of Gaussian process (GP) emulators. MLASCE is based on two major approaches: efficient design of experiments, such as sequential designs, and combining training data of different degrees of sophistication in a so-called multi-fidelity method (or multilevel method when these fidelities are ordered, typically by increasing resolution). This dual strategy allows us to allocate limited computational resources efficiently across simulations of different fidelity levels while building the GP emulator. The allocation of computational resources is shown to be the solution of a simple optimization problem in a special case where we theoretically prove the validity of our approach. MLASCE is compared with other existing models of multi-fidelity Gaussian process emulation, and gains of orders of magnitude in accuracy for medium-size computing budgets are demonstrated in numerical examples. MLASCE should be useful for computer experiments in natural disaster risk modelling, and more than a mere tool for calculating the scale of natural disasters. To show that MLASCE meets this expectation, we propose the first end-to-end example of a risk model for household asset loss due to a possible future tsunami. Within this framework, MLASCE provides a reliable statistical surrogate for realistic tsunami risk assessment under restricted computational resources, delivering accurate and near-instant predictions of future tsunami risks
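    As background for the multi-fidelity idea, the sketch below implements the classic two-level autoregressive scheme (high fidelity approximated as rho times the low-fidelity emulator plus a discrepancy GP) that multi-fidelity emulation builds upon; it is not MLASCE's adaptive sequential design, and the toy simulators, lengthscales, and sample sizes are illustrative assumptions:

```python
import numpy as np

def rbf(a, b, ls=0.15):
    # Squared-exponential kernel between two 1-D point sets.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_mean(x, y, xs, jitter=1e-6):
    # Posterior mean of a zero-mean GP with an RBF kernel.
    K = rbf(x, x) + jitter * np.eye(len(x))
    return rbf(xs, x) @ np.linalg.solve(K, y)

# Toy simulators: a cheap low-fidelity model and an expensive, correlated truth.
f_lo = lambda x: np.sin(8 * x)
f_hi = lambda x: 1.2 * np.sin(8 * x) + 0.3 * x

x_lo = np.linspace(0, 1, 25)   # many cheap runs
x_hi = np.linspace(0, 1, 5)    # few expensive runs
xs = np.linspace(0, 1, 101)    # prediction grid

# Low-fidelity emulator, evaluated at the high-fidelity sites and on the grid.
m_lo_at_hi = gp_mean(x_lo, f_lo(x_lo), x_hi)
m_lo_at_xs = gp_mean(x_lo, f_lo(x_lo), xs)

# Autoregressive link: estimate the scale rho by least squares, then
# emulate the remaining discrepancy with a second GP on the few expensive runs.
rho = (m_lo_at_hi @ f_hi(x_hi)) / (m_lo_at_hi @ m_lo_at_hi)
delta = gp_mean(x_hi, f_hi(x_hi) - rho * m_lo_at_hi, xs)
m_hi = rho * m_lo_at_xs + delta   # multi-fidelity prediction on the grid

print("max abs error:", np.abs(m_hi - f_hi(xs)).max())
```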

    The materials processing research base of the Materials Processing Center

    The goals and activities of the center are discussed. The center activities encompass all engineering materials including metals, ceramics, polymers, electronic materials, composites, superconductors, and thin films. Processes include crystallization, solidification, nucleation, and polymer synthesis