Proactive Interference-aware Resource Management in Deep Learning Training Cluster
Deep Learning (DL) applications are growing at an unprecedented rate across many domains, from weather prediction and map navigation to medical imaging. However, training these DL models in large-scale compute clusters faces substantial challenges in terms of low cluster resource utilisation and high job waiting times. State-of-the-art DL cluster resource managers are needed to increase GPU utilisation and maximise throughput. While co-locating DL jobs within the same GPU has been shown to be an effective means of achieving this, co-location incurs performance interference that slows jobs down. We argue that effective workload placement can minimise DL cluster interference at scheduling time by understanding DL workload characteristics and their respective hardware resource consumption. However, existing DL cluster resource managers reserve isolated GPUs to perform online profiling, directly measuring GPU utilisation and kernel patterns for each unique submitted job. Such a feedback-based reactive approach results in additional waiting time as well as reduced cluster resource efficiency and availability. In this thesis, we propose Horus: an interference-aware and prediction-based DL cluster resource manager. By empirically studying a series of microbenchmarks and DL workload co-location combinations across heterogeneous GPU hardware, we demonstrate the negative effects of performance interference when co-locating DL workloads, and identify GPU utilisation as a general proxy metric for good placement decisions. From these findings, we design Horus, which, in contrast to existing approaches, proactively predicts the GPU utilisation of heterogeneous DL workloads from their model computation-graph features when making placement decisions, removing the need for online profiling and isolated reserved GPUs.
By conducting empirical experimentation within a medium-scale DL cluster, as well as a large-scale trace-driven simulation of a production system, we demonstrate that Horus improves cluster GPU utilisation, reduces cluster makespan and waiting time, and can scale to operate across hundreds of machines.
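The abstract does not include scheduler code; the following is a minimal, hypothetical sketch of prediction-based, interference-aware placement. The feature names, the linear predictor coefficients, and the 100% utilisation cap are all invented for illustration — Horus learns its predictor from computation-graph features rather than using a fixed formula.

```python
import heapq

def predict_gpu_util(num_ops, params_m, batch_size):
    """Toy stand-in for a learned utilisation predictor.

    Takes hypothetical computation-graph features (operator count,
    parameters in millions, batch size); coefficients are invented.
    """
    return min(0.002 * num_ops + 0.05 * params_m + 0.4 * batch_size, 100.0)

def place_jobs(jobs, num_gpus, cap=100.0):
    """Greedy interference-aware placement: co-locate a job on the
    least-loaded GPU only if the predicted combined utilisation stays
    under `cap`; otherwise the job keeps waiting in the queue."""
    gpus = [(0.0, g) for g in range(num_gpus)]   # (predicted load, gpu id)
    heapq.heapify(gpus)
    placement, pending = {}, []
    for name, features in jobs:
        util = predict_gpu_util(*features)
        load, g = heapq.heappop(gpus)
        if load + util <= cap:
            placement[name] = g
            heapq.heappush(gpus, (load + util, g))
        else:
            pending.append(name)                 # would interfere: keep waiting
            heapq.heappush(gpus, (load, g))
    return placement, pending

# Hypothetical jobs as (name, (num_ops, params_in_millions, batch_size)).
jobs = [("resnet", (2000, 25, 64)), ("bert", (5000, 110, 32)),
        ("gan", (3000, 60, 128))]
placement, pending = place_jobs(jobs, num_gpus=1)
```

With one GPU, the first two jobs co-locate (predicted combined load below the cap) and the third is deferred; the point of the prediction step is that no isolated profiling GPU is needed to make this call.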
Exploration of Villarrica Geothermal System using Geophysical and Geochemical Techniques [Final Version]
For the future global energy mix, the International Energy Agency (IEA) forecasts a substantial contribution from geothermal energy. As a base-load-capable, decentralised and permanently available energy source, it is expected to help replace fossil fuels. Current development of geothermal resources concentrates mainly on conventional high-enthalpy resources, which often occur in association with volcanism or magmatism at active continental margins or with rifting processes. The active continental margins surrounding the Pacific (the "Pacific Ring of Fire") are used geothermally by many of the bordering countries; only in the Andean region have no significant geothermal resources been developed so far. Following the commissioning of its first geothermal power plant, Chile has begun to develop its geothermal potential systematically. To ensure a sustainable energy supply, low/medium-enthalpy reservoirs are to be exploited alongside high-enthalpy deposits.
Globally, low/medium-enthalpy reservoirs are often bound to large fault systems or to rock formations suited to geothermal use. Locating and characterising these reservoirs requires an adapted exploration strategy, since classical geothermal exploration is designed for high-enthalpy resources. Within this doctoral thesis, an exploration strategy for low/medium-enthalpy geothermal reservoirs in Chile is developed. The geothermal system at Villarrica volcano was chosen as the research site, since the success of the exploration strategy can be tested on the characterisation of its complex fault-zone system and a marked lithological contrast. To quantify both the reservoir geometry and the reservoir processes, an interdisciplinary approach was chosen that couples geochemical and geophysical methods.
Fault-zone systems are of paramount importance for the development of the geothermal circulation system and thus for the formation of the reservoir. The research site is characterised by the intersection of two transregional fault zones, the Liquiñe-Ofqui Fault System (LOFS) and the Mocha-Villarrica Fault Zone (MVFZ), which are investigated with geophysical methods. High-resolution magnetotelluric measurements identify both fault zones through reduced electrical resistivity. This resistivity reduction is caused by the occurrence of conductive deep geothermal waters and/or hydrothermal alteration products. For the MVFZ, the surveys reveal a northward-dipping fault zone connected to a zone of elevated electrical conductivity in the middle crust. The outcrop of the fault zone coincides with the position of the Villarrica-Quetrupillán-Lanín volcanic chain. The LOFS appears as a vertical zone of elevated conductivity extending from the surface down to the brittle-ductile transition. A possible penetration into the ductile regime, with a potential connection to an existing zone of elevated conductivity in the middle crust, is masked by the elevated conductivities of the ductile crust. Gravimetric measurements were carried out parallel to the MT profiles. The LOFS is marked by a pronounced negative Bouguer anomaly that coincides spatially with the elevated conductivities. Applying Butterworth filters in combination with gravimetric modelling makes it possible to determine the fault-zone geometry and to quantify the density contrast. In a joint interpretation of the magnetotelluric and gravimetric data, the properties of the LOFS in terms of clay-mineral content and porosity can be computed in order to estimate the permeability of the fault zone.
Geochemical methods are used to characterise the reservoir processes. The thermal-water discharges serve as windows into the subsurface to determine the origin and genesis of the thermal waters. It can be shown that the thermal waters are of meteoric origin and evolve through intensive reaction with crystalline rock. Despite the spatial proximity to active volcanoes, no substantial influence of magmatic fluids or gases can be detected. The water-rock interactions are subsequently examined in a comparative study of the thermal waters and possible reservoir rocks. The marked lithological contrast between plutonic rocks of the North Patagonian Batholith (NPB) and volcano-clastic rocks of the Cura-Mallín Formation is traced by the analysis of strontium isotopes. Analysis of CFC species and of oxygen isotopes of the SO4-H2O system shows that different fluid-circulation systems occur in the two formations. In the NPB, thermal-water circulation is concentrated on the major fault zones, whereas a more branched fluid circulation can be demonstrated for the Cura-Mallín Formation. The analysis of the different CFC species allows the mixing processes in the subsurface to be quantified and can thus be used to determine the in-situ thermal-water composition, which in turn enables a precise determination of the reservoir conditions and of the geothermal potential.
The Villarrica geothermal system has an elevated geothermal potential. Thermal-water ascent along the major fault zones forms reservoirs at exploitable depth. With its branched flow field, the Cura-Mallín Formation is particularly suitable as a subsurface heat exchanger. Maximum reservoir temperatures of 140–180°C would be suitable, for example, for supplying heat to the town of Pucón via a district-heating system.
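The abstract mentions applying Butterworth filters to the gravity data to isolate the fault-zone anomaly. A short sketch can illustrate the generic regional-residual separation idea; this is not the thesis's processing chain, and the profile, filter order, cutoff wavelength, and anomaly amplitudes below are all invented.

```python
import numpy as np

def butterworth_lowpass(profile, dx, cutoff_wavelength, order=4):
    """Frequency-domain Butterworth low-pass filter for a 1-D profile.

    Keeps the long-wavelength (regional) part of a gravity profile;
    subtracting it leaves the short-wavelength (residual) part, as in
    regional-residual separation of a Bouguer anomaly.
    """
    n = len(profile)
    freqs = np.fft.rfftfreq(n, d=dx)              # spatial frequencies [1/m]
    fc = 1.0 / cutoff_wavelength                  # cutoff frequency
    response = 1.0 / (1.0 + (freqs / fc) ** (2 * order))
    return np.fft.irfft(np.fft.rfft(profile) * response, n=n)

# Synthetic 50 km Bouguer profile (values invented): a broad regional
# field plus a narrow negative anomaly standing in for a low-density
# fault zone, centred at 25 km.
x = np.arange(500) * 100.0                        # 100 m station spacing
regional = 3.0 * np.cos(2 * np.pi * x / 50_000)   # long-wavelength field [mGal]
fault = -5.0 * np.exp(-((x - 25_000) / 1_000) ** 2)
bouguer = regional + fault

regional_est = butterworth_lowpass(bouguer, dx=100.0, cutoff_wavelength=10_000.0)
residual = bouguer - regional_est                 # isolates the fault-zone anomaly
```

The residual recovers a negative anomaly centred on the fault, which is the signal that the gravimetric modelling then inverts for geometry and density contrast.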
Novel parallel approaches to efficiently solve spatial problems on heterogeneous CPU-GPU systems
In recent years, approaches that seek to extract valuable information from large datasets have become particularly relevant in today's society. In this category, we can highlight those problems that comprise data analysis distributed across two-dimensional scenarios called spatial problems. These usually involve processing (i) a series of features distributed across a given plane or (ii) a matrix of values where each cell corresponds to a point on the plane.
Spatial problems are therefore open-ended and complex in nature, which also leaves room for imagination in the search for new solutions.
One of the main complications we encounter when dealing with spatial problems is that they are very computationally intensive, typically taking a long time to produce the desired result. This drawback is also an opportunity to use heterogeneous systems to address spatial problems more efficiently. Heterogeneous systems give the developer greater freedom to speed up suitable algorithms by increasing the parallel programming options available, making it possible for different parts of a program to run on the dedicated hardware that suits them best.
Several of the spatial problems that have not been optimised for heterogeneous systems cover very diverse areas that seem vastly different at first sight. However, they are closely related due to common data processing requirements, making them suitable for using dedicated hardware. In particular, this thesis provides new parallel approaches to tackle the following three crucial spatial problems: latent fingerprint identification, total viewshed computation, and path planning based on maximising visibility in large regions.
Latent fingerprint identification is one of the essential identification procedures in criminal investigations. Addressing this task is difficult as (i) it requires analysing large databases in a short time, and (ii) it is commonly addressed by combining different methods with complex data dependencies, making it challenging to exploit parallelism on heterogeneous CPU-GPU systems. Moreover, most efforts in this context focus on improving the accuracy of the approaches and neglect reducing the processing time—the most accurate algorithm was designed to process the fingerprints using a single thread. We developed a new methodology to address the latent fingerprint identification problem called “Asynchronous processing for Latent Fingerprint Identification” (ALFI) that speeds up processing while maintaining high accuracy. ALFI exploits all the resources of CPU-GPU systems using asynchronous processing and fine-coarse parallelism to analyse massive fingerprint databases. We assessed the performance of ALFI on Linux and Windows operating systems using the well-known NIST/FVC databases. Experimental results revealed that ALFI is on average 22x faster than the state-of-the-art identification algorithm, reaching a speed-up of 44.7x for the best-studied case.
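A miniature of the asynchronous, fine-coarse pattern can be sketched with a double-buffered loop: a worker thread (standing in for the GPU stream) scores batch i+1 while the host consolidates batch i. The similarity function, batch size, and fingerprint names are invented; ALFI's actual matcher is far more elaborate.

```python
from concurrent.futures import ThreadPoolExecutor

def score_batch(latent, batch):
    """Stand-in for the device-side matcher: one coarse-grained unit of
    work scores a whole batch of database prints (fake similarity:
    fraction of the latent's characters present in each print)."""
    return [sum(c in fp for c in latent) / max(len(latent), 1) for fp in batch]

def identify(latent, database, batch_size=4, top_k=3):
    """Double-buffered identification loop: while the host thread
    consolidates the scores of batch i, the worker is already scoring
    batch i+1, so scoring and consolidation overlap in time."""
    batches = [database[i:i + batch_size]
               for i in range(0, len(database), batch_size)]
    ranked = []
    with ThreadPoolExecutor(max_workers=1) as device:
        in_flight = device.submit(score_batch, latent, batches[0])
        for i, batch in enumerate(batches):
            scores = in_flight.result()              # wait for batch i
            if i + 1 < len(batches):                 # launch batch i+1 first...
                in_flight = device.submit(score_batch, latent, batches[i + 1])
            ranked.extend(zip(scores, batch))        # ...then consolidate batch i
    ranked.sort(reverse=True)
    return ranked[:top_k]

top = identify("fp3", [f"fp{i}" for i in range(10)])
```

On real hardware the submitted work would be a GPU kernel launch and the consolidation a CPU-side ranking step; the structure of the overlap is the same.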
In terrain analysis, Digital Elevation Models (DEMs) are relevant datasets used as input to those algorithms that typically sweep the terrain to analyse its main topological features such as visibility, elevation, and slope. The most challenging computation related to this topic is the total viewshed problem. It involves computing the viewshed—the visible area of the terrain—for each of the points in the DEM. The algorithms intended to solve this problem require many memory accesses to 2D arrays, which, despite being regular, lead to poor data locality in memory. We proposed a methodology called “skewed Digital Elevation Model” (sDEM) that substantially improves the locality of memory accesses and exploits the inherent parallelism of rotational sweep-based algorithms. In particular, sDEM applies a data relocation technique before accessing the memory and computing the viewshed, thus significantly reducing the execution time. Different implementations are provided for single-core, multi-core, single-GPU, and multi-GPU platforms. We carried out two experiments to compare sDEM with (i) the most used geographic information systems (GIS) software and (ii) the state-of-the-art algorithm for solving the total viewshed problem. In the first experiment, sDEM is on average 8.8x faster than current GIS software, despite considering only a few points because of the limitations of the GIS software. In the second experiment, sDEM is 827.3x faster than the state-of-the-art algorithm considering the best case.
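The data-relocation idea can be shown in miniature: shifting row i right by i cells turns a 45° line of sight into a contiguous column, so a sweep along it walks memory with a constant stride instead of jumping between rows and columns. This sketch assumes a single sweep direction and integer shifts, unlike the full rotational scheme.

```python
import numpy as np

def skew_dem(dem, shift_per_row=1):
    """Relocate each row so that a 45° line of sight in the original
    grid becomes a contiguous column in the skewed grid — the data
    relocation idea behind sDEM, reduced to one sweep direction."""
    n_rows, n_cols = dem.shape
    width = n_cols + (n_rows - 1) * shift_per_row
    skewed = np.zeros((n_rows, width), dtype=dem.dtype)
    for i, row in enumerate(dem):
        offset = i * shift_per_row
        skewed[i, offset:offset + n_cols] = row   # shift row i right by i cells
    return skewed

dem = np.arange(16).reshape(4, 4)   # toy 4x4 elevation grid
skewed = skew_dem(dem)
# Column 3 of the skewed grid now holds an anti-diagonal of the
# original grid: the cells a 45° sweep ray would visit.
```

For a full rotational sweep the relocation is redone per sweep angle; the cost of copying is repaid because every subsequent line-of-sight access becomes sequential.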
The use of Unmanned Aerial Vehicles (UAVs) with multiple onboard sensors has grown enormously in tasks involving terrain coverage, such as environmental and civil monitoring, disaster management, and forest fire fighting. Many of these tasks require a quick and early response, which makes maximising the land covered from the flight path an essential goal, especially when the area to be monitored is irregular, large, and includes many blind spots. In this regard, state-of-the-art total viewshed algorithms can help analyse large areas and find new paths providing all-round visibility. We designed a new heuristic called “Visibility-based Path Planning” (VPP) to solve the path planning problem in large areas based on a thorough visibility analysis. VPP generates flyable paths that provide high visual coverage to monitor forest regions using the onboard camera of a single UAV. For this purpose, the hidden areas of the target territory are identified and considered when generating the path. Simulation results showed that VPP covers up to 98.7% of the Montes de Malaga Natural Park and 94.5% of the Sierra de las Nieves National Park, both located in the province of Malaga (Spain). In addition, a real flight test confirmed the high visibility achieved using VPP. Our methodology and analysis can be easily applied to enhance monitoring in other large outdoor areas
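As a rough illustration of visibility-driven planning (not the VPP heuristic itself), waypoint selection can be sketched as a greedy set cover: repeatedly pick the candidate whose viewshed uncovers the most still-hidden terrain cells. The candidate viewsheds would in practice come from a total viewshed computation; the toy data below is invented.

```python
def plan_waypoints(viewsheds, n_cells, coverage_target=0.95):
    """Greedy visibility-driven waypoint selection.

    `viewsheds` maps each candidate waypoint to the set of terrain
    cells visible from it. Waypoints are added until the coverage
    target is met or no remaining candidate sees any hidden cell.
    """
    seen, path = set(), []
    while len(seen) < coverage_target * n_cells:
        best = max(viewsheds, key=lambda w: len(viewsheds[w] - seen))
        gain = viewsheds[best] - seen
        if not gain:        # remaining cells are hidden from every candidate
            break
        path.append(best)
        seen |= gain
    return path, len(seen) / n_cells

# Invented example: three candidate waypoints over six terrain cells.
viewsheds = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}}
path, coverage = plan_waypoints(viewsheds, n_cells=6)
```

A flyable path would then be routed through the selected waypoints; the greedy step is what concentrates coverage on blind spots that few candidates can see.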
40th Rocky Mountain Conference on Analytical Chemistry
Final program, abstracts, and information about the 40th annual meeting of the Rocky Mountain Conference on Analytical Chemistry, co-sponsored by the Colorado Section of the American Chemical Society and the Rocky Mountain Section of the Society for Applied Spectroscopy. Held in Denver, Colorado, July 25 - August 1, 1998
Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) Krakow, Poland
Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015
Multi-Fidelity Gaussian Process Emulation And Its Application In The Study Of Tsunami Risk Modelling
Investigating uncertainties in computer simulations can be prohibitive in terms of computational cost, since the simulator needs to be run over a large number of input values. Building a statistical surrogate model of the simulator from a small design of experiments greatly alleviates the computational burden of such investigations. Nevertheless, this can still exceed the computational budget of many studies. We present a novel method that combines both approaches: the multilevel adaptive sequential design of computer experiments (MLASCE) in the framework of Gaussian process (GP) emulators. MLASCE builds on two major approaches: efficient design of experiments, such as sequential designs, and combining training data of different degrees of sophistication in a so-called multi-fidelity method (or multilevel method when these fidelities are ordered, typically by increasing resolution). This dual strategy allows us to efficiently allocate limited computational resources over simulations of different levels of fidelity and to build the GP emulator. The allocation of computational resources is shown to be the solution of a simple optimization problem in a special case where we theoretically prove the validity of our approach. MLASCE is compared with other existing models of multi-fidelity Gaussian process emulation. Gains of orders of magnitude in accuracy for medium-size computing budgets are demonstrated in numerical examples. MLASCE is intended to be useful for computer experiments in natural-disaster risk, and to be more than a mere tool for calculating the scale of natural disasters. To show that MLASCE meets this expectation, we propose the first end-to-end example of a risk model for household asset loss due to a possible future tsunami.
As a follow-up to this framework, MLASCE provides a reliable statistical surrogate for a realistic tsunami risk assessment under a restricted computational budget, delivering accurate and near-instant predictions of future tsunami risk.
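The two-level idea — emulate the cheap code from many runs, then emulate the discrepancy using a few expensive runs — can be sketched with plain GP regression. This is a toy illustration, not MLASCE itself: the simulators, design sizes, kernel lengthscale, and the fixed scale factor `rho` are all invented, and MLASCE's adaptive sequential design is omitted.

```python
import numpy as np

def rbf(a, b, ell=0.15):
    """Squared-exponential kernel between two 1-D point sets."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ell) ** 2)

def gp_predict(x_train, y_train, x_test, jitter=1e-6):
    """Posterior mean of a zero-mean GP interpolating the training data."""
    K = rbf(x_train, x_train) + jitter * np.eye(len(x_train))
    return rbf(x_test, x_train) @ np.linalg.solve(K, y_train)

# Two fidelities of a toy simulator (both functions invented):
f_low = lambda x: np.sin(8 * x)              # cheap, biased code
f_high = lambda x: np.sin(8 * x) + 0.3 * x   # expensive "truth"

x_low, x_high = np.linspace(0, 1, 25), np.linspace(0, 1, 6)
x_test = np.linspace(0, 1, 101)
rho = 1.0                                    # scale between fidelities (assumed known)

# Level 1 emulates the cheap code from many runs; level 2 emulates the
# discrepancy between the few expensive runs and the level-1 prediction.
low_at_high = gp_predict(x_low, f_low(x_low), x_high)
delta = gp_predict(x_high, f_high(x_high) - rho * low_at_high, x_test)
mf_mean = rho * gp_predict(x_low, f_low(x_low), x_test) + delta
```

Because the discrepancy is much smoother than the simulator itself, six expensive runs suffice for the second level, whereas a single-fidelity GP built from those same six runs undersamples the oscillation — the mechanism behind the accuracy gains reported above.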
The materials processing research base of the Materials Processing Center
The goals and activities of the center are discussed. The center activities encompass all engineering materials including metals, ceramics, polymers, electronic materials, composites, superconductors, and thin films. Processes include crystallization, solidification, nucleation, and polymer synthesis
- …