
    Combining in-situ and in-transit processing to enable extreme-scale scientific analysis

    With the onset of extreme-scale computing, I/O constraints make it increasingly difficult for scientists to save a sufficient amount of raw simulation data to persistent storage. One potential solution is to change the data analysis pipeline from a post-process-centric to a concurrent approach based on either in-situ or in-transit processing. In this context, computations are considered in-situ if they utilize the primary compute resources, while in-transit processing refers to offloading computations to a set of secondary resources using asynchronous data transfers. In this paper we explore the design and implementation of three common analysis techniques typically performed on large-scale scientific simulations: topological analysis, descriptive statistics, and visualization. We summarize algorithmic developments, describe a resource scheduling system to coordinate the execution of various analysis workflows, and discuss our implementation using the DataSpaces and ADIOS frameworks that support efficient data movement between in-situ and in-transit computations. We demonstrate the efficiency of our lightweight, flexible framework by deploying it on the Jaguar XK6 to analyze data generated by S3D, a massively parallel turbulent combustion code. Our framework allows scientists dealing with the data deluge at extreme scale to perform analyses at increased temporal resolutions, mitigate I/O costs, and significantly improve the time to insight.
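    As a rough illustration of the in-situ/in-transit split described above, the following Python sketch uses mpi4py to reserve a few MPI ranks as in-transit staging resources, while the remaining ranks simulate, compute cheap statistics in situ, and ship raw blocks to the staging ranks asynchronously. It is a conceptual sketch only: it does not use DataSpaces or ADIOS, and the rank split, tags and field sizes are illustrative assumptions rather than the paper's implementation.

```python
# Conceptual sketch (not the paper's DataSpaces/ADIOS code): a few MPI ranks
# act as "staging" (in-transit) resources, while the remaining ranks run the
# simulation, do cheap in-situ reductions, and asynchronously ship raw blocks
# to the staging ranks for heavier analysis.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N_STAGING = max(1, size // 4)            # illustrative split: ~25% in-transit ranks
is_staging = rank < N_STAGING
role = comm.Split(0 if is_staging else 1, rank)   # sub-communicator per role

STEPS = 5
if not is_staging:
    target = rank % N_STAGING            # staging rank this simulation rank feeds
    for step in range(STEPS):
        field = np.random.rand(64, 64)   # stand-in for one simulation field block

        # In-situ: cheap descriptive statistics on the primary compute resource.
        stats = role.gather((field.min(), field.max(), field.mean()), root=0)
        if role.Get_rank() == 0:
            print(f"step {step}: in-situ stats gathered from {len(stats)} ranks")

        # In-transit: asynchronous transfer of the raw block; the simulation
        # would normally overlap this with its next compute step.
        req = comm.isend(field, dest=target, tag=step)
        req.wait()
    comm.send(None, dest=target, tag=999)      # shutdown marker
else:
    producers = [r for r in range(N_STAGING, size) if r % N_STAGING == rank]
    done = 0
    while done < len(producers):
        status = MPI.Status()
        block = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == 999:
            done += 1
            continue
        # Placeholder for expensive analysis (e.g. topological segmentation).
        print(f"staging rank {rank}: analysed block from rank {status.Get_source()}")
```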

    An integrated software approach to interactive exploration and steering of fluid flow simulations on many-core architectures

    Traditionally, computational fluid dynamics is carried out as a cyclic sequence of independent steps. However, scientists and engineers have long wished to interact closely with their running simulations. Since the influential report of the US National Science Foundation in 1987, new forms of scientific visualization have evolved that differ markedly from traditional post-processing. The approach commonly referred to as computational steering has attracted particularly widespread interest. Although it is a powerful paradigm, computational steering remains the exception rather than the rule, largely because of the complexity and restrictions of traditional HPC systems. As an alternative to the traditional massively parallel approach, this thesis uses the parallel computational power of GPUs for general-purpose computation. So-called GPGPU computing has gained considerable popularity in the CFD community, especially for its application to the lattice Boltzmann method. Using this technology, the work demonstrates a single desktop application that integrates a complete interactive CFD simulation environment at reasonable hardware cost. It shows that converging massively parallel computational power and the steering environment into a single system significantly improves usability, application quality and user-friendliness. Using multiple GPUs, the efficiency of this approach allows three-dimensional CFD simulations of practically relevant size to evolve close to real time, while the simulation can be explored and adjusted at runtime. The thesis also shows that responsiveness benefits significantly from avoiding the bandwidth and latency bottlenecks inherent in traditional HPC approaches; these can be avoided because GPGPU computing does not generally require network communication, which also reduces the complexity of the application.
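    To make the computational kernel concrete, here is a minimal single-node NumPy sketch of one D2Q9 lattice-Boltzmann (BGK) time step, the kind of regular per-cell collision-and-streaming kernel that makes the method map well onto GPUs. It is an illustration only, not the multi-GPU steering environment developed in the thesis; the grid size, relaxation time and rest-state initialisation are arbitrary assumptions.

```python
# Minimal D2Q9 lattice-Boltzmann (BGK) step in NumPy, purely to illustrate the
# per-cell kernel structure; the thesis runs a multi-GPU implementation.
import numpy as np

NX, NY, TAU = 128, 64, 0.6                       # illustrative grid and relaxation time

# D2Q9 lattice velocities and weights.
e = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)

f = np.ones((9, NX, NY)) * w[:, None, None]      # start at rest (rho = 1, u = 0)

def equilibrium(rho, ux, uy):
    """Discrete Maxwell-Boltzmann equilibrium distributions for D2Q9."""
    eu = e[:, 0, None, None] * ux + e[:, 1, None, None] * uy
    usq = ux**2 + uy**2
    return w[:, None, None] * rho * (1 + 3*eu + 4.5*eu**2 - 1.5*usq)

def step(f):
    # Macroscopic moments.
    rho = f.sum(axis=0)
    ux = (f * e[:, 0, None, None]).sum(axis=0) / rho
    uy = (f * e[:, 1, None, None]).sum(axis=0) / rho
    # BGK collision (relaxation towards equilibrium), then periodic streaming.
    f = f - (f - equilibrium(rho, ux, uy)) / TAU
    for i in range(9):
        f[i] = np.roll(np.roll(f[i], e[i, 0], axis=0), e[i, 1], axis=1)
    return f

for _ in range(10):                              # a steering hook would read or
    f = step(f)                                  # modify f / boundaries here
print("mean density:", f.sum(axis=0).mean())     # conserved (~1.0) for this setup
```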

    Dataflow methods in HPC, visualisation and analysis

    The processing power available to scientists and engineers using supercomputers over the last few decades has grown exponentially, permitting significantly more sophisticated simulations and, as a consequence, proportionally larger output datasets. This change has taken place in tandem with a gradual shift in the design and implementation of simulation and post-processing software, away from simulation as a first step and visualisation/analysis as a second, towards in-situ, on-the-fly methods that provide immediate visual feedback, place less strain on file systems and reduce overall data movement and copying. Concurrently, processor speed increases have slowed dramatically and multi- and many-core architectures have instead become the norm for virtually all High Performance Computing (HPC) machines. This in turn has led to a shift away from the traditional distributed one-rank-per-node model, to one rank per process using multiple processes per multicore node, and then back towards one rank per node again, combining distributed and multi-threaded frameworks. This thesis consists of a series of publications that demonstrate how software design for analysis and visualisation has tracked these architectural changes and pushed the boundaries of HPC visualisation using dataflow techniques in distributed environments. The first publication shows how support for the time dimension in parallel pipelines can be implemented, demonstrating how information flow within an application can be leveraged to optimise performance and add features such as analysis of time-dependent flows and comparison of datasets at different timesteps. A method of integrating dataflow pipelines with in-situ visualisation is subsequently presented, using asynchronous coupling of user-driven GUI controls and a live simulation running on a supercomputer. The loose coupling of analysis and simulation allows for reduced I/O, immediate feedback and the ability to change simulation parameters on the fly. A significant drawback of parallel pipelines is the inefficiency caused by improper load balancing, particularly during interactive analysis where the user may select between different features of interest. This problem is addressed in the fourth publication by integrating a high-performance partitioning library into the visualisation pipeline and extending the information flow up and down the pipeline to support it. This extension is demonstrated in the third publication (published earlier) on massive meshes of extremely high complexity, and shows that general-purpose visualisation tools such as ParaView can be made to compete with bespoke software written for a dedicated task. The future of software running on many-core architectures will involve task-based runtimes with dynamic load balancing, asynchronous execution based on dataflow graphs, work stealing and concurrent data sharing between simulation and analysis. The final paper of this thesis presents an optimisation for one such runtime in support of these future HPC applications.
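    The demand-driven information flow described above can be sketched in a few lines of Python: requests carrying a timestep travel upstream through the pipeline, data travels downstream, and a temporal filter can pull two timesteps from its upstream connection to compare them. This toy is not the thesis' VTK/ParaView machinery; the class names and the random "simulation" source are purely illustrative assumptions.

```python
# Toy demand-driven dataflow pipeline: timestep requests flow upstream,
# data flows downstream, and a temporal filter compares two timesteps.
import numpy as np

class Source:
    """Pretend simulation output: one array per timestep."""
    def request(self, t):
        rng = np.random.default_rng(t)          # deterministic per-timestep data
        return {"t": t, "data": rng.random(1000)}

class Gradient:
    """A simple transformation filter with one upstream connection."""
    def __init__(self, upstream):
        self.upstream = upstream
    def request(self, t):
        block = self.upstream.request(t)        # pull from upstream on demand
        block["data"] = np.gradient(block["data"])
        return block

class TemporalDifference:
    """Asks upstream for two timesteps and emits their difference."""
    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}                         # keep previously requested steps
    def request(self, t):
        for step in (t - 1, t):
            if step not in self.cache:
                self.cache[step] = self.upstream.request(step)
        diff = self.cache[t]["data"] - self.cache[t - 1]["data"]
        return {"t": t, "data": diff}

pipeline = TemporalDifference(Gradient(Source()))
result = pipeline.request(5)                    # triggers upstream requests for t=4, t=5
print("timestep", result["t"], "mean change:", float(np.abs(result["data"]).mean()))
```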

    An Application Perspective on High-Performance Computing and Communications

    We review possible and probable industrial applications of HPCC, focusing on the software and hardware issues. Thirty-three separate categories are illustrated by detailed descriptions of five areas: computational chemistry; Monte Carlo methods from physics to economics; manufacturing and computational fluid dynamics; command and control, or crisis management; and multimedia services to client computers and set-top boxes. The hardware varies from tightly coupled parallel supercomputers to heterogeneous distributed systems. The software models range from HPF and data parallelism to distributed information systems and object/data-flow parallelism on the Web. We find that in each case it is reasonably clear that HPCC works in principle, and postulate that this knowledge can be used in a new generation of software infrastructure based on the WebWindows approach, which is discussed in an accompanying paper.

    Support for flexible and transparent distributed computing

    Modern distributed computing developed from the traditional supercomputing community, which is rooted firmly in the culture of batch management. The field has therefore been dominated by queuing-based resource managers and workflow-based job submission environments, in which static resource demands had to be determined and reserved prior to launching an execution. This has made it difficult to support resource environments (e.g. Grid, Cloud) where both the available resources and the resource requirements of applications may be dynamic and unpredictable. This thesis introduces a flexible execution model in which the compute capacity can be adapted to fit the needs of applications as they change during execution. Resource provision in this model is based on a fine-grained, self-service approach instead of the traditional one-time, system-level model. The thesis introduces a middleware-based Application Agent (AA) that provides a platform for applications to dynamically interact and negotiate resources with the underlying resource infrastructure. We also consider the issue of transparency, i.e., hiding the provision and management of the distributed environment, which is key to attracting the general public to use the technology. The AA not only replaces the user-controlled process of preparing and executing an application with a transparent, software-controlled process, it also hides the complexity of selecting the right resources to ensure execution QoS. This service is provided by an On-line Feedback-based Automatic Resource Configuration (OAC) mechanism cooperating with the flexible execution model. The AA constantly monitors utility-based feedback from the application during execution and is thus able to learn its behaviour and resource characteristics. This allows it to automatically compose the most efficient execution environment on the fly and to satisfy any execution requirements defined by users. Two policies are introduced to supervise the information learning and resource tuning in the OAC. The Utility Classification policy classifies hosts according to their historical performance contributions to the application; based on this classification, the AA selects high-utility hosts and withdraws low-utility hosts to configure an optimum environment. The Desired Processing Power Estimation (DPPE) policy dynamically configures the execution environment according to the estimated total processing power needed to satisfy users' execution requirements. Through the introduction of flexibility and transparency, a user is able to run a dynamic or conventional distributed application anywhere with optimised execution performance, without managing the distributed resources. Building on this standalone model, the thesis further introduces a federated resource negotiation framework as a step towards an autonomous, multi-user distributed computing world.
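    The intent of the Utility Classification policy can be sketched as follows: hosts accumulate utility-based feedback, and the agent periodically keeps the high-utility hosts and withdraws the low-utility ones. This is a minimal illustration only, not the thesis' AA middleware; the class name, the keep fraction and the averaging of feedback are assumptions made for the example.

```python
# Minimal sketch of a utility-classification policy: hosts report utility
# feedback, and the agent keeps high-utility hosts while withdrawing
# low-utility ones. Thresholds and names are illustrative, not the thesis' AA.
from collections import defaultdict

class UtilityClassifier:
    def __init__(self, keep_fraction=0.75):
        self.history = defaultdict(list)        # host -> list of utility feedbacks
        self.keep_fraction = keep_fraction

    def report(self, host, utility):
        """Record one utility-based feedback (e.g. work done per allocated second)."""
        self.history[host].append(utility)

    def reconfigure(self, active_hosts):
        """Return (hosts to keep, hosts to withdraw) based on historical utility."""
        score = {h: sum(self.history[h]) / len(self.history[h])
                 for h in active_hosts if self.history[h]}
        ranked = sorted(score, key=score.get, reverse=True)
        n_keep = max(1, int(len(ranked) * self.keep_fraction))
        return ranked[:n_keep], ranked[n_keep:]

# Illustrative use: three hosts report feedback over a few steps.
agent = UtilityClassifier()
for step in range(5):
    agent.report("hostA", 1.0)                  # consistently useful
    agent.report("hostB", 0.8)
    agent.report("hostC", 0.1)                  # contributes little
keep, withdraw = agent.reconfigure(["hostA", "hostB", "hostC"])
print("keep:", keep, "withdraw:", withdraw)
```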