
    Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS

    GROMACS is a widely used package for biomolecular simulation, and over the last two decades it has evolved from small-scale efficiency to advanced heterogeneous acceleration and multi-level parallelism targeting some of the largest supercomputers in the world. Here, we describe some of the ways we have been able to realize this through the use of parallelization on all levels, combined with a constant focus on absolute performance. Release 4.6 of GROMACS uses SIMD acceleration on a wide range of architectures, GPU offloading acceleration, and both OpenMP and MPI parallelism within and between nodes, respectively. The recent work on acceleration made it necessary to revisit the fundamental algorithms of molecular simulation, including the concept of neighbor searching, and we discuss the present and future challenges we see for exascale simulation - in particular the need for very fine-grained task parallelism. We also discuss the software management, code peer review and continuous integration testing required for a project of this complexity. Comment: EASC 2014 conference proceedings.
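
    The abstract above singles out neighbor searching as one of the algorithms that had to be revisited for modern hardware. For orientation only, the sketch below implements a plain textbook cell-list neighbor search in Python/NumPy; it is not GROMACS's cluster-pair scheme, and the box size, cutoff and particle count are arbitrary assumptions.

        # Minimal cell-list neighbor search (textbook O(N) binning, illustrative only).
        import numpy as np

        def cell_list_pairs(coords, box, cutoff):
            """Return index pairs (i, j), i < j, closer than `cutoff` in a cubic periodic box."""
            ncell = int(box // cutoff)        # cells per dimension; assumes box >= 3 * cutoff
            size = box / ncell
            cells = {}                        # bin every particle into its cell
            for idx, xyz in enumerate(coords):
                key = tuple(int(c) for c in (xyz // size) % ncell)
                cells.setdefault(key, []).append(idx)
            pairs = []
            # Compare each cell only with itself and its 26 neighboring cells.
            for (cx, cy, cz), members in cells.items():
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        for dz in (-1, 0, 1):
                            other = ((cx + dx) % ncell, (cy + dy) % ncell, (cz + dz) % ncell)
                            for i in members:
                                for j in cells.get(other, []):
                                    if i < j:
                                        d = coords[i] - coords[j]
                                        d -= box * np.round(d / box)   # minimum-image convention
                                        if np.dot(d, d) < cutoff * cutoff:
                                            pairs.append((i, j))
            return pairs

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            coords = rng.uniform(0.0, 5.0, size=(200, 3))   # 200 particles in an assumed 5 nm box
            print(len(cell_list_pairs(coords, box=5.0, cutoff=1.0)), "pairs within the cutoff")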

    Adaptation, deployment and evaluation of a railway simulator in cloud environments

    Many scientific areas make extensive use of computer simulations to study real-world processes. As they become more complex and resource-intensive, traditional programming paradigms running on supercomputers have proven to be limited by their hardware resources. The Cloud, with its elastic nature, has been increasingly seen as a valid alternative for simulation execution, as it aims to provide virtually infinite resources and thus unlimited scalability. To benefit from this, simulators must be adapted to this paradigm, since cloud migration tends to add virtualization and communication overhead. The main objective of this work is to migrate a power-consumption railway simulator to the Cloud with minimal impact on the original code and without sacrificing performance. We propose a data-centric adaptation based on MapReduce to distribute the simulation load across several nodes while minimising data transmission. We deployed our solution on an Amazon EC2 virtual cluster and measured its performance. We did the same on our local cluster to compare the solution's performance against the original application when the Cloud's overhead is not present. Our tests show that the resulting application is highly scalable and outperforms the original simulator in both environments. This document summarises the author's work during the whole adaptation development process.
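
    As a rough illustration of the data-centric MapReduce decomposition described above, the toy sketch below splits a simulation into independent per-segment "map" tasks and combines the results per train in a "reduce" step. The segment data, the energy formula and all names are invented placeholders, not the authors' railway simulator.

        # Toy map/reduce split of a power-consumption simulation (illustrative only).
        from collections import defaultdict
        from multiprocessing import Pool

        def simulate_segment(segment):
            """Map step: pretend to simulate one track segment for one train."""
            train, length_km, speed_kmh = segment
            energy_kwh = 0.05 * length_km * speed_kmh      # placeholder consumption model
            return train, energy_kwh

        def reduce_by_train(mapped):
            """Reduce step: sum the per-segment energy for each train."""
            totals = defaultdict(float)
            for train, energy in mapped:
                totals[train] += energy
            return dict(totals)

        if __name__ == "__main__":
            segments = [("T1", 12.0, 80.0), ("T1", 8.5, 120.0), ("T2", 20.0, 100.0)]
            with Pool(2) as pool:                          # the pool stands in for cluster nodes
                mapped = pool.map(simulate_segment, segments)
            print(reduce_by_train(mapped))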

    Advanced Simulation and Computing FY12-13 Implementation Plan, Volume 2, Revision 0.5


    Distributed Computing in a Pandemic: A Review of Technologies Available for Tackling COVID-19

    The current COVID-19 global pandemic caused by the SARS-CoV-2 betacoronavirus has resulted in over a million deaths and is having a grave socio-economic impact, hence there is an urgency to find solutions to key research challenges. Much of this COVID-19 research depends on distributed computing. In this article, I review distributed architectures -- various types of clusters, grids and clouds -- that can be leveraged to perform these tasks at scale, at high throughput, with a high degree of parallelism, and which can also be used to work collaboratively. High-performance computing (HPC) clusters will be used to carry out much of this work. Several big-data processing tasks used in reducing the spread of SARS-CoV-2 require high-throughput approaches and a variety of tools, which Hadoop and Spark offer, even using commodity hardware. Extremely large-scale COVID-19 research has also utilised some of the world's fastest supercomputers, such as IBM's Summit -- for ensemble docking high-throughput screening against SARS-CoV-2 targets for drug repurposing, and high-throughput gene analysis -- and Sentinel, an XPE-Cray based system used to explore natural products. Grid computing has facilitated the formation of the world's first exascale grid computer. This has accelerated COVID-19 research in molecular dynamics simulations of SARS-CoV-2 spike protein interactions through massively parallel computation performed with over 1 million volunteer computing devices using the Folding@home platform. Both grids and clouds can also be used for international collaboration by enabling access to important datasets and providing services that allow researchers to focus on research rather than on time-consuming data-management tasks. Comment: 21 pages (15 excl. refs), 2 figures, 3 tables.
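
    The ensemble-docking screening mentioned above is, at its core, an embarrassingly parallel map over a compound library. The sketch below farms out a placeholder scoring function with a local process pool; a real campaign would invoke a docking engine per ligand and submit batches through an HPC scheduler, and the ligand identifiers and scores here are fabricated for illustration.

        # High-throughput screening as an embarrassingly parallel map (illustrative only).
        from concurrent.futures import ProcessPoolExecutor

        def score_ligand(ligand_id):
            """Placeholder for one docking run against a protein target."""
            fake_score = (sum(ord(c) for c in ligand_id) % 100) / 10.0
            return ligand_id, fake_score

        if __name__ == "__main__":
            library = [f"LIG{i:08d}" for i in range(1000)]    # hypothetical ligand identifiers
            with ProcessPoolExecutor() as pool:
                results = list(pool.map(score_ligand, library, chunksize=50))
            best = sorted(results, key=lambda r: r[1])[:5]    # lowest placeholder score first
            print(best)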

    RELEASE: A High-level Paradigm for Reliable Large-scale Server Software

    Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project and describes its progress in the first six months. The project aims to scale Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Erlang currently has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large-scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; and developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state-of-the-art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene.

    Parallel programming systems for scalable scientific computing

    High-performance computing (HPC) systems are more powerful than ever before. However, this rise in performance brings with it greater complexity, presenting significant challenges for researchers who wish to use these systems for their scientific work. This dissertation explores the development of scalable programming solutions for scientific computing. These solutions aim to be effective across a diverse range of computing platforms, from personal desktops to advanced supercomputers. To better understand HPC systems, this dissertation begins with a literature review on exascale supercomputers, massive systems capable of performing 10¹⁞ floating-point operations per second. This review combines both manual and data-driven analyses, revealing that while traditional challenges of exascale computing have largely been addressed, issues like software complexity and data volume remain. Additionally, the dissertation introduces the open-source software tool (called LitStudy) developed for this research. Next, this dissertation introduces two novel programming systems. The first system (called Rocket) is designed to scale all-versus-all algorithms to massive datasets. It features a multi-level software-based cache, a divide-and-conquer approach, hierarchical work-stealing, and asynchronous processing to maximize data reuse, exploit data locality, dynamically balance workloads, and optimize resource utilization. The second system (called Lightning) aims to scale existing single-GPU kernel functions across multiple GPUs, even on different nodes, with minimal code adjustments. Results across eight benchmarks on up to 32 GPUs show excellent scalability. The dissertation concludes by proposing a set of design principles for developing parallel programming systems for scalable scientific computing. These principles, based on lessons from this PhD research, represent significant steps forward in enabling researchers to efficiently utilize HPC systems.
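
    To make the all-versus-all workload mentioned for Rocket concrete, the sketch below tiles an N x N comparison matrix so that each task reuses two blocks of input data; it is only a schematic of the tiling idea under assumed sizes, not the Rocket system itself.

        # Tiled all-versus-all comparison (schematic of the decomposition, not Rocket).
        import itertools
        import numpy as np
        from concurrent.futures import ProcessPoolExecutor

        TILE = 64                                            # assumed tile edge length

        def compare_tile(args):
            """Compute pairwise distances for one tile of the comparison matrix."""
            (i0, a_block), (j0, b_block) = args
            dists = np.linalg.norm(a_block[:, None, :] - b_block[None, :, :], axis=-1)
            return i0, j0, dists

        if __name__ == "__main__":
            data = np.random.default_rng(0).normal(size=(256, 8))
            blocks = [(i, data[i:i + TILE]) for i in range(0, len(data), TILE)]
            # Only the upper-triangular tiles are needed because the comparison is symmetric.
            tiles = [(a, b) for a, b in itertools.product(blocks, repeat=2) if a[0] <= b[0]]
            with ProcessPoolExecutor() as pool:
                results = list(pool.map(compare_tile, tiles))
            print(f"{len(results)} tiles computed")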

    Toward High-Performance Computing and Big Data Analytics Convergence: The Case of Spark-DIY

    Convergence between high-performance computing (HPC) and big data analytics (BDA) is currently an established research area that has spawned new opportunities for unifying the platform layer and data abstractions in these ecosystems. This work presents an architectural model that enables the interoperability of established BDA and HPC execution models, reflecting the key design features that interest both the HPC and BDA communities, and including an abstract data collection and operational model that generates a unified interface for hybrid applications. This architecture can be implemented in different ways depending on the process- and data-centric platforms of choice and the mechanisms put in place to effectively meet the requirements of the architecture. The Spark-DIY platform is introduced in the paper as a prototype implementation of the architecture proposed. It preserves the interfaces and execution environment of the popular BDA platform Apache Spark, making it compatible with any Spark-based application and tool, while providing efficient communication and kernel execution via DIY, a powerful communication pattern library built on top of MPI. Later, Spark-DIY is analyzed in terms of performance by building a representative use case from the hydrogeology domain, EnKF-HGS. This application is a clear example of how current HPC simulations are evolving toward hybrid HPC-BDA applications, integrating HPC simulations within a BDA environment. This work was supported in part by the Spanish Ministry of Economy, Industry and Competitiveness under Grant TIN2016-79637-P (Toward Unification of HPC and Big Data Paradigms), in part by the Spanish Ministry of Education under Grant FPU15/00422 (Training Program for Academic and Teaching Staff), in part by the Advanced Scientific Computing Research program, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357, and in part by the DOE under Agreement DE-DC000122495, Program Manager Laura Biven.
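
    Since Spark-DIY is described as preserving Spark's interfaces, any job written against the public RDD API is the kind of code it targets. The sketch below is a stock PySpark program (a toy Monte Carlo estimate of pi standing in for a numerical kernel); it contains nothing Spark-DIY-specific, only assumes a working pyspark installation, and whether it runs unchanged on Spark-DIY is the paper's claim rather than something demonstrated here.

        # Plain PySpark job using only the public RDD API (illustrative stand-in workload).
        import random
        from pyspark.sql import SparkSession

        if __name__ == "__main__":
            spark = SparkSession.builder.appName("hybrid-hpc-bda-sketch").getOrCreate()
            sc = spark.sparkContext
            samples = sc.parallelize(range(1_000_000), numSlices=8)
            # Toy Monte Carlo estimate of pi: count random points inside the unit quarter-circle.
            inside = samples.map(lambda _: random.random() ** 2 + random.random() ** 2 <= 1.0) \
                            .map(int) \
                            .sum()
            print("pi is approximately", 4.0 * inside / 1_000_000)
            spark.stop()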

    SPRINT: more runners, fewer hurdles

