98 research outputs found

    Multi-Node Advanced Performance and Power Analysis with Paraver

    Get PDF
    Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes. Due to the increasing interest in the High Performance Computing (HPC) community towards energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a preliminary performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. Moreover we show how the same analysis techniques are applicable on different architectures, analyzing the same HPC application running on two clusters, based respectively on Intel Haswell and Arm Cortex-A57 CPUs.The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects, grant agreements n. 288777, 610402 and 671697. E.C. was partially founded by “Contributo 5 per mille assegnato all’Universit`a degli Studi di Ferrara - dichiarazione dei redditi dell’anno 2014”.Peer ReviewedPostprint (author's final draft

    Advanced Simulation and Computing FY09-FY10 Implementation Plan Volume 2, Rev. 1

    Full text link

    Performance and Power Analysis of HPC Workloads on Heterogenous Multi-Node Clusters

    Get PDF
    Performance analysis tools allow application developers to identify and characterize the inefficiencies that cause performance degradation in their codes, allowing for application optimizations. Due to the increasing interest in the High Performance Computing (HPC) community towards energy-efficiency issues, it is of paramount importance to be able to correlate performance and power figures within the same profiling and analysis tools. For this reason, we present a performance and energy-efficiency study aimed at demonstrating how a single tool can be used to collect most of the relevant metrics. In particular, we show how the same analysis techniques can be applicable on different architectures, analyzing the same HPC application on a high-end and a low-power cluster. The former cluster embeds Intel Haswell CPUs and NVIDIA K80 GPUs, while the latter is made up of NVIDIA Jetson TX1 boards, each hosting an Arm Cortex-A57 CPU and an NVIDIA Tegra X1 Maxwell GPU.The research leading to these results has received funding from the European Community’s Seventh Framework Programme [FP7/2007-2013] and Horizon 2020 under the Mont-Blanc projects [17], grant agreements n. 288777, 610402 and 671697. E.C. was partially founded by “Contributo 5 per mille assegnato all’Università degli Studi di Ferrara-dichiarazione dei redditi dell’anno 2014”. We thank the University of Ferrara and INFN Ferrara for the access to the COKA Cluster. We warmly thank the BSC tools group, supporting us for the smooth integration and test of our setup within Extrae and Paraver.Peer ReviewedPostprint (published version

    CFD Vision 2030 Study: A Path to Revolutionary Computational Aerosciences

    Get PDF
    This report documents the results of a study to address the long range, strategic planning required by NASA's Revolutionary Computational Aerosciences (RCA) program in the area of computational fluid dynamics (CFD), including future software and hardware requirements for High Performance Computing (HPC). Specifically, the "Vision 2030" CFD study is to provide a knowledge-based forecast of the future computational capabilities required for turbulent, transitional, and reacting flow simulations across a broad Mach number regime, and to lay the foundation for the development of a future framework and/or environment where physics-based, accurate predictions of complex turbulent flows, including flow separation, can be accomplished routinely and efficiently in cooperation with other physics-based simulations to enable multi-physics analysis and design. Specific technical requirements from the aerospace industrial and scientific communities were obtained to determine critical capability gaps, anticipated technical challenges, and impediments to achieving the target CFD capability in 2030. A preliminary development plan and roadmap were created to help focus investments in technology development to help achieve the CFD vision in 2030

    High Performance Computing Facility Operational Assessment, FY 2011 Oak Ridge Leadership Computing Facility

    Full text link

    Advanced Simulation and Computing FY10-FY11 Implementation Plan Volume 2, Rev. 0.5

    Full text link

    NAMD: biomolecular simulation on thousands of processors

    Get PDF
    Abstract NAMD is a fully featured, production molecular dynamics program for high performance simulation of large biomolecular systems. We have previously, at SC2000, presented scaling results for simulations with cutoff electrostatics on up to 2048 processors of the ASCI Red machine, achieved with an object-based hybrid force and spatial decomposition scheme and an aggressive measurement-based predictive load balancing framework. We extend this work by demonstrating similar scaling on the much faster processors of the PSC Lemieux Alpha cluster, and for simulations employing efficient (order N log N) particle mesh Ewald full electrostatics. This unprecedented scalability in a biomolecular simulation code has been attained through latency tolerance, adaptation to multiprocessor nodes, and the direct use of the Quadrics Elan library in place of MPI by the Charm++/Converse parallel runtime system

    Acceleration of Biomolecular Simulations using FPGA-based Reconfigurable Computing

    Get PDF
    A paradigm shift is occurring in the way compute-intensive scientific applications are developed. Thanks to advancements in commercially viable hybrid architectures for High-Performance Computing (HPC), the focus has shifted from improving performance by merely scaling algorithms on von Neumann computing nodes to fully exploiting additional computational capabilities provided by accelerators such as FPGAs (Field Programmable Gate Arrays) and GPGPUs (General Purpose Graphical Processing Units). Computational chemists use Molecular Dynamics (MD) simulations like LAMMPS (Large Scale Atomic Molecular Massively Parallel Systems) and NAMD (NAnoscale Molecular Dynamics) to simulate biomolecular behaviour such as protein folding and small molecule docking to proteins. MD simulations are computationally complex n-body problems, which are time consuming to simulate in biologically relevant scales. Executing such simulations in best available HPC environments is critical for scientific advancements in the field. Thus, as HPC technology evolves, there is a need to update classical biomolecular simulation applications like LAMMPS to better suit the architecture. In this work, we modify LAMMPS (a classical molecular dynamics simulation program developed for CPU-only clusters) to execute on a reconfigurable computer system, SRC-7 H MAP. The SRC-7 H MAP consists of two Altera FPGA logic chips interfaced to a dual-core Intel Xeon processor. Users can benefit by offloading most compute-intensive tasks of the application to the FPGA logic. This work explores the challenges involved in effectively adapting a production level application code optimized for von Neumann architecture, to an FPGA-based hybrid architecture. We have successfully accelerated the non-bonded force computations, the most compute-intensive module in LAMMPS for biomolecular simulations, by 5.0x over a single 3.0 GHz Xeon processor. This performance includes the data transfer overheads and function calling overheads. Further, using the accelerated non-bonded force computations function, we achieve an overall application speed-up of 2.0x to 2.4

    The Argonne Leadership Computing Facility 2010 annual report.

    Full text link

    Simulation of the UKQCD computer

    Get PDF
    • 

    corecore