803 research outputs found

    Virtualizing the Stampede2 Supercomputer with Applications to HPC in the Cloud

    Full text link
    Methods developed at the Texas Advanced Computing Center (TACC) are described and demonstrated for automating the construction of an elastic, virtual cluster emulating the Stampede2 high performance computing (HPC) system. The cluster can be built and/or scaled in a matter of minutes on the Jetstream self-service cloud system and shares many properties of the original Stampede2, including: i) common identity management, ii) access to the same file systems, iii) equivalent software application stack and module system, iv) similar job scheduling interface via Slurm. We measure time-to-solution for a number of common scientific applications on our virtual cluster against equivalent runs on Stampede2 and develop an application profile where performance is similar or otherwise acceptable. For such applications, the virtual cluster provides an effective form of "cloud bursting" with the potential to significantly improve overall turnaround time, particularly when Stampede2 is experiencing long queue wait times. In addition, the virtual cluster can be used for test and debug without directly impacting Stampede2. We conclude with a discussion of how science gateways can leverage the TACC Jobs API web service to incorporate this cloud bursting technique transparently to the end user.Comment: 6 pages, 0 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22--26, 2018, Pittsburgh, PA, US

    ASCR/HEP Exascale Requirements Review Report

    Full text link
    This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June, 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude -- and in some cases greater -- than that available currently. 2) The growth rate of data produced by simulations is overwhelming the current ability, of both facilities and researchers, to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) A close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.Comment: 77 pages, 13 Figures; draft report, subject to further revisio

    The Fifth Workshop on HPC Best Practices: File Systems and Archives

    Full text link
    The workshop on High Performance Computing (HPC) Best Practices on File Systems and Archives was the fifth in a series sponsored jointly by the Department Of Energy (DOE) Office of Science and DOE National Nuclear Security Administration. The workshop gathered technical and management experts for operations of HPC file systems and archives from around the world. Attendees identified and discussed best practices in use at their facilities, and documented findings for the DOE and HPC community in this report

    Red Storm usage model :Version 1.12.

    Full text link

    Advanced Simulation and Computing FY12-13 Implementation Plan, Volume 2, Revision 0.5

    Full text link
    corecore