
    Virtualizing the Stampede2 Supercomputer with Applications to HPC in the Cloud

    Methods developed at the Texas Advanced Computing Center (TACC) are described and demonstrated for automating the construction of an elastic, virtual cluster emulating the Stampede2 high performance computing (HPC) system. The cluster can be built and/or scaled in a matter of minutes on the Jetstream self-service cloud system and shares many properties of the original Stampede2, including: i) common identity management, ii) access to the same file systems, iii) an equivalent software application stack and module system, and iv) a similar job scheduling interface via Slurm. We measure time-to-solution for a number of common scientific applications on our virtual cluster against equivalent runs on Stampede2 and develop a profile of applications whose performance is similar or otherwise acceptable. For such applications, the virtual cluster provides an effective form of "cloud bursting" with the potential to significantly improve overall turnaround time, particularly when Stampede2 is experiencing long queue wait times. In addition, the virtual cluster can be used for test and debug without directly impacting Stampede2. We conclude with a discussion of how science gateways can leverage the TACC Jobs API web service to incorporate this cloud bursting technique transparently to the end user. (6 pages, 0 figures; PEARC '18: Practice and Experience in Advanced Research Computing, July 22–26, 2018, Pittsburgh, PA, USA.)
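
    To make the bursting policy concrete, here is a minimal sketch, assuming a queue-wait threshold and a whitelist of applications whose virtual-cluster performance was found acceptable. The function name, threshold, and application set are illustrative assumptions, not TACC's actual policy or Jobs API.

```python
# Hypothetical sketch of the cloud-bursting decision described above: submit
# to the elastic virtual cluster when Stampede2's estimated queue wait exceeds
# a threshold AND the application is in the "acceptable performance" profile.
# The profile set and threshold below are assumptions for illustration.

CLOUD_OK = {"gromacs", "lammps", "openfoam"}  # assumed application profile

def choose_target(app: str, est_wait_s: float, threshold_s: float = 3600.0) -> str:
    """Return the system a job should be submitted to."""
    if app in CLOUD_OK and est_wait_s > threshold_s:
        return "jetstream-virtual-cluster"  # burst to the cloud
    return "stampede2"                      # default to the HPC system

print(choose_target("gromacs", est_wait_s=7200.0))  # -> jetstream-virtual-cluster
```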

    Using Containers to Create More Interactive Online Training and Education Materials

    Containers are excellent hands-on learning environments for computing topics because they are customizable, portable, and reproducible. The Cornell University Center for Advanced Computing has developed the Cornell Virtual Workshop on high performance computing topics for many years, and we have always sought to make the materials as rich and interactive as possible. Toward the goal of building a more hands-on, experiential learning experience directly into web-based online training environments, we developed the Cornell Container Runner Service (CCRS), which allows online content developers to build container-based interactive edit-and-run commands directly into their web pages. Using containers along with CCRS has the potential to increase learner engagement and outcomes. (10 pages, 3 figures; PEARC '20 conference paper.)
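
    As a rough illustration of the edit-and-run idea, the sketch below exposes a single HTTP endpoint that executes a submitted snippet inside a short-lived, sandboxed container. It is a minimal sketch using Flask and the Docker SDK, not the actual CCRS API or architecture.

```python
# Minimal sketch of a container-backed "run this snippet" service, in the
# spirit of CCRS. Illustrative only; requires a local Docker daemon plus the
# flask and docker packages.
import docker
from flask import Flask, request

app = Flask(__name__)
client = docker.from_env()

@app.post("/run")
def run_snippet():
    code = request.get_data(as_text=True)
    # Execute the submitted code in an isolated, resource-limited container.
    output = client.containers.run(
        "python:3.11-slim",
        ["python", "-c", code],
        remove=True,            # discard the container afterwards
        network_disabled=True,  # basic sandboxing
        mem_limit="128m",
    )
    return output.decode()

if __name__ == "__main__":
    app.run(port=8080)
```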

    Distributed workflows with Jupyter

    The designers of a new coordination interface for enacting complex workflows have to tackle a dichotomy: choosing a language-independent or language-dependent approach. Language-independent approaches decouple workflow models from the host code's business logic and promote portability. Language-dependent approaches foster flexibility and performance by adopting the same host language for business and coordination code. Jupyter Notebooks, with their capability to describe both imperative and declarative code in a single format, allow combining the best of the two approaches, maintaining a clear separation between application and coordination layers while still providing a unified interface to both aspects. We advocate the Jupyter Notebooks' potential to express complex distributed workflows, identifying the general requirements for a Jupyter-based Workflow Management System (WMS) and introducing a proof-of-concept portable implementation working on hybrid Cloud-HPC infrastructures. As a byproduct, we extended the vanilla IPython kernel with workflow-based parallel and distributed execution capabilities. The proposed Jupyter-workflow (Jw) system is evaluated on common scenarios for High Performance Computing (HPC) and Cloud, showing its potential in lowering the barriers between prototypical Notebooks and production-ready implementations.
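
    The core idea, treating notebook cells as workflow steps with explicit data dependencies so that independent cells can run concurrently, can be sketched in a few lines. The step names and scheduler below are a toy illustration of the approach, not the Jupyter-workflow implementation.

```python
# Toy sketch of a notebook-as-workflow scheduler: each "cell" is a step with
# named dependencies; steps whose dependencies are satisfied run in parallel.
from concurrent.futures import ThreadPoolExecutor

results = {}
steps = {  # step name -> (dependencies, body)
    "load":  (set(),    lambda: list(range(10))),
    "stats": ({"load"}, lambda: sum(results["load"]) / len(results["load"])),
    "plot":  ({"load"}, lambda: f"plot of {len(results['load'])} points"),
}

def run(dag):
    done = set()
    with ThreadPoolExecutor() as pool:
        while len(done) < len(dag):
            # "stats" and "plot" both depend only on "load", so once "load"
            # finishes they are submitted concurrently.
            ready = [s for s, (deps, _) in dag.items()
                     if s not in done and deps <= done]
            for s, fut in [(s, pool.submit(dag[s][1])) for s in ready]:
                results[s] = fut.result()
                done.add(s)

run(steps)
print(results["stats"])  # -> 4.5
```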

    Open Science via HUBzero: Exploring Five Science Gateways Supporting and Growing their Open Science Communities

    The research landscape applying computational methods has become increasingly interdisciplinary, and the research computing ecosystem increasingly complex, with novel hardware, software, data, and lab instruments. Reproducibility of research results, usability of tools, and sharing of methods are all crucial for timely collaboration in research and teaching. HUBzero is a widely used science gateway framework designed to support online communities with efficient sharing and publication processes. The paper discusses the growth of communities for five science gateways, nanoHUB, MyGeoHub, QUBEShub & SCORE, CUE4CHNG, and HubICL, which use the HUBzero platform to foster open science and open education with a diverse set of approaches and target communities. The presented methods and the magnitude of the communities elucidate successful means for science gateways to foster open science and open education.

    A CyberGIS Integration and Computation Framework for High‐Resolution Continental‐Scale Flood Inundation Mapping

    We present a Digital Elevation Model (DEM)-based hydrologic analysis methodology for continental flood inundation mapping (CFIM), implemented as a cyberGIS scientific workflow in which a 1/3rd arc-second (10 m) Height Above Nearest Drainage (HAND) raster for the conterminous U.S. (CONUS) was computed and employed for subsequent inundation mapping. A cyberGIS framework was developed to enable spatiotemporal integration and scalable computing of the entire inundation mapping process on a hybrid supercomputing architecture. The first 1/3rd arc-second CONUS HAND raster dataset was computed in 1.5 days on the CyberGIS ROGER supercomputer. The inundation mapping process developed in our exploratory study couples HAND with National Water Model (NWM) forecast data to enable near real-time inundation forecasts for CONUS. The computational performance of HAND and the inundation mapping process was profiled to gain insights into their computational characteristics in high-performance parallel computing scenarios. The establishment of the CFIM computational framework has broad and significant research implications that may lead to further development and improvement of flood inundation mapping methodologies.
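
    The mapping step itself reduces to a simple per-cell comparison: a cell is inundated when the forecast water stage of its nearest drainage exceeds the cell's HAND value, with depth given by the difference. The NumPy sketch below uses toy values; the real workflow applies this to the 10 m CONUS HAND raster with per-reach NWM stage forecasts.

```python
# HAND-based inundation mapping in miniature: depth = max(stage - HAND, 0).
import numpy as np

hand = np.array([[0.2, 1.5, 3.0],
                 [0.8, 2.2, 4.1]])  # metres above nearest drainage (toy data)
stage = 2.0                         # forecast water stage (m) for this reach

depth = np.clip(stage - hand, 0.0, None)  # inundation depth; 0 where dry
flooded = depth > 0

print(depth)
print(f"{flooded.sum()} of {flooded.size} cells inundated")
```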

    Performance analysis and optimization of in-situ integration of simulation with data analysis: zipping applications up

    This paper targets an important class of applications that requires combining HPC simulations with data analysis for online or real-time scientific discovery. We use state-of-the-art parallel-I/O and data-staging libraries to build simulation-time data analysis workflows, and conduct performance analysis with real-world applications of computational fluid dynamics (CFD) simulations and molecular dynamics (MD) simulations. Driven by in-depth analysis of performance inefficiencies, we design an end-to-end application-level approach to eliminating the interlocks and synchronizations present in existing methods. Our new approach employs both task parallelism and pipeline parallelism to reduce synchronizations effectively. In addition, we design a fully asynchronous, fine-grain, pipelining runtime system named Zipper. Zipper is a multi-threaded distributed runtime system that executes in a layer below the simulation and analysis applications. To further reduce the simulation application's stall time and enhance data transfer performance, we design a concurrent data transfer optimization that uses both the HPC network and the parallel file system for improved bandwidth. The scalability of the Zipper system has been verified by a performance model and various large-scale empirical experiments. The experimental results on an Intel multicore cluster as well as a Knights Landing HPC system demonstrate that the Zipper-based approach can outperform the fastest state-of-the-art I/O transport library by up to 220% using 13,056 processor cores.
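
    The pipeline-parallel part of this design can be illustrated with a bounded staging buffer that lets the simulation keep computing while analysis drains earlier steps. This is a minimal single-node sketch of the idea, not the Zipper runtime itself.

```python
# Sketch of pipeline parallelism between simulation and in-situ analysis:
# the simulation pushes each step's output into a bounded queue and moves on,
# while a separate thread analyzes the data asynchronously.
import queue
import threading
import time

staging = queue.Queue(maxsize=4)  # bounded buffer decouples the two stages

def simulate(n_steps: int) -> None:
    for step in range(n_steps):
        time.sleep(0.01)             # stand-in for one simulation step
        staging.put(("step", step))  # asynchronous hand-off, no interlock
    staging.put(None)                # sentinel: simulation finished

def analyze() -> None:
    while (item := staging.get()) is not None:
        pass                         # stand-in for analyzing one step's data

t = threading.Thread(target=analyze)
t.start()
simulate(100)
t.join()
print("simulation and analysis ran overlapped")
```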

    UA12/2/1 College Heights Herald, Vol. 93, No. 34

    WKU campus newspaper reporting campus, athletic and Bowling Green, Kentucky news. This issue contains articles: Kast, Monica & Andrew Henderson. Job Eliminations Trim $5 Million in Salaries from Budget Collins, Emma. Admission Requirement Could Change Diversity Hicks, Amelia. WKU Police Department Initiates an Emergency Alert System Hicks, Amelia. Community Members Invited to March About Firearms Eiler, Olivia. Nursing Students Exceed Pass Rate for NCLEX Aud, Shawna. Bring Your Focus Back to Healthy After Break. Vogler, Emily. Editorial Cartoon re: Teacher Pension Protest Pension Protest: Give Teachers a Seat at the Table Austin, Emma. iPhone Escapades: The One with the Armed Robbery Leonard, Nicole. Talking Politics Kast, Monica. Female Protagonists Rejoice – It’s Women’s History Month Fletcher, Griffin. Guthrie Bell Tower Now Fully Computerized Huey, Nic. Spring Break Recap Deppen, Laurel. Cape-Able – Brendan Ward, Red Towel Deppen, Laurel. WKU Alumni Start Online Fashion Empire Pink Lily Kizer, Drake. Former Wrestler Hillbilly Jim Morris to Enter World Wrestling Entertainment Hall of Fame Collins, Emma. WKU Student Activists Travel to Washington, DC McCarthy, Casey. Hilltoppers Struggle in Spring Break Competition – Softball Stahl, Matt. Missed Opportunities Lead to Series Loss Against Middle Tennessee State University Chisenhall, Jeremy. Questions to Be Answered During Spring Football Jessie, Alec. Ides of March – Basketball Porter, Sam. Hilltoppers Embrace Postseason Play in NIT – Basketball.

    A Study of Emerging HPC Applications and Their Impact on Scheduling Strategies

    With the expected convergence between HPC, Big Data, and AI, new applications with different profiles are coming to HPC infrastructures. We aim at better understanding the features and needs of these applications in order to run them efficiently on HPC platforms. The approach followed is bottom-up: we thoroughly study an emerging application, Spatially Localized Atlas Network (SLANT, originating from the neuroscience community), to understand its behavior. Based on these observations, we derive a generic, yet simple, application model (namely, a linear sequence of stochastic jobs). We expect this model to be representative of a large set of upcoming applications that require the computational power of HPC clusters without fitting the typical behavior of large-scale traditional applications. In a second step, we show how one can use this generic model in a scheduling framework. Specifically, we consider the problem of making reservations (of both time and memory) for an execution on an HPC platform, and derive solutions using the model from the first step of this work. We experimentally show the robustness of the model, even when very little data or a different application is used to generate it, and demonstrate that our solutions outperform the standard approaches of the neuroscience community.
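
    A worked example of the reservation problem may clarify the model. Assume the job's walltime follows a known discrete distribution and that a failed reservation is paid in full while a successful one is paid only up to completion; both assumptions are for illustration, and the report studies richer cost models that also cover memory. The expected cost of a reservation sequence can then be computed as follows.

```python
# Hedged sketch of the reservation model: book increasing reservations
# t1 < t2 < ...; whenever the stochastic walltime w outlives t_i, pay t_i in
# full and retry with t_{i+1}. The cost model is an illustrative assumption.

def expected_cost(reservations, walltimes):
    """walltimes: dict mapping a possible walltime -> its probability."""
    cost = 0.0
    for w, p in walltimes.items():
        paid = 0.0
        for t in reservations:
            if w <= t:
                paid += w   # job finishes within this reservation
                break
            paid += t       # reservation too short: pay it all and retry
        cost += p * paid
    return cost

dist = {2.0: 0.5, 5.0: 0.3, 9.0: 0.2}    # toy walltime distribution (hours)
print(expected_cost([3.0, 10.0], dist))  # short attempt, then a long one
print(expected_cost([10.0], dist))       # single conservative reservation
```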

    Workflow models for heterogeneous distributed systems

    The role of data in modern scientific workflows is becoming more and more crucial. The unprecedented amount of data available in the digital era, combined with recent advancements in Machine Learning and High-Performance Computing (HPC), has let computers surpass human performance in a wide range of fields, such as Computer Vision, Natural Language Processing, and Bioinformatics. However, a solid data management strategy is essential for key aspects like performance optimisation, privacy preservation, and security. Most modern programming paradigms for Big Data analysis adhere to the principle of data locality: moving computation closer to the data to remove transfer-related overheads and risks. Still, there are scenarios in which it is worthwhile, or even unavoidable, to transfer data between different steps of a complex workflow. The contribution of this dissertation is twofold. First, it defines a novel methodology for distributed modular applications, allowing topology-aware scheduling and data management while separating business logic, data dependencies, parallel patterns, and execution environments. In addition, it introduces computational notebooks as a high-level and user-friendly interface to this new kind of workflow, aiming to flatten the learning curve and improve the adoption of such a methodology. Each of these contributions is accompanied by a full-fledged, open-source implementation, which has been used for evaluation purposes and allows the interested reader to experience the related methodology first-hand. The validity of the proposed approaches has been demonstrated on a total of five real scientific applications in the domains of Deep Learning, Bioinformatics, and Molecular Dynamics Simulation, executing them on large-scale mixed cloud-HPC infrastructures.
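
    The first contribution, separating a step's business logic from where it runs, can be sketched with two small records: one describing a deployment target and one binding a step's code to it. The classes below are hypothetical stand-ins for the dissertation's open-source implementation, meant only to show the separation of concerns.

```python
# Illustrative sketch: business logic (Step.run) is declared independently of
# the execution location (Deployment), so a topology-aware scheduler can bind
# and re-bind steps to hybrid cloud/HPC targets. Hypothetical structure.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Deployment:
    name: str       # e.g. "hpc-frontend" or "cloud-k8s"
    connector: str  # how to reach it: "ssh", "kubernetes", "local", ...

@dataclass
class Step:
    run: Callable[..., object]  # business logic only
    target: Deployment          # placement, decided separately

hpc = Deployment("hpc-frontend", "ssh")
cloud = Deployment("cloud-k8s", "kubernetes")

train = Step(run=lambda data: f"model({data})", target=hpc)
report = Step(run=lambda model: f"report({model})", target=cloud)

# A real scheduler would dispatch via target.connector and move data between
# sites; here we just run locally to show the data flow between steps.
print(report.run(train.run("dataset")))
```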