Deploying Jupyter Notebooks at scale on XSEDE resources for Science Gateways and workshops

Author: Jette Morris A.
Kluyver Thomas
Weil Sage A
Wilkins-Diehr Nancy
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 25/07/2018

Jupyter Notebooks have become a mainstream tool for interactive computing in every field of science. They are suitable as companion applications for Science Gateways, giving users more flexibility and post-processing capability. Moreover, they are often used in training events and workshops to provide immediate access to a pre-configured interactive computing environment. The Jupyter team released the JupyterHub web application to provide a platform where multiple users can log in and access a Jupyter Notebook environment. When the number of users and the memory requirements are low, it is easy to set up JupyterHub on a single server. However, setup becomes more complicated when we need to serve Jupyter Notebooks at scale to tens or hundreds of users. In this paper we present three strategies for deploying JupyterHub at scale on XSEDE resources. All three deploy JupyterHub on a virtual machine on XSEDE Jetstream. In the first scenario, JupyterHub connects to a supercomputer, launches a single-node job on behalf of each user, and proxies the Notebook from the compute node back to the user's browser. In the second scenario, implemented in the context of an XSEDE consultation for the IRIS consortium for seismology, we deploy Docker in Swarm mode to coordinate many XSEDE Jetstream virtual machines and provide Notebooks with persistent storage and quotas. In the last scenario, we install the Kubernetes container orchestration framework on Jetstream to provide a fault-tolerant JupyterHub deployment with a distributed filesystem and the capability to scale to thousands of users. In the conclusion section we provide a link to step-by-step tutorials complete with all the necessary commands and configuration files to replicate these deployments.
Comment: 7 pages, 3 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA, US

arXiv.org e-Print Archive

Crossref

Enabling Interactive Analytics of Secure Data using Cloud Kotta

Author: Babuji Yadu N.
Chard Kyle
Duede Eamon
Publication date: 28/04/2017

Research, especially in the social sciences and humanities, is increasingly reliant on the application of data science methods to analyze large amounts of (often private) data. Secure data enclaves provide a solution for managing and analyzing private data. However, such enclaves do not readily support discovery science, a form of exploratory or interactive analysis in which researchers execute a range of (sometimes large) analyses in an iterative and collaborative manner. The batch computing model offered by many data enclaves is well suited to executing large compute tasks; however, it is far from ideal for day-to-day discovery science. Because researchers must submit jobs to queues and wait for results, the high latencies inherent in queue-based batch computing systems hinder interactive analysis. In this paper we describe how we have augmented the Cloud Kotta secure data enclave to support collaborative and interactive analysis of sensitive data. Our model uses Jupyter notebooks as a flexible analysis environment and Python language constructs to support the execution of arbitrary functions on private data within this secure framework.
Comment: To appear in Proceedings of Workshop on Scientific Cloud Computing, Washington, DC USA, June 2017 (ScienceCloud 2017), 7 pages

arXiv.org e-Print Archive

Crossref

Capturing the "Whole Tale" of Computational Research: Reproducibility in Computing Environments

Author: Chard Kyle
Gaffney Niall
Jones Matthew B.
Ludaescher Bertram
Nabrzyski Jaroslaw
Stodden Victoria
Turk Matthew
Publication date: 28/10/2016

We present an overview of the recently funded "Merging Science and Cyberinfrastructure Pathways: The Whole Tale" project (NSF award #1541450). Our approach has two nested goals: 1) deliver an environment that enables researchers to create a complete narrative of the research process, including exposure of the data-to-publication lifecycle, and 2) systematically and persistently link research publications to their associated digital scholarly objects such as the data, code, and workflows. To enable this, Whole Tale will create an environment where researchers can collaborate on data, workspaces, and workflows and then publish them for future adoption or modification. Published data and applications will either be consumed directly by users of the Whole Tale environment or be integrated into existing or future domain Science Gateways.

arXiv.org e-Print Archive

FigShare

Workshop Report: Container Based Analysis Environments for Research Data Access and Computing

Author: Bauer Greg
Burnette Max
Carrasco Kind Matias
Fonner John
Haas Roland
Huerta Eliu
Kim Jai Won
Kowalik Kapcer
LeBauer David
Lemson Gerard
Liu Yan
McEwen Ian
Tao Jian
Terstriep Jeff
Turk Matt
Willis Craig
Zonca Andrea

Report of the first workshop on Container Based Analysis Environments for Research Data Access and Computing, supported by the National Data Service and Data Exploration Lab and held at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.

ZENODO

Scaling Notebooks as Re-configurable Cloud Workflows

Author: Bianchi R.
Kissling W.D.
Koulouzis S.
Li N.
Shi Y.
Timmermans J.
Wang Y.
Zhao Z.
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2022

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Towards Computational Notebooks for IoT Development

Author: Corno F.
De Russis L
Saenz J. P.
Publication venue: 'Association for Computing Machinery (ACM)'

Internet of Things systems are complex to develop. They are required to exhibit various features and run across several environments. Software developers have to deal with this heterogeneity both when configuring the development and execution environments and when writing the code. Meanwhile, computational notebooks have been gaining prominence due to their capability to consolidate text, executable code, and visualizations in a single document. Although they are mainly used in the field of data science, the characteristics of such notebooks could make them suitable to support the development of IoT systems as well. This work proposes an IoT-tailored literate computing approach in the form of a computational notebook. We present a use case of a typical IoT system involving several interconnected components and describe the implementation of a computational notebook as a tool to support its development. Finally, we point out the opportunities and limitations of this approach.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Software Engineering in the IoT Context: Characteristics, Challenges, and Enabling Strategies

Author: SAENZ MORENO JUAN PABLO
Publication venue: country:Italy
Publication date: 15/07/2020

The abstract is in the attachment.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Streamlined HPC Environments with CVMFS and CyberGIS-Compute

Author: Kotak Mit
Michels Alexander C
Padmanabhan Anand
Wang Shaowen
Publication venue: 'Purdue University (bepress)'
Publication date: 06/10/2023

High-Performance Computing (HPC) resources provide the potential for complex, large-scale modeling and analysis, fueling scientific progress over the last few decades, but these advances are not equally distributed across disciplines. Those in computational disciplines are often trained to have the necessary technical skills to utilize HPC (e.g. familiarity with the terminal), but many disciplines face technical hurdles when trying to apply HPC resources to their work. This unequal familiarity with HPC is increasingly a problem as cross-discipline teams work to tackle critical interdisciplinary issues like climate change and sustainability. CyberGIS-Compute is middleware designed to democratize access to HPC services with the goal of empowering domain scientists, but a key challenge facing model developers on CyberGIS-Compute is creating a containerized software environment for their models. In this paper, we discuss our work to integrate the CERN Virtual Machine File System (CVMFS) into CyberGIS-Compute to provide consistent software environments across science gateways and HPC resources.

Purdue E-Pubs