160 research outputs found

    Deploying Jupyter Notebooks at scale on XSEDE resources for Science Gateways and workshops

    Full text link
    Jupyter Notebooks have become a mainstream tool for interactive computing in every field of science. Jupyter Notebooks are suitable as companion applications for Science Gateways, providing more flexibility and post-processing capability to the users. Moreover they are often used in training events and workshops to provide immediate access to a pre-configured interactive computing environment. The Jupyter team released the JupyterHub web application to provide a platform where multiple users can login and access a Jupyter Notebook environment. When the number of users and memory requirements are low, it is easy to setup JupyterHub on a single server. However, setup becomes more complicated when we need to serve Jupyter Notebooks at scale to tens or hundreds of users. In this paper we will present three strategies for deploying JupyterHub at scale on XSEDE resources. All options share the deployment of JupyterHub on a Virtual Machine on XSEDE Jetstream. In the first scenario, JupyterHub connects to a supercomputer and launches a single node job on behalf of each user and proxies back the Notebook from the computing node back to the user's browser. In the second scenario, implemented in the context of a XSEDE consultation for the IRIS consortium for Seismology, we deploy Docker in Swarm mode to coordinate many XSEDE Jetstream virtual machines to provide Notebooks with persistent storage and quota. In the last scenario we install the Kubernetes containers orchestration framework on Jetstream to provide a fault-tolerant JupyterHub deployment with a distributed filesystem and capability to scale to thousands of users. In the conclusion section we provide a link to step-by-step tutorials complete with all the necessary commands and configuration files to replicate these deployments.Comment: 7 pages, 3 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22--26, 2018, Pittsburgh, PA, US

    Enabling Interactive Analytics of Secure Data using Cloud Kotta

    Full text link
    Research, especially in the social sciences and humanities, is increasingly reliant on the application of data science methods to analyze large amounts of (often private) data. Secure data enclaves provide a solution for managing and analyzing private data. However, such enclaves do not readily support discovery science---a form of exploratory or interactive analysis by which researchers execute a range of (sometimes large) analyses in an iterative and collaborative manner. The batch computing model offered by many data enclaves is well suited to executing large compute tasks; however it is far from ideal for day-to-day discovery science. As researchers must submit jobs to queues and wait for results, the high latencies inherent in queue-based, batch computing systems hinder interactive analysis. In this paper we describe how we have augmented the Cloud Kotta secure data enclave to support collaborative and interactive analysis of sensitive data. Our model uses Jupyter notebooks as a flexible analysis environment and Python language constructs to support the execution of arbitrary functions on private data within this secure framework.Comment: To appear in Proceedings of Workshop on Scientific Cloud Computing, Washington, DC USA, June 2017 (ScienceCloud 2017), 7 page

    Capturing the "Whole Tale" of Computational Research: Reproducibility in Computing Environments

    Full text link
    We present an overview of the recently funded "Merging Science and Cyberinfrastructure Pathways: The Whole Tale" project (NSF award #1541450). Our approach has two nested goals: 1) deliver an environment that enables researchers to create a complete narrative of the research process including exposure of the data-to-publication lifecycle, and 2) systematically and persistently link research publications to their associated digital scholarly objects such as the data, code, and workflows. To enable this, Whole Tale will create an environment where researchers can collaborate on data, workspaces, and workflows and then publish them for future adoption or modification. Published data and applications will be consumed either directly by users using the Whole Tale environment or can be integrated into existing or future domain Science Gateways

    Workshop Report: Container Based Analysis Environments for Research Data Access and Computing

    Get PDF
    Report of the first workshop on Container Based Analysis Environments for Research Data Access and Computing supported by the National Data Service and Data Exploration Lab and held at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign

    Towards Computational Notebooks for IoT Development

    Get PDF
    Internet of Things systems are complex to develop. They are required to exhibit various features and run across several environments. Software developers have to deal with this heterogeneity both when configuring the development and execution environments and when writing the code. Meanwhile, computational notebooks have been gaining prominence due to their capability to consolidate text, executable code, and visualizations in a single document. Although they are mainly used in the field of data science, the characteristics of such notebooks could make them suitable to support the development of IoT systems as well. This work proposes an IoT-tailored literate computing approach in the form of a computational notebook. We present a use case of a typical IoT system involving several interconnected components and describe the implementation of a computational notebook as a tool to support its development. Finally, we point out the opportunities and limitations of this approach

    Software Engineering in the IoT Context: Characteristics, Challenges, and Enabling Strategies

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Streamlined HPC Environments with CVMFS and CyberGIS-Compute

    Get PDF
    High-Performance Computing (HPC) resources provide the potential for complex, large-scale modeling and analysis, fueling scientific progress over the last few decades, but these advances are not equally distributed across disciplines. Those in computational disciplines are often trained to have the necessary technical skills to utilize HPC (e.g. familiarity with the terminal), but many disciplines face technical hurdles when trying to apply HPC resources to their work. This unequal familiarity with HPC is increasingly a problem as cross-discipline teams work to tackle critical interdisciplinary issues like climate change and sustainability. CyberGIS-Compute is middle-ware designed to democratize to HPC services with the goal of empowering domain scientists, but a key challenge facing model developers on CyberGIS-Compute is creating a containerized software environment for their models. In this paper, we discuss our work to integrate the Cern Virtual Machine File System (CVMFS) into CyberGIS-Compute to provide consistent software environments across science gateways and HPC resources
    • …
    corecore