Interactive Supercomputing with Jupyter
Rich user interfaces like Jupyter have the potential to make interacting with a supercomputer easier and more productive, consequently attracting new kinds of users and helping to expand the application of supercomputing to new science domains. For the scientist user, the ideal rich user interface delivers a familiar, responsive, introspective, modular, and customizable platform upon which to build, run, capture, document, re-run, and share analysis workflows. From the provider or system administrator perspective, such a platform would also be easy to configure, deploy securely, update, customize, and support. Jupyter checks most if not all of these boxes. But from the perspective of leadership computing organizations that provide supercomputing power to users, such a platform should also make the unique features of a supercomputer center more accessible to users and more composable with high-performance computing (HPC) workflows. Project Jupyter's (https://jupyter.org/about) core design philosophy of extensibility, abstraction, and agnostic deployment has allowed HPC centers like the National Energy Research Scientific Computing Center (NERSC) to bring in advanced supercomputing capabilities that extend the interactive notebook environment. This has enabled a rich scientific discovery platform, particularly for experimental facility data analysis and machine learning problems.
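As a concrete illustration of the kernel abstraction that makes this extensibility possible, the sketch below registers a Jupyter kernelspec whose kernel process is launched on a compute node through Slurm's srun, so the notebook front end can stay on a login or service node. This is a minimal sketch under stated assumptions, not NERSC's implementation; the paths and Slurm flags are illustrative.

```python
# Sketch: register a kernelspec whose kernel runs on a Slurm compute node.
# Assumes srun is available where the notebook server runs and that compute
# nodes can reach the notebook server's ZMQ ports. Paths/flags are illustrative.
import json
import os

kernel_dir = os.path.expanduser("~/.local/share/jupyter/kernels/slurm-python")
os.makedirs(kernel_dir, exist_ok=True)

kernelspec = {
    "display_name": "Python (Slurm compute node)",
    "language": "python",
    # Jupyter substitutes {connection_file} with the path to the ZMQ
    # connection file; srun places the kernel process on an allocated node.
    "argv": [
        "srun", "--ntasks=1", "--time=01:00:00",
        "python", "-m", "ipykernel_launcher", "-f", "{connection_file}",
    ],
}

with open(os.path.join(kernel_dir, "kernel.json"), "w") as f:
    json.dump(kernelspec, f, indent=2)
```

Once registered, the kernel appears in the notebook launcher like any other, which is the point of the abstraction: the interface stays the same while the execution backend changes.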
The Materials Application Programming Interface (API): A simple, flexible and efficient API for materials data based on REpresentational State Transfer (REST) principles
In this paper, we describe the Materials Application Programming Interface (API), a simple, flexible, and efficient interface to programmatically query and interact with the Materials Project database, based on the REpresentational State Transfer (REST) pattern for the web. Since its creation in August 2012, the Materials API has been the Materials Project's de facto platform for data access, supporting not only the Materials Project's many collaborative efforts but also enabling new applications and analyses. We highlight some of these analyses enabled by the Materials API, particularly those requiring consolidation of data on a large number of materials, such as data mining of structural and property trends and the generation of phase diagrams. We conclude with a discussion of the role of the API in building a community that is developing novel applications and analyses based on Materials Project data.
Interactive Distributed Deep Learning with Jupyter Notebooks
Deep learning researchers are increasingly using Jupyter notebooks to implement interactive, reproducible workflows with embedded visualization, steering, and documentation. Such solutions are typically deployed on small-scale (e.g., single-server) computing systems. However, as the sizes and complexities of datasets and associated neural network models increase, high-performance distributed systems become important for training and evaluating models in a feasible amount of time. In this paper, we describe our vision for Jupyter notebook solutions that deploy deep learning workloads onto high-performance computing systems. We demonstrate the effectiveness of notebooks for distributed training and hyper-parameter optimization of deep neural networks with efficient, scalable backends.
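One way to sketch this pattern is with ipyparallel, a tool commonly paired with Jupyter for exactly this kind of notebook-driven parallelism; the paper's exact stack may differ. The sketch assumes an IPython cluster (controller plus engines) has already been started on compute nodes, and the training function body is a placeholder.

```python
# Sketch: drive distributed work from a notebook via ipyparallel.
# Assumes a controller and engines are already running (e.g. started
# through the batch system); the training function is a stand-in.
import ipyparallel as ipp

rc = ipp.Client()  # connect to the running cluster
view = rc[:]       # a DirectView over all engines

def train_one_worker(config):
    # Placeholder: each engine would build its data partition and run
    # synchronized training (e.g. via Horovod or similar) here.
    import os
    return {"pid": os.getpid(), "lr": config["lr"]}

# Broadcast one hyper-parameter configuration to every engine and block
# until all workers return; results stream back into the notebook.
results = view.apply_sync(train_one_worker, {"lr": 1e-3})
print(results)
```

Sweeping `view.apply_sync` over a list of configurations gives the interactive hyper-parameter optimization loop the abstract alludes to, with results landing directly in the notebook for inspection.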
A basis set for exploration of sensitivity to prescribed ocean conditions for estimating human contributions to extreme weather in CAM5.1-1degree
This paper presents two contributions to research into better understanding the role of anthropogenic warming in extreme weather. The first contribution is the generation of a large number of multi-decadal simulations using a medium-resolution atmospheric climate model, CAM5.1-1degree, under two scenarios of historical climate following the protocols of the C20C+ Detection and Attribution project: the one we have experienced (All-Hist), and one that might have been experienced in the absence of human interference with the climate system (Nat-Hist). These simulations are specifically designed for understanding extreme weather and atmospheric variability in the context of anthropogenic climate change. The second contribution takes advantage of the duration and size of these simulations to identify features of variability in the prescribed ocean conditions that may strongly influence calculated estimates of the role of anthropogenic emissions in extreme weather frequency (event attribution). There is large uncertainty in how much anthropogenic emissions should warm regional ocean surface temperatures, yet contributions to the C20C+ Detection and Attribution project and similar efforts have so far used only one or a limited number of possible estimates of the ocean warming attributable to anthropogenic emissions when generating their Nat-Hist simulations. Thus, the importance of the uncertainty in regional attributable warming estimates to the results of event attribution studies is poorly understood. The identification of features of the anomalous ocean state that seem to strongly influence event attribution estimates should therefore serve as a basis set for effective sampling of other plausible attributable warming patterns. The identification performed in this paper examines monthly temperature and precipitation output from the CAM5.1-1degree simulations averaged over 237 land regions, and compares interannual anomalous variations in the ratio between the frequencies of extremes in the All-Hist and Nat-Hist simulations against variations in ocean temperatures.
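The central quantity compared here, the ratio between extreme-event frequencies in the two scenarios, can be illustrated with a short calculation. The sketch below computes such a risk ratio from two synthetic ensembles; the data, threshold choice, and region handling are illustrative stand-ins, not the paper's data or exact method.

```python
# Sketch: risk ratio of extreme-event frequencies between two ensembles.
# Synthetic stand-ins for monthly anomalies over one land region, pooled
# across ensemble members and years in each scenario.
import numpy as np

rng = np.random.default_rng(0)
all_hist = rng.normal(loc=0.6, scale=1.0, size=50_000)  # "as experienced"
nat_hist = rng.normal(loc=0.0, scale=1.0, size=50_000)  # "natural" world

# Define "extreme" as exceeding the 95th percentile of the natural climate.
threshold = np.percentile(nat_hist, 95)

p_all = np.mean(all_hist > threshold)  # exceedance frequency, All-Hist
p_nat = np.mean(nat_hist > threshold)  # exceedance frequency, Nat-Hist

risk_ratio = p_all / p_nat
print(f"P(extreme | All-Hist) = {p_all:.3f}")
print(f"P(extreme | Nat-Hist) = {p_nat:.3f}")
print(f"risk ratio = {risk_ratio:.2f}")
```

The paper's analysis then asks how interannual variations in this ratio co-vary with anomalies in the prescribed ocean state, which is what makes the uncertainty in attributable ocean warming matter.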
Towards Interactive, Reproducible Analytics at Scale on HPC Systems
The growth in scientific data volumes has resulted in a need to scale up processing and analysis pipelines using High Performance Computing (HPC) systems. These workflows need interactive, reproducible analytics at scale. The Jupyter platform provides core capabilities for interactivity but was not designed for HPC systems. In this paper, we outline our efforts to bring together core technologies based on the Jupyter platform to create interactive, reproducible analytics at scale on HPC systems. Our work is grounded in a real-world science use case: applying geophysical simulations and inversions for imaging the subsurface. Our core platform addresses three key areas of the scientific analysis workflow: reproducibility, scalability, and interactivity. We describe our implementation of a system using the Binder, Science Capsule, and Dask software. We demonstrate the use of this software to run our use case and interactively visualize real-time streams of HDF5 data.
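The scalability piece of such a stack can be sketched in a few lines: attach the notebook to a running Dask cluster and treat an HDF5 dataset as a chunked Dask array, so reductions run in parallel while only small results return to the notebook. The scheduler address, file path, and dataset name below are illustrative assumptions, not taken from the paper.

```python
# Sketch: notebook-driven parallel analysis of HDF5 data with Dask.
# Assumes a Dask scheduler is already running on the HPC system (e.g.
# launched via dask-jobqueue or a batch script) and that workers share
# the same filesystem as the notebook.
import h5py
import dask.array as da
from dask.distributed import Client

client = Client("tcp://scheduler-node:8786")  # illustrative address

f = h5py.File("survey_data.h5", "r")          # hypothetical data file
x = da.from_array(f["/measurements"],         # hypothetical dataset
                  chunks=(10_000, 256))       # chunking enables parallelism

# Lazily defined reductions execute across the cluster; only the small
# reduced result comes back to the notebook for visualization.
channel_means = x.mean(axis=0).compute()
print(channel_means[:5])
```

Wrapping such a notebook in Binder pins its software environment, which is how the interactivity and reproducibility goals the abstract names fit together.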