2 research outputs found
Cyberinfrastructure Deployments on Public Research Clouds Enable Accessible Environmental Data Science Education
Modern science depends on computers, but not all scientists have access to the scale of computation they need. A digital divide separates scientists who accelerate their science using large cyberinfrastructure from those who do not, or who do not have access to the compute resources or learning opportunities to develop the skills needed. The exclusionary nature of the digital divide threatens equity and the future of innovation by leaving people out of the scientific process while over-amplifying the voices of a small group who have resources. However, there are potential solutions: recent advancements in public research cyberinfrastructure and resources developed during the open science revolution are providing tools that can help bridge this divide. These tools can enable access to fast and powerful computation with modest internet connections and personal computers. Here we contribute another resource for narrowing the digital divide: scalable virtual machines running on public cloud infrastructure. We describe the tools, infrastructure, and methods that enabled successful deployment of a reproducible and scalable cyberinfrastructure architecture for a collaborative data synthesis working group in February 2023. This platform enabled 45 scientists with varying data and compute skills to leverage 40,000 hours of compute time over a 4-day workshop. Our approach provides an open framework that can be replicated for educational and collaborative data synthesis experiences in any data- and compute-intensive discipline
Recommended from our members
FIRED POLAND
This is event-level polygons for the fire event delineation (FIRED) product for POLAND from November 2000 to January 2024. It is derived from the MODIS MCD64A1 burned area product (see https://lpdaac.usgs.gov/products/mcd64a1v061/ for more details). The MCD64A1 is a monthly raster grid of estimated burned dates. Firedpy (https://github.com/earthlab/firedpy) is an algorithm that converts these rasters into events by stacking the entire time series into a spatial-temporal data cube, then uses an algorithm to assign event identification numbers to pixels that fit into the same 3-dimensional spatial temporal window. This particular dataset was created using a spatial parameter of 1 pixels and 5 days. See the associated paper for more details on the methods and more:Balch, J.K.; St. Denis, L.A.; Mahood, A.L.; Mietkiewicz, N.P.; Williams, T.M.; McGlinchy, J.; Cook, M.C. FIRED (Fire Events Delineation): An Open, Flexible Algorithm and Database of US Fire Events Derived from the MODIS Burned Area Product (2001–2019). Remote Sens. 2020, 12, 3498. https://doi.org/10.3390/rs12213498</p