Capturing and viewing gigapixel images
We present a system to capture and view "Gigapixel images": very high resolution, high dynamic range, and wide angle imagery consisting of several billion pixels each. A specialized camera mount, in combination with an automated pipeline for alignment, exposure compensation, and stitching, provides the means to acquire Gigapixel images with a standard camera and lens. More importantly, our novel viewer enables exploration of such images at interactive rates over a network, while dynamically and smoothly interpolating the projection between perspective and curved projections, and simultaneously modifying the tone-mapping to ensure an optimal view of the portion of the scene being viewed.
AdsorbML: A Leap in Efficiency for Adsorption Energy Calculations using Generalizable Machine Learning Potentials
Computational catalysis is playing an increasingly significant role in the
design of catalysts across a wide range of applications. A common task for many
computational methods is the need to accurately compute the adsorption energy
for an adsorbate and a catalyst surface of interest. Traditionally, the
identification of low energy adsorbate-surface configurations relies on
heuristic methods and researcher intuition. As the desire to perform
high-throughput screening increases, it becomes challenging to use heuristics
and intuition alone. In this paper, we demonstrate machine learning potentials
can be leveraged to identify low energy adsorbate-surface configurations more
accurately and efficiently. Our algorithm provides a spectrum of trade-offs
between accuracy and efficiency, with one balanced option finding the lowest
energy configuration 87.36% of the time, while achieving a 2000x speedup in
computation. To standardize benchmarking, we introduce the Open Catalyst Dense
dataset containing nearly 1,000 diverse surfaces and 100,000 unique
configurations.
Comment: 26 pages, 7 figures. Submitted to npj Computational Material
Scale-MAE: A Scale-Aware Masked Autoencoder for Multiscale Geospatial Representation Learning
Large, pretrained models are commonly finetuned with imagery that is heavily
augmented to mimic different conditions and scales, with the resulting models
used for various tasks with imagery from a range of spatial scales. Such models
overlook scale-specific information in the data for scale-dependent domains,
such as remote sensing. In this paper, we present Scale-MAE, a pretraining
method that explicitly learns relationships between data at different, known
scales throughout the pretraining process. Scale-MAE pretrains a network by
masking an input image at a known input scale, where the area of the Earth
covered by the image determines the scale of the ViT positional encoding, not
the image resolution. Scale-MAE encodes the masked image with a standard ViT
backbone, and then decodes the masked image through a bandpass filter to
reconstruct low/high frequency images at lower/higher scales. We find that
tasking the network with reconstructing both low/high frequency images leads to
robust multiscale representations for remote sensing imagery. Scale-MAE
achieves an average non-parametric kNN classification
improvement across eight remote sensing datasets compared to the current
state of the art and obtains an mIoU improvement on the
SpaceNet building segmentation transfer task for a range of evaluation scales.
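The scale-aware positional encoding described in this abstract can be illustrated with a minimal sketch. It assumes a standard 1D sine-cosine encoding whose position index is multiplied by the ratio of the image's ground sample distance to a reference value. The function name gsd_positional_encoding, its parameters, and the default reference_gsd are hypothetical and are not taken from the paper's code.

```python
import math


def gsd_positional_encoding(num_positions, dim, gsd, reference_gsd=1.0):
    """Sine-cosine positional encoding with positions scaled by gsd / reference_gsd.

    Hypothetical sketch: gsd is the ground distance covered by one pixel,
    and reference_gsd is an assumed normalization constant.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    scale = gsd / reference_gsd
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim // 2):
            # Same frequency ladder as a standard transformer encoding,
            # but the position is stretched by the ground-distance ratio.
            angle = scale * pos / (10000 ** (2 * i / dim))
            row.extend([math.sin(angle), math.cos(angle)])
        table.append(row)
    return table


# A patch at index 1 in a coarse image (gsd=2.0) receives the same encoding
# as a patch at index 2 in a fine image (gsd=1.0): both lie at the same
# ground distance from the image origin.
coarse = gsd_positional_encoding(4, 8, gsd=2.0)
fine = gsd_positional_encoding(8, 8, gsd=1.0)
```

Under these assumptions, two images with the same pixel dimensions but different ground coverage get different encodings, which is the property the abstract attributes to the method.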
The Open DAC 2023 Dataset and Challenges for Sorbent Discovery in Direct Air Capture
New methods for carbon dioxide removal are urgently needed to combat global
climate change. Direct air capture (DAC) is an emerging technology to capture
carbon dioxide directly from ambient air. Metal-organic frameworks (MOFs) have
been widely studied as potentially customizable adsorbents for DAC. However,
discovering promising MOF sorbents for DAC is challenging because of the vast
chemical space to explore and the need to understand materials as functions of
humidity and temperature. We explore a computational approach benefiting from
recent innovations in machine learning (ML) and present a dataset named Open
DAC 2023 (ODAC23) consisting of more than 38M density functional theory (DFT)
calculations on more than 8,400 MOF materials containing adsorbed CO2 and/or
H2O. ODAC23 is by far the largest dataset of MOF adsorption calculations at
the DFT level of accuracy currently available. In addition to probing
properties of adsorbed molecules, the dataset is a rich source of information
on structural relaxation of MOFs, which will be useful in many contexts beyond
specific applications for DAC. A large number of MOFs with promising properties
for DAC are identified directly in ODAC23. We also trained state-of-the-art ML
models on this dataset to approximate calculations at the DFT level. This
open-source dataset and our initial ML models will provide an important
baseline for future efforts to identify MOFs for a wide range of applications,
including DAC.
A Practical Stereo Depth System for Smart Glasses
We present the design of a productionized end-to-end stereo depth sensing
system that does pre-processing, online stereo rectification, and stereo depth
estimation with a fallback to monocular depth estimation when rectification is
unreliable. The output of our depth sensing system is then used in a novel view
generation pipeline to create 3D computational photography effects using
point-of-view images captured by smart glasses. All these steps are executed
on-device on the stringent compute budget of a mobile phone, and because we
expect that users will have a wide range of smartphones, our design needs to be
general and cannot depend on particular hardware or an ML accelerator such
as a smartphone GPU. Although each of these steps is well studied, a
description of a practical system is still lacking. For such a system, all
these steps need to work in tandem with one another and fall back gracefully on
failures within the system or less than ideal input data. We show how we handle
unforeseen changes to calibration, e.g., due to heat, robustly support depth
estimation in the wild, and still abide by the memory and latency constraints
required for a smooth user experience. We show that our trained models are
fast, and run in less than 1s on a six-year-old Samsung Galaxy S8 phone's CPU.
Our models generalize well to unseen data and achieve good results on
Middlebury and in-the-wild images captured from the smart glasses.
Comment: Accepted at CVPR202
Capturing and Viewing Gigapixel Images
We present a system to capture and view "Gigapixel images": very high resolution, high dynamic range, and wide angle imagery consisting of several billion pixels each. A specialized camera mount, in combination with an automated pipeline for alignment, exposure compensation, and stitching, provides the means to acquire Gigapixel images with a standard camera and lens. More importantly, our novel viewer enables exploration of such images at interactive rates over a network, while dynamically and smoothly interpolating the projection between perspective and curved projections, and simultaneously modifying the tone-mapping to ensure an optimal view of the portion of the scene being viewed.