AmbientFlow: Invertible generative models from incomplete, noisy measurements
Generative models have gained popularity for their potential applications in
imaging science, such as image reconstruction, posterior sampling and data
sharing. Flow-based generative models are particularly attractive due to their
ability to tractably provide exact density estimates along with fast,
inexpensive and diverse samples. Training such models, however, requires a
large, high quality dataset of objects. In applications such as computed
imaging, it is often difficult to acquire such data due to requirements such as
long acquisition time or high radiation dose, while acquiring noisy or
partially observed measurements of these objects is more feasible. In this
work, we propose AmbientFlow, a framework for learning flow-based generative
models directly from noisy and incomplete data. The framework is established
using variational Bayesian methods. Extensive numerical studies demonstrate the
effectiveness of AmbientFlow in correctly learning the object distribution. The
utility of AmbientFlow in a downstream inference task of image reconstruction
is demonstrated.
SSL4EO-L: Datasets and Foundation Models for Landsat Imagery
The Landsat program is the longest-running Earth observation program in
history, with 50+ years of data acquisition by 8 satellites. The multispectral
imagery captured by sensors onboard these satellites is critical for a wide
range of scientific fields. Despite the increasing popularity of deep learning
and remote sensing, the majority of researchers still use decision trees and
random forests for Landsat image analysis due to the prevalence of small
labeled datasets and lack of foundation models. In this paper, we introduce
SSL4EO-L, the first ever dataset designed for Self-Supervised Learning for
Earth Observation for the Landsat family of satellites (including 3 sensors and
2 product levels) and the largest Landsat dataset in history (5M image
patches). Additionally, we modernize and re-release the L7 Irish and L8 Biome
cloud detection datasets, and introduce the first ML benchmark datasets for
Landsats 4-5 TM and Landsat 7 ETM+ SR. Finally, we pre-train the first
foundation models for Landsat imagery using SSL4EO-L and evaluate their
performance on multiple semantic segmentation tasks. All datasets and model
weights are available via the TorchGeo (https://github.com/microsoft/torchgeo)
library, making reproducibility and experimentation easy, and enabling
scientific advancements in the burgeoning field of remote sensing for a
multitude of downstream applications.
Climate Informatics
The impacts of present and potential future climate change will be one of the most important scientific and societal challenges in the 21st century. Given observed changes in temperature, sea ice, and sea level, improving our understanding of the climate system is an international priority. This system is characterized by complex phenomena that are imperfectly observed and even more imperfectly simulated. But with an ever-growing supply of climate data from satellites and environmental sensors, the magnitude of data and climate model output is beginning to overwhelm the relatively simple tools currently used to analyze them. A computational approach will therefore be indispensable for these analysis challenges. This chapter introduces the fledgling research discipline of climate informatics: collaborations between climate scientists and machine learning researchers to bridge this gap between data and understanding. We hope that the study of climate informatics will accelerate discovery in answering pressing questions in climate science.
ARTEMIN Promotes De Novo Angiogenesis in ER Negative Mammary Carcinoma through Activation of TWIST1-VEGF-A Signalling
DOI: 10.1371/journal.pone.0050098. PLoS ONE 7(11)
Intelligent Systems for Geosciences: An Essential Research Agenda
This article presents a research agenda for intelligent systems that will result in fundamental new capabilities for understanding the Earth system. Many aspects of geosciences pose novel problems for intelligent systems research. Geoscience data is challenging because it tends to be uncertain, intermittent, sparse, multiresolution, and multiscale. Geoscience processes and objects often have amorphous spatiotemporal boundaries. The lack of ground truth makes model evaluation, testing, and comparison difficult. Overcoming these challenges requires breakthroughs that would significantly transform intelligent systems, while greatly benefitting the geosciences in turn.
The number of tree species on Earth
One of the most fundamental questions in ecology is how many species inhabit the Earth. However, due to massive logistical and financial challenges and taxonomic difficulties connected to the species concept definition, the global numbers of species, including those of important and well-studied life forms such as trees, still remain largely unknown. Here, based on global ground-sourced data, we estimate the total tree species richness at global, continental, and biome levels. Our results indicate that there are ∼73,000 tree species globally, among which ∼9,000 tree species are yet to be discovered. Roughly 40% of undiscovered tree species are in South America. Moreover, almost one-third of all tree species to be discovered may be rare, with very low populations and limited spatial distribution (likely in remote tropical lowlands and mountains). These findings highlight the vulnerability of global forest biodiversity to anthropogenic changes in land use and climate, which disproportionately threaten rare species and thus global tree richness.