Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for exploratory
data analysis are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.
Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19).
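
One capability listed above, statistical data type and likelihood discovery, can be illustrated with a minimal per-feature sketch: fit a handful of candidate likelihoods to each column and keep the best-scoring one. The candidate set, the BIC criterion, and the independent per-feature treatment are simplifying assumptions of this sketch; ABDA itself performs joint Bayesian inference that couples likelihood choices with a latent interaction structure.

```python
# Minimal sketch of per-feature likelihood discovery, one ingredient of ABDA.
# Illustrative only: candidate set and BIC scoring are our own choices.
import numpy as np
from scipy import stats

CANDIDATES = {
    "gaussian": stats.norm,
    "gamma": stats.gamma,
    "exponential": stats.expon,
}

def discover_likelihood(x):
    """Score candidate likelihoods for one feature by BIC (lower is better)."""
    x = x[~np.isnan(x)]  # ignore missing entries
    scores = {}
    for name, dist in CANDIDATES.items():
        if name != "gaussian" and (x <= 0).any():
            continue  # positive-support models cannot explain this feature
        params = dist.fit(x)
        loglik = dist.logpdf(x, *params).sum()
        scores[name] = len(params) * np.log(len(x)) - 2.0 * loglik
    return min(scores, key=scores.get), scores

best, scores = discover_likelihood(np.random.gamma(2.0, 3.0, size=500))
print(best, scores)
```
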
Deep Ensembles to Improve Uncertainty Quantification of Statistical Downscaling Models under Climate Change Conditions
Recently, deep learning has emerged as a promising tool for statistical
downscaling, the set of methods for generating high-resolution climate fields
from coarse low-resolution variables. Nevertheless, the ability of these models to generalize
to climate change conditions remains questionable, mainly due to the
stationarity assumption. We propose deep ensembles as a simple method to
improve the uncertainty quantification of statistical downscaling models. By
better capturing uncertainty, statistical downscaling models allow for superior
planning against extreme weather events, a source of various negative social
and economic impacts. Since no observational future data exists, we rely on a
pseudo reality experiment to assess the suitability of deep ensembles for
quantifying the uncertainty of climate change projections. Deep ensembles allow
for a better risk assessment, highly demanded by sectoral applications to
tackle climate change.
Comment: Accepted at the ICLR 2023 Tackling Climate Change with Machine Learning Workshop.
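
The core recipe is simple enough to sketch: train K copies of the same network from different random seeds and read predictive uncertainty off the spread of their outputs. The toy one-dimensional regressor below is our illustration of that recipe, not the downscaling setup used in the paper.

```python
# Minimal deep-ensemble sketch: K identically configured networks, differing
# only in random seed; the spread of their predictions is the uncertainty
# estimate. Toy data; illustrative only.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=500)

K = 5
ensemble = [
    MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=k).fit(X, y)
    for k in range(K)
]

X_test = np.linspace(-3, 3, 100).reshape(-1, 1)
preds = np.stack([m.predict(X_test) for m in ensemble])  # shape (K, n_test)
mean = preds.mean(axis=0)  # ensemble prediction
std = preds.std(axis=0)    # model disagreement: a proxy for epistemic uncertainty
print(mean[:3], std[:3])
```
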
Spatiotemporal Graph Neural Networks with Uncertainty Quantification for Traffic Incident Risk Prediction
Predicting traffic incident risks at granular spatiotemporal levels is
challenging. The datasets predominantly feature zero values, indicating no
incidents, with sporadic high-risk values for severe incidents. Notably, a
majority of current models, especially deep learning methods, focus solely on
estimating risk values, overlooking the uncertainties arising from the
inherently unpredictable nature of incidents. To tackle this challenge, we
introduce the Spatiotemporal Zero-Inflated Tweedie Graph Neural Networks
(STZITD-GNNs). Our model merges the reliability of traditional statistical
models with the flexibility of graph neural networks, aiming to precisely
quantify uncertainties associated with road-level traffic incident risks. This
model strategically employs a compound distribution from the Tweedie family:
a Poisson distribution models incident frequency, and a Gamma distribution
accounts for incident severity. A zero-inflated component further captures the
predominance of non-incident scenarios. As a result, the STZITD-GNNs
effectively capture the dataset's skewed distribution, placing emphasis on
infrequent but impactful severe incidents. Empirical tests using real-world
traffic data from London, UK, demonstrate that our model excels beyond current
benchmarks. The strength of the STZITD-GNN lies not only in its accuracy but
also in its ability to reduce predictive uncertainty, delivering robust
predictions over both short (7-day) and extended (14-day) timeframes.
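
As a rough illustration of the loss such a model could optimise, the sketch below combines a zero-inflation probability with a compound Poisson-Gamma (Tweedie) term for positive risk values. The parameterisation, the fixed power and dispersion, and the unnormalised positive-value term are our simplifications, not the paper's exact objective.

```python
# Hedged sketch of a zero-inflated Tweedie negative log-likelihood with
# power 1 < p < 2 (compound Poisson-Gamma). The zero mass exp(-lam) is exact;
# for y > 0 we keep only the terms that depend on mu (constants in y dropped).
import torch

def zi_tweedie_nll(y, mu, pi, p=1.5, phi=1.0, eps=1e-8):
    """y: observed risk (>= 0); mu: Tweedie mean > 0; pi: zero-inflation prob."""
    lam = mu.pow(2 - p) / (phi * (2 - p))  # Poisson rate of the compound model
    log_p_zero = torch.log(pi + (1 - pi) * torch.exp(-lam) + eps)
    # Unnormalised Tweedie log-density for y > 0
    log_p_pos = (
        torch.log(1 - pi + eps)
        + (y * mu.pow(1 - p) / (1 - p) - mu.pow(2 - p) / (2 - p)) / phi
    )
    return -torch.where(y > 0, log_p_pos, log_p_zero).mean()

y = torch.tensor([0.0, 0.0, 2.3, 0.7])    # mostly zeros, as in incident data
mu = torch.tensor([0.1, 0.4, 2.0, 1.0])
pi = torch.tensor([0.8, 0.6, 0.1, 0.3])
print(zi_tweedie_nll(y, mu, pi))
```
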
Computing Interpretable Representations of Cell Morphodynamics
Shape changes (morphodynamics) are one of the principal ways cells interact with their environments and perform key intrinsic behaviours like division. These dynamics arise from a myriad of complex signalling pathways that often organise with emergent simplicity to carry out critical functions including predation, collaboration and migration. A powerful method for analysis can therefore be to quantify this emergent structure, bypassing the low-level complexity. Enormous image datasets are now available to mine; however, it can be difficult to uncover interpretable representations of the global organisation of these heterogeneous dynamic processes.

Here, such representations were developed for interpreting morphodynamics in two key areas: mode-of-action (MoA) comparison for drug discovery (developed using the economically devastating Asian soybean rust crop pathogen) and 3D migration of immune system T cells through extracellular matrices (ECMs). For MoA comparison, population development over a 2D space of shapes (morphospace) was described using two models with condition-dependent parameters: a top-down model of diffusive development over Waddington-type landscapes, and a bottom-up model of tip growth. A variety of landscapes were discovered, describing phenotype transitions during growth, and possible perturbations in the tip-growth machinery that cause this variation were identified. For interpreting T cell migration, a new 3D shape descriptor that incorporates key polarisation information was developed, revealing the low dimensionality of shape, and the distinct morphodynamics of run-and-stop modes that emerge at minute timescales were mapped. Periodically oscillating morphodynamics that include retrograde deformation flows were found to underlie active translocation (run mode).

Overall, highly interpretable representations could be uncovered while still leveraging the enormous discovery power of deep learning algorithms. The results show that whole-cell morphodynamics can be a convenient and powerful place to search for structure, with potentially life-saving applications in medicine and biocide discovery as well as immunotherapeutics.
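
As a purely hypothetical illustration of one finding mentioned above, the low dimensionality of shape, the sketch below applies PCA to a stand-in matrix of shape descriptors. The synthetic features are ours; the thesis derives its descriptors from segmented cell outlines and polarisation information.

```python
# Illustrative check for low-dimensional shape structure: PCA on a (cells x
# descriptors) matrix. The stand-in data has 3 latent factors by construction.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
latent = rng.normal(size=(1000, 3))  # 3 hidden shape modes
descriptors = latent @ rng.normal(size=(3, 40)) + 0.05 * rng.normal(size=(1000, 40))

pca = PCA().fit(descriptors)
explained = np.cumsum(pca.explained_variance_ratio_)
n_modes = int(np.searchsorted(explained, 0.95)) + 1
print(f"{n_modes} modes explain 95% of shape variance")
```
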
An Introduction to the Calibration of Computer Models
In the context of computer models, calibration is the process of estimating
unknown simulator parameters from observational data. Calibration is variously
referred to as model fitting, parameter estimation/inference, an inverse
problem, and model tuning. The need for calibration occurs in most areas of
science and engineering, and calibration has been used to estimate
hard-to-measure parameters in climate, cardiology, drug therapy response,
hydrology, and many
other disciplines. Although the statistical method used for calibration can
vary substantially, the underlying approach is essentially the same and can be
considered abstractly. In this survey, we review the decisions that need to be
taken when calibrating a model, and discuss a range of computational methods
that can be used to compute Bayesian posterior distributions.
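
The abstract setup can be made concrete with a toy sketch: a simulator f(theta, x), noisy field observations, and a random-walk Metropolis sampler targeting the posterior over theta. The simulator, noise level, and flat prior below are illustrative choices of ours, not recommendations from the survey.

```python
# Toy Bayesian calibration: recover simulator parameters from noisy
# observations with random-walk Metropolis. Everything here is a stand-in.
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta, x):
    return theta[0] * np.exp(-theta[1] * x)  # stand-in computer model

x_obs = np.linspace(0, 5, 20)
y_obs = simulator([2.0, 0.7], x_obs) + rng.normal(0, 0.05, size=x_obs.size)

def log_post(theta, sigma=0.05):
    if theta[1] <= 0:  # prior support: decay rate must be positive
        return -np.inf
    resid = y_obs - simulator(theta, x_obs)
    return -0.5 * np.sum(resid**2) / sigma**2  # Gaussian likelihood, flat prior

theta, lp = np.array([1.0, 1.0]), -np.inf
samples = []
for _ in range(20000):
    prop = theta + rng.normal(0, 0.05, size=2)  # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:    # Metropolis accept/reject
        theta, lp = prop, lp_prop
    samples.append(theta)
samples = np.array(samples[5000:])              # drop burn-in
print(samples.mean(axis=0), samples.std(axis=0))
```
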
A Comprehensive Survey on Rare Event Prediction
Rare event prediction involves identifying and forecasting events with a low
probability using machine learning and data analysis. Due to the imbalanced
data distributions, where the frequency of common events vastly outweighs that
of rare events, it requires using specialized methods within each step of the
machine learning pipeline, i.e., from data processing to algorithms to
evaluation protocols. Predicting the occurrences of rare events is important
for real-world applications, such as Industry 4.0, and is an active research
area in statistics and machine learning. This paper comprehensively reviews
the current approaches for rare event prediction along four dimensions: rare
event data, data processing, algorithmic approaches, and evaluation approaches.
Specifically, we consider 73 datasets from different modalities (i.e.,
numerical, image, text, and audio), four major categories of data processing,
five major algorithmic groupings, and two broader evaluation approaches. This
paper aims to identify gaps in the current literature and highlight the
challenges of predicting rare events. It also suggests potential research
directions, which can help guide practitioners and researchers.
Comment: 44 pages.
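
Two of the surveyed ingredients, cost-sensitive learning and evaluation by precision-recall rather than accuracy, can be sketched in a few lines. The synthetic dataset with roughly 1% positives below is illustrative only; the survey itself covers far more techniques.

```python
# Rare-event sketch: class weighting on an imbalanced dataset, scored with
# PR-AUC (average precision), which is far more informative than accuracy here.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=20000, n_features=20, weights=[0.99, 0.01], random_state=0
)  # ~1% positives: the "rare event"
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for weighting in (None, "balanced"):
    clf = LogisticRegression(max_iter=1000, class_weight=weighting).fit(X_tr, y_tr)
    ap = average_precision_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"class_weight={weighting}: PR-AUC={ap:.3f}")
```
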
TRU-NET: A Deep Learning Approach to High Resolution Prediction of Rainfall
Climate models (CM) are used to evaluate the impact of climate change on the
risk of floods and strong precipitation events. However, these numerical
simulators have difficulties representing precipitation events accurately,
mainly due to limited spatial resolution when simulating multi-scale dynamics
in the atmosphere. To improve the prediction of high resolution precipitation
we apply a Deep Learning (DL) approach that takes as input CM simulations of
model fields (weather variables) that are more predictable than local
precipitation. To this end, we present TRU-NET (Temporal Recurrent U-Net), an
encoder-decoder model featuring a novel 2D cross attention mechanism between
contiguous convolutional-recurrent layers to effectively model multi-scale
spatio-temporal weather processes. We use a conditional-continuous loss
function to capture the zero-skewed patterns of rainfall.
Experiments show that our model consistently attains lower RMSE and MAE scores
than a DL model prevalent in short-term precipitation prediction and improves
upon the rainfall predictions of a state-of-the-art dynamical weather model.
Moreover, by evaluating the performance of our model under various training
and testing data formulation strategies, we show that there is enough data for
our deep learning approach to output robust, high-quality results across
seasons and varying regions.
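
The conditional-continuous idea can be sketched as a two-part objective: a Bernoulli term for whether rain occurred at all (handling the mass of exact zeros), plus a continuous term applied only to rainy targets. The log-space squared error below is our stand-in for the continuous part; TRU-NET's exact loss is not claimed here.

```python
# Hedged sketch of a conditional-continuous rainfall loss: occurrence and
# magnitude are scored separately, so zeros do not drown out the wet cases.
import torch

def conditional_continuous_loss(rain_logit, log_rate, y, eps=1e-6):
    """rain_logit: logits for P(rain); log_rate: predicted log rainfall; y: mm."""
    occurred = (y > 0).float()
    # Bernoulli part: did it rain at all?
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        rain_logit, occurred, reduction="none"
    )
    # Continuous part: squared error in log space, masked to rainy targets only
    mse = (log_rate - torch.log(y + eps)) ** 2
    return (bce + occurred * mse).mean()

y = torch.tensor([0.0, 0.0, 4.2, 0.9])  # mostly dry, as rainfall data tends to be
print(conditional_continuous_loss(torch.zeros(4), torch.zeros(4), y))
```
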