3,324 research outputs found

    Automatic Bayesian Density Analysis

    Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for exploratory data analysis are usually not flexible enough to deal with the uncertainty inherent to real-world data: they are often restricted to fixed latent interaction models and homogeneous likelihoods; they are sensitive to missing, corrupt and anomalous data; moreover, their expressiveness generally comes at the price of intractable inference. As a result, supervision from statisticians is usually needed to find the right model for the data. However, since domain experts are not necessarily also experts in statistics, we propose Automatic Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible at large. Specifically, ABDA allows for automatic and efficient missing value estimation, statistical data type and likelihood discovery, anomaly detection and dependency structure mining, on top of providing accurate density estimation. Extensive empirical evidence shows that ABDA is a suitable tool for automatic exploratory analysis of mixed continuous and discrete tabular data. Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19).

    Deep Ensembles to Improve Uncertainty Quantification of Statistical Downscaling Models under Climate Change Conditions

    Recently, deep learning has emerged as a promising tool for statistical downscaling, the set of methods for generating high-resolution climate fields from coarse low-resolution variables. Nevertheless, the ability of these models to generalize to climate change conditions remains questionable, mainly due to the stationarity assumption. We propose deep ensembles as a simple method to improve the uncertainty quantification of statistical downscaling models. By better capturing uncertainty, statistical downscaling models allow for superior planning against extreme weather events, a source of various negative social and economic impacts. Since no observational future data exists, we rely on a pseudo-reality experiment to assess the suitability of deep ensembles for quantifying the uncertainty of climate change projections. Deep ensembles allow for a better risk assessment, which is in high demand from sectoral applications that tackle climate change. Comment: Accepted at the ICLR 2023 Tackling Climate Change with Machine Learning Workshop.
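As a rough illustration of how a deep ensemble turns several models into one uncertainty estimate, the sketch below combines per-member predictions into a single mean and variance. It assumes each ensemble member outputs a Gaussian mean and variance for the same target, which the abstract does not state; the function name `combine_ensemble` and the example numbers are hypothetical.

```python
from statistics import fmean


def combine_ensemble(means, variances):
    """Moment-match a uniform mixture of Gaussian member predictions.

    Assumption: every member predicts a mean and a variance for the
    same quantity. Returns the mixture mean and mixture variance.
    """
    if not means or len(means) != len(variances):
        raise ValueError("need one variance per mean, and at least one member")
    mix_mean = fmean(means)
    # Mixture second moment minus squared mixture mean.
    second_moment = fmean([v + m * m for m, v in zip(means, variances)])
    mix_var = second_moment - mix_mean ** 2
    return mix_mean, mix_var


# Two hypothetical members that disagree on the mean.
ensemble_mean, ensemble_var = combine_ensemble([1.0, 3.0], [0.5, 0.5])
```

The combined variance is the average member variance plus the spread of the member means, so disagreement between members widens the predictive distribution.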

    Spatiotemporal Graph Neural Networks with Uncertainty Quantification for Traffic Incident Risk Prediction

    Predicting traffic incident risks at granular spatiotemporal levels is challenging. The datasets predominantly feature zero values, indicating no incidents, with sporadic high-risk values for severe incidents. Notably, a majority of current models, especially deep learning methods, focus solely on estimating risk values, overlooking the uncertainties arising from the inherently unpredictable nature of incidents. To tackle this challenge, we introduce the Spatiotemporal Zero-Inflated Tweedie Graph Neural Networks (STZITD-GNNs). Our model merges the reliability of traditional statistical models with the flexibility of graph neural networks, aiming to precisely quantify uncertainties associated with road-level traffic incident risks. The model employs a compound distribution from the Tweedie family, with a Poisson distribution modeling risk frequency and a Gamma distribution accounting for incident severity. Furthermore, a zero-inflated component helps to identify the non-incident risk scenarios. As a result, the STZITD-GNNs effectively capture the dataset's skewed distribution, placing emphasis on infrequent but impactful severe incidents. Empirical tests using real-world traffic data from London, UK, demonstrate that our model outperforms current benchmarks. The strength of STZITD-GNNs lies not only in their accuracy but also in their ability to reduce uncertainty, delivering robust predictions over short (7 days) and extended (14 days) timeframes.
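The distribution described above can be sketched as a sampler: with some probability the risk is a structural zero, otherwise it is the sum of a Poisson number of Gamma-distributed severities (the compound Poisson-Gamma member of the Tweedie family). This is only a minimal illustration of the distribution, not the paper's graph neural network; the function names and parameter values are hypothetical.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson count with Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_zi_compound(rng, pi_zero, lam, shape, scale):
    """Draw one zero-inflated compound Poisson-Gamma risk value.

    pi_zero: probability of a structural zero (no-incident scenario)
    lam:     Poisson rate for the number of incidents
    shape, scale: Gamma parameters for each incident's severity
    """
    if rng.random() < pi_zero:
        return 0.0
    n_incidents = sample_poisson(rng, lam)
    # fsum of an empty sequence is 0.0, so zero incidents also gives zero risk.
    return math.fsum(rng.gammavariate(shape, scale) for _ in range(n_incidents))


rng = random.Random(0)
draws = [sample_zi_compound(rng, 0.6, 1.5, 2.0, 0.5) for _ in range(20000)]
zero_fraction = sum(1 for d in draws if d == 0.0) / len(draws)
```

Under these parameters the theoretical zero probability is `pi_zero + (1 - pi_zero) * exp(-lam)` and the mean is `(1 - pi_zero) * lam * shape * scale`, which shows how the zero-inflation and the compound part each contribute to the skewed shape.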

    Computing Interpretable Representations of Cell Morphodynamics

    Shape changes (morphodynamics) are one of the principal ways cells interact with their environments and perform key intrinsic behaviours like division. These dynamics arise from a myriad of complex signalling pathways that often organise with emergent simplicity to carry out critical functions including predation, collaboration and migration. A powerful method for analysis can therefore be to quantify this emergent structure, bypassing the low-level complexity. Enormous image datasets are now available to mine. However, it can be difficult to uncover interpretable representations of the global organisation of these heterogeneous dynamic processes. Here, such representations were developed for interpreting morphodynamics in two key areas: mode of action (MoA) comparison for drug discovery (developed using the economically devastating Asian soybean rust crop pathogen) and 3D migration of immune system T cells through extracellular matrices (ECMs). For MoA comparison, population development over a 2D space of shapes (morphospace) was described using two models with condition-dependent parameters: a top-down model of diffusive development over Waddington-type landscapes, and a bottom-up model of tip growth. A variety of landscapes were discovered, describing phenotype transitions during growth, and possible perturbations in the tip growth machinery that cause this variation were identified. For interpreting T cell migration, a new 3D shape descriptor that incorporates key polarisation information was developed, revealing low-dimensionality of shape, and the distinct morphodynamics of run-and-stop modes that emerge at minute timescales were mapped. Periodically oscillating morphodynamics that include retrograde deformation flows were found to underlie active translocation (run mode). Overall, it was found that highly interpretable representations could be uncovered while still leveraging the enormous discovery power of deep learning algorithms. 
The results show that whole-cell morphodynamics can be a convenient and powerful place to search for structure, with potentially life-saving applications in medicine and biocide discovery as well as immunotherapeutics.

    An Introduction to the Calibration of Computer Models

    In the context of computer models, calibration is the process of estimating unknown simulator parameters from observational data. Calibration is variously referred to as model fitting, parameter estimation/inference, an inverse problem, and model tuning. The need for calibration occurs in most areas of science and engineering, and calibration has been used to estimate hard-to-measure parameters in climate, cardiology, drug therapy response, hydrology, and many other disciplines. Although the statistical method used for calibration can vary substantially, the underlying approach is essentially the same and can be considered abstractly. In this survey, we review the decisions that need to be taken when calibrating a model, and discuss a range of computational methods that can be used to compute Bayesian posterior distributions.
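To make the idea of calibration concrete, the sketch below computes a Bayesian posterior over a single simulator parameter on a grid. Everything in it is an assumption for illustration: the toy linear `simulator`, the flat prior, the Gaussian observation noise with known standard deviation, and the made-up observations. It is one simple method among the range the survey covers, not the survey's own procedure.

```python
import math


def simulator(x, theta):
    """Hypothetical toy simulator: linear response with unknown rate theta."""
    return theta * x


def grid_posterior(xs, ys, thetas, noise_sd):
    """Posterior weights over a theta grid (flat prior, Gaussian noise)."""
    log_post = []
    for theta in thetas:
        sq_err = sum((y - simulator(x, theta)) ** 2 for x, y in zip(xs, ys))
        log_post.append(-0.5 * sq_err / noise_sd ** 2)
    peak = max(log_post)  # subtract the peak to avoid underflow in exp
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up observations of the system at four input settings.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
thetas = [1.0 + 0.01 * i for i in range(201)]  # grid from 1.0 to 3.0
posterior = grid_posterior(xs, ys, thetas, noise_sd=0.2)
posterior_mean = sum(t * p for t, p in zip(thetas, posterior))
```

A grid works here only because there is a single parameter; with more parameters or an expensive simulator, sampling or emulator-based methods are the usual alternatives.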

    A Comprehensive Survey on Rare Event Prediction

    Rare event prediction involves identifying and forecasting events with a low probability using machine learning and data analysis. Due to the imbalanced data distributions, where the frequency of common events vastly outweighs that of rare events, the task requires specialized methods within each step of the machine learning pipeline, i.e., from data processing to algorithms to evaluation protocols. Predicting the occurrences of rare events is important for real-world applications, such as Industry 4.0, and is an active research area in statistical and machine learning. This paper comprehensively reviews the current approaches for rare event prediction along four dimensions: rare event data, data processing, algorithmic approaches, and evaluation approaches. Specifically, we consider 73 datasets from different modalities (i.e., numerical, image, text, and audio), four major categories of data processing, five major algorithmic groupings, and two broader evaluation approaches. This paper aims to identify gaps in the current literature and highlight the challenges of predicting rare events. It also suggests potential research directions, which can help guide practitioners and researchers. Comment: 44 pages.

    TRU-NET: A Deep Learning Approach to High Resolution Prediction of Rainfall

    Climate models (CM) are used to evaluate the impact of climate change on the risk of floods and strong precipitation events. However, these numerical simulators have difficulties representing precipitation events accurately, mainly due to limited spatial resolution when simulating multi-scale dynamics in the atmosphere. To improve the prediction of high-resolution precipitation, we apply a Deep Learning (DL) approach using an input of CM simulations of the model fields (weather variables) that are more predictable than local precipitation. To this end, we present TRU-NET (Temporal Recurrent U-Net), an encoder-decoder model featuring a novel 2D cross attention mechanism between contiguous convolutional-recurrent layers to effectively model multi-scale spatio-temporal weather processes. We use a conditional-continuous loss function to capture the zero-skewed extreme event patterns of rainfall. Experiments show that our model consistently attains lower RMSE and MAE scores than a DL model prevalent in short-term precipitation prediction and improves upon the rainfall predictions of a state-of-the-art dynamical weather model. Moreover, by evaluating the performance of our model under various training and testing data formulation strategies, we show that there is enough data for our deep learning approach to output robust, high-quality results across seasons and varying regions.