89 research outputs found

    Bayesian data assimilation in shape registration

    In this paper we apply a Bayesian framework to the problem of geodesic curve matching. Given a template curve, the geodesic equations provide a mapping from initial conditions for the conjugate momentum onto topologically equivalent shapes. Here, we aim to recover the well-defined posterior distribution on the initial momentum which gives rise to observed points on the target curve; this is achieved by explicitly including a reparameterisation in the formulation. Appropriate priors, informed by regularity results about the forward model, are chosen for the functions which together determine this field and the positions of the observation points: the initial momentum p0 and the reparameterisation vector field v. Having done this, we illustrate how Maximum Likelihood Estimators (MLEs) can be used to find regions of high posterior density, but also how we can apply recently developed MCMC methods on function spaces to characterise the whole of the posterior density. These illustrative examples also include scenarios where the posterior distribution is multimodal and irregular, leading us to the conclusion that knowledge of a state of globally maximal posterior density does not always give us the whole picture, and that full posterior sampling can give better quantification of likely states and of the overall uncertainty inherent in the problem.
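
    The paper's function-space MCMC is the sort of setting where dimension-robust samplers such as preconditioned Crank-Nicolson (pCN) apply. Below is a minimal, hedged pCN sketch in Python: the Gaussian prior is a plain i.i.d. field on a grid and the log-likelihood is a placeholder, not the geodesic shape-matching forward model, so the grid size, the step parameter beta and the misfit are illustrative assumptions only.

```python
import numpy as np

# Minimal preconditioned Crank-Nicolson (pCN) sketch for sampling a
# function-space posterior over an initial-momentum-like field.
# The log-likelihood below is a placeholder misfit, not the geodesic
# curve-matching forward model from the paper.

rng = np.random.default_rng(0)
n = 64                                   # discretisation of the field


def log_likelihood(p):
    # hypothetical misfit between a functional of p and an "observation"
    return -0.5 * np.sum((np.cumsum(p) / n - 0.3) ** 2) / 0.01


def sample_prior():
    # i.i.d. standard Gaussian prior on the grid values
    return rng.standard_normal(n)


def pcn(n_iters=5000, beta=0.2):
    p = sample_prior()
    ll = log_likelihood(p)
    samples = []
    for _ in range(n_iters):
        # the pCN proposal preserves the Gaussian prior, so the
        # acceptance ratio involves only the likelihood
        prop = np.sqrt(1 - beta ** 2) * p + beta * sample_prior()
        ll_prop = log_likelihood(prop)
        if np.log(rng.uniform()) < ll_prop - ll:
            p, ll = prop, ll_prop
        samples.append(p.copy())
    return np.array(samples)


chain = pcn()
print("posterior mean of first grid values:", chain.mean(axis=0)[:5])
```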

    Improved fMRI-based Pain Prediction using Bayesian Group-wise Functional Registration

    In recent years, neuroimaging has undergone a paradigm shift, moving away from the traditional brain mapping approach toward developing integrated, multivariate brain models that can predict categories of mental events. However, large interindividual differences in brain anatomy and functional localization after standard anatomical alignment remain a major limitation in performing this analysis, as they lead to feature misalignment across subjects in subsequent predictive models.

    Simulation of the spatial structure and cellular organization evolution of cell aggregates arranged in various simple geometries, using a kinetic monte carlo method applied to a lattice model

    This thesis treats models of morphogenesis, in particular contact-guided evolution models that are consistent with the differential adhesion hypothesis. A review of several models, their underlying biological principles, and their relevance and applications in the framework of bioprinting, tissue engineering and bioconvergence is presented. The details of Monte Carlo-based models are then presented, before focusing on models based on the Kinetic Monte Carlo (KMC) algorithm; more specifically, a Self-Learning KMC (SL-KMC) model is described in detail. The algorithmic structure of the implemented code is presented and explained, and the model's performance is assessed and compared with a traditional KMC model. Finally, the calibration and validation processes are carried out; the model is able to replicate the evolution of the multicellular system when the interfacial energy conditions of the simulated system are similar to those of the calibration system.
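
    To make the contact-energy idea concrete, here is a small, hedged lattice sketch in Python. It uses a plain Metropolis swap dynamic with differential-adhesion contact energies as a stand-in for the thesis's (Self-Learning) Kinetic Monte Carlo scheme; the energy values, lattice size and temperature are illustrative assumptions, not calibrated parameters.

```python
import numpy as np

# Toy 2D lattice of two cell types (A, B) plus medium, rearranging under
# differential-adhesion contact energies.  Plain Metropolis swaps are used
# here instead of the thesis's SL-KMC algorithm; all values are illustrative.

rng = np.random.default_rng(1)
L = 32
MEDIUM, A, B = 0, 1, 2
J = {(MEDIUM, MEDIUM): 0.0, (MEDIUM, A): 4.0, (MEDIUM, B): 4.0,
     (A, A): 2.0, (A, B): 6.0, (B, B): 2.0}        # interfacial energies


def contact(s, t):
    return J[(min(s, t), max(s, t))]


def local_energy(grid, i, j):
    s = grid[i, j]
    nbrs = [grid[(i + 1) % L, j], grid[(i - 1) % L, j],
            grid[i, (j + 1) % L], grid[i, (j - 1) % L]]
    return sum(contact(s, t) for t in nbrs)


def sweep(grid, kT=1.0):
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(grid.size):
        i, j = rng.integers(L, size=2)
        di, dj = moves[rng.integers(4)]
        ni, nj = (i + di) % L, (j + dj) % L
        if grid[i, j] == grid[ni, nj]:
            continue
        before = local_energy(grid, i, j) + local_energy(grid, ni, nj)
        grid[i, j], grid[ni, nj] = grid[ni, nj], grid[i, j]       # trial swap
        after = local_energy(grid, i, j) + local_energy(grid, ni, nj)
        dE = after - before
        if dE > 0 and rng.uniform() > np.exp(-dE / kT):
            grid[i, j], grid[ni, nj] = grid[ni, nj], grid[i, j]   # undo


lattice = rng.choice([MEDIUM, A, B], size=(L, L), p=[0.5, 0.25, 0.25])
for _ in range(100):
    sweep(lattice)
print("cell counts (medium, A, B):", np.bincount(lattice.ravel(), minlength=3))
```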

    Statistical Fusion of Scientific Images

    A practical and important class of scientific images are the 2D/3D images obtained from porous materials such as concretes, bone, active carbon, and glass. These materials constitute an important class of heterogeneous media possessing complicated microstructure that is difficult to describe qualitatively. However, they are not totally random: there is a mixture of organization and randomness that makes them difficult to characterize and study. In order to study different properties of porous materials, 2D/3D high-resolution samples are required. But obtaining high-resolution samples usually requires cutting, polishing and exposure to air, all of which affect the properties of the sample. Moreover, 3D samples obtained by Magnetic Resonance Imaging (MRI) are very low resolution and noisy. Therefore, artificial samples of porous media need to be generated through a porous media reconstruction process. Recent contributions to the reconstruction task are based either solely on a prior model, learned from statistical features of real high-resolution training data, from which samples are generated, or on a prior model combined with the measurements. The main objective of this thesis is to come up with a statistical data fusion framework by which different images of porous materials at different resolutions and modalities are combined in order to generate artificial samples of porous media with enhanced resolution. Current super-resolution, multi-resolution and registration methods in image processing fail to provide a general framework for the porous media reconstruction purpose, since they are usually based on finding an estimate rather than a typical sample, and also on having images of the same scene, which is not the case for porous media images. The statistical fusion approach that we propose here is based on a Bayesian framework by which a prior model learned from high-resolution samples is combined with a measurement model, defined from the low-resolution, coarse-scale information, to come up with a posterior model. We define a measurement model, in both the non-hierarchical and hierarchical image modeling frameworks, which describes how the low-resolution information is incorporated into the posterior model. Then, we propose a posterior sampling approach by which 2D posterior samples of porous media are generated from the posterior model. A more general framework that we propose here imposes constraints other than the measurement in the model; we then propose a constrained sampling strategy based on simulated annealing to generate artificial samples.
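
    The constrained-sampling idea can be sketched with a toy example. The Python snippet below anneals a random binary (pore/solid) image so that its block-averaged, coarse-scale values match a synthetic low-resolution measurement; the quadratic misfit energy and cooling schedule are illustrative assumptions and do not reproduce the hierarchical prior and measurement models developed in the thesis.

```python
import numpy as np

# Simulated-annealing sketch: flip pixels of a binary image until its
# coarse (block-averaged) version matches a low-resolution measurement.
# Energy, cooling schedule and sizes are illustrative assumptions.

rng = np.random.default_rng(2)
N, block = 64, 8                         # high-res size, coarsening factor


def coarsen(img):
    return img.reshape(N // block, block, N // block, block).mean(axis=(1, 3))


true_img = (rng.uniform(size=(N, N)) < 0.3).astype(float)
measurement = coarsen(true_img)          # synthetic low-resolution observation


def energy(img):
    # misfit between the coarse-scale average of the sample and the data
    return np.sum((coarsen(img) - measurement) ** 2)


img = (rng.uniform(size=(N, N)) < 0.3).astype(float)
E, T = energy(img), 1.0
for _ in range(20000):
    i, j = rng.integers(N, size=2)
    img[i, j] = 1.0 - img[i, j]          # propose a single-pixel flip
    E_new = energy(img)
    if E_new > E and rng.uniform() > np.exp(-(E_new - E) / T):
        img[i, j] = 1.0 - img[i, j]      # reject: undo the flip
    else:
        E = E_new
    T *= 0.9997                           # geometric cooling schedule
print("final coarse-scale misfit:", energy(img))
```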

    Error models in hydrogeology applications

    Our consumption of groundwater, in particular as drinking water and for irrigation, has increased considerably over the years, and groundwater is becoming an increasingly scarce and endangered resource. Nowadays, we face many problems ranging from water prospection to sustainable management and remediation of polluted aquifers. Independently of the hydrogeological problem, the main challenge remains dealing with the incomplete knowledge of the underground properties. Stochastic approaches have been developed to represent this uncertainty by considering multiple geological scenarios and generating a large number of geostatistical realizations. The main limitation of this approach is the computational cost of performing complex flow simulations for each realization. In the first part of the thesis, we explore this issue in the context of uncertainty propagation, where an ensemble of geostatistical realizations is identified as representative of the subsurface uncertainty. To propagate this lack of knowledge to the quantity of interest (e.g., the concentration of pollutant in extracted water), it is necessary to evaluate the flow response of each realization. Due to computational constraints, state-of-the-art methods make use of approximate flow simulations to identify a subset of realizations that represents the variability of the ensemble. The complex and computationally heavy flow model is then run for this subset, on the basis of which inference is made. Our objective is to increase the performance of this approach by using all of the available information and not solely the subset of exact responses. Two error models are proposed to correct the approximate responses following a machine learning approach. For the subset identified by a classical approach (here the distance kernel method), both the approximate and the exact responses are known. This information is used to construct an error model and correct the ensemble of approximate responses to predict the "expected" responses of the exact model. The proposed methodology makes use of all the available information without perceptible additional computational cost and leads to an increase in accuracy and robustness of the uncertainty propagation. The strategy explored in the first chapter consists in learning, from a subset of realizations, the relationship between proxy and exact curves. In the second part of this thesis, the strategy is formalized in a rigorous mathematical framework by defining a regression model between functions. As this problem is ill-posed, it is necessary to reduce its dimensionality. The novelty of the work comes from the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also allows a diagnostic of the quality of the error model in the functional space. The proposed methodology is applied to a pollution problem involving a non-aqueous phase liquid. The error model allows a strong reduction of the computational cost while providing a good estimate of the uncertainty. The individual correction of each proxy response by the error model leads to an excellent prediction of the exact response, opening the door to many applications. The concept of a functional error model is useful not only in the context of uncertainty propagation, but also, and maybe even more so, for Bayesian inference. Markov chain Monte Carlo (MCMC) algorithms are the most common choice to ensure that the generated realizations are sampled in accordance with the observations. However, this approach suffers from a low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. This led to the introduction of two-stage MCMC, where the computational cost is decreased by avoiding unnecessary simulations of the exact flow model thanks to a preliminary evaluation of the proposal. In the third part of the thesis, a proxy is coupled to an error model to provide the approximate response for the two-stage MCMC set-up. We demonstrate an increase in acceptance rate by a factor of 1.5 to 3 with respect to one-stage MCMC results. An open question remains: how to choose the size of the learning set and how to identify the realizations that optimize the construction of the error model. This requires an iterative strategy, such that, as new flow simulations are performed, the error model is improved by incorporating the new information. This is discussed in the fourth part of the thesis, in which we apply this methodology to a problem of saline intrusion in a coastal aquifer.
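
    A stripped-down version of the functional error model can be written in a few lines. In the hedged Python sketch below, synthetic curves stand in for the proxy and exact flow responses, a discrete PCA plays the role of FPCA, and a linear regression maps proxy scores to exact scores on a small learning set before correcting the full ensemble; all sizes and curve shapes are illustrative assumptions.

```python
import numpy as np

# Functional error model sketch: learn a map from proxy-response scores to
# exact-response scores on a learning set, then correct all proxy curves.
# Synthetic exponential curves stand in for flow-simulation outputs.

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 100)
n_total, n_learn, k = 200, 30, 3          # realizations, learning set, components

amp = rng.uniform(0.5, 1.5, n_total)[:, None]
exact = amp * np.exp(-3 * t) + 0.01 * rng.standard_normal((n_total, len(t)))
proxy = 0.9 * amp * np.exp(-2.5 * t) + 0.005 * rng.standard_normal((n_total, len(t)))


def fpca(curves, k):
    # discrete PCA of sampled curves: mean, leading modes, and scores
    mean = curves.mean(axis=0)
    _, _, Vt = np.linalg.svd(curves - mean, full_matrices=False)
    return mean, Vt[:k], (curves - mean) @ Vt[:k].T


idx = rng.choice(n_total, n_learn, replace=False)        # learning subset
mean_p, basis_p, S_p = fpca(proxy[idx], k)
mean_e, basis_e, S_e = fpca(exact[idx], k)

# linear regression from proxy scores to exact scores (with intercept)
X = np.hstack([np.ones((n_learn, 1)), S_p])
B, *_ = np.linalg.lstsq(X, S_e, rcond=None)

# correct the full ensemble of proxy curves
S_all = (proxy - mean_p) @ basis_p.T
corrected = mean_e + np.hstack([np.ones((n_total, 1)), S_all]) @ B @ basis_e
print("mean abs error, proxy vs corrected:",
      np.abs(proxy - exact).mean(), np.abs(corrected - exact).mean())
```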

    Bayesian Analysis of Continuous Curve Functions

    We consider Bayesian analysis of continuous curve functions in 1D, 2D and 3D spaces. A fundamental feature of the analysis is that it is invariant under a simultaneous warping/re-parameterization of all target curves, as well as under translation, rotation and scaling of each individual curve if necessary. We introduce Bayesian models based on a special curve representation, the Square Root Velocity Function (SRVF), introduced by Srivastava et al. (2011, IEEE PAMI). A Gaussian process model for the SRVFs of curves is proposed, and suitable prior models such as the Dirichlet distribution are employed for modeling the warping function as a cumulative distribution function. Simulation from the posterior distribution is carried out via Markov chain Monte Carlo methods, and credibility regions for mean curves, warping functions and nuisance parameters are obtained. Monte Carlo techniques such as simulated tempering are employed in order to overcome the problem of getting stuck in a local mode when high-dimensional data are involved. We illustrate the methodology with real data applications as well as simulation studies in 1D, 2D and 3D spaces.
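
    For readers unfamiliar with the representation, the hedged sketch below computes the SRVF of a sampled curve, q(t) = beta'(t)/sqrt(||beta'(t)||), using finite differences, and compares a curve with a re-parameterized copy of itself. It is only the bare transform under assumed discretization choices, not the paper's Bayesian model with Gaussian process and Dirichlet priors.

```python
import numpy as np

# Square Root Velocity Function (SRVF) of a discretely sampled curve.
# Finite differences approximate the derivative; the example data and the
# warping function are illustrative.


def srvf(beta, t):
    """beta: (n_points, dim) sampled curve; t: (n_points,) parameter grid."""
    deriv = np.gradient(beta, t, axis=0)
    speed = np.maximum(np.linalg.norm(deriv, axis=1), 1e-12)  # avoid /0
    return deriv / np.sqrt(speed)[:, None]


# a 2D curve and a re-parameterized copy of it
t = np.linspace(0, 1, 200)
circle = np.column_stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])
gamma = t ** 2                                   # a warping of [0, 1]
warped = np.column_stack([np.cos(2 * np.pi * gamma), np.sin(2 * np.pi * gamma)])

q1, q2 = srvf(circle, t), srvf(warped, t)
# the plain L2 distance between SRVFs changes under warping; the elastic
# framework minimizes over warpings, which the Bayesian model handles
# through a prior on the warping function
print("discrete L2 distance between SRVFs:",
      np.sqrt(np.mean(np.sum((q1 - q2) ** 2, axis=1))))
```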

    Functional data classification and covariance estimation

    Focusing on the analysis of functional data, the first part of this dissertation proposes three statistical models for functional data classification and applies them to a real problem of cervical pre-cancer diagnosis; the second part discusses covariance estimation for functional data. The functional data classification problem is motivated by the analysis of fluorescence spectroscopy, a type of clinical data used to quantitatively detect early-stage cervical cancer. Three statistical models are proposed for different purposes of the data analysis. The first is a Bayesian probit model with variable selection, which extracts features from the fluorescence spectroscopy and selects a subset of these features for more accurate classification. The second model, designed for the practical purpose of building a more cost-effective device, is a functional generalized linear model with selection of functional predictors. This model selects a subset of the multiple functional predictors through a logistic regression with a grouped Lasso penalty. The first two models are appropriate for functional data that are not contaminated by random effects. However, in our real data, random effects caused by device artifacts are too significant to be ignored. We therefore introduce the third model, a Bayesian hierarchical model with functional predictor selection, which extends the first two models to this more complex data. Besides retaining high classification accuracy, this model is able to select effective functional predictors while adjusting for the random effects. The second problem addressed in this dissertation is covariance estimation for functional data. We discuss the properties of the covariance operator associated with a Gaussian measure defined on a separable Hilbert space and propose a suitable prior for Bayesian estimation. The limit of the inverse Wishart distribution as the dimension approaches infinity is also discussed. This research provides a new perspective on covariance estimation in functional data analysis.
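
    As a rough illustration of the functional-predictor selection step, the hedged Python sketch below projects several synthetic functional predictors onto a small cosine basis and fits an L1-penalized logistic regression on the stacked coefficients. scikit-learn's plain Lasso penalty is used as a stand-in for the dissertation's grouped Lasso, so it selects individual coefficients rather than whole predictors; the data, basis and penalty strength are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Functional-predictor classification sketch: project each sampled curve
# onto a few cosine basis functions, then fit an L1-penalized logistic
# regression on the stacked basis coefficients (plain Lasso stand-in for
# the grouped Lasso).  Data are synthetic.

rng = np.random.default_rng(4)
n, n_t, n_pred, n_basis = 200, 50, 3, 5
t = np.linspace(0, 1, n_t)
basis = np.cos(np.pi * np.outer(np.arange(n_basis), t))      # (n_basis, n_t)

y = rng.integers(0, 2, n)
curves = []
for p in range(n_pred):
    # only the first functional predictor carries class signal
    signal = (0.8 * y[:, None] if p == 0 else 0.0) * np.sin(np.pi * t)
    curves.append(signal + rng.standard_normal((n, n_t)))

# basis coefficients of each functional predictor, stacked into features
X = np.hstack([c @ basis.T / n_t for c in curves])           # (n, n_pred*n_basis)

clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
print("training accuracy:", clf.score(X, y))
print("nonzero coefficients per predictor:",
      [int(np.sum(clf.coef_[0, p * n_basis:(p + 1) * n_basis] != 0))
       for p in range(n_pred)])
```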