153 research outputs found

    A survey of the Schrödinger problem and some of its connections with optimal transport

    This article is aimed at presenting the Schrödinger problem and some of its connections with optimal transport. We hope that it can be used as a basic user's guide to the Schrödinger problem. We also give a survey of the related literature. In addition, some new results are proved. Comment: To appear in Discrete & Continuous Dynamical Systems - Series A, special issue on optimal transport.
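    For orientation, the problem surveyed here admits a standard statement and a well-known link to entropic optimal transport; the sketch below uses generic notation (μ, ν, R, c, ε) rather than the survey's own.

```latex
% Static Schrödinger problem: given marginals \mu, \nu and a reference coupling R,
% minimise the relative entropy over couplings with the prescribed marginals.
\[
  \mathrm{Sch}(\mu,\nu) \;=\; \inf_{\pi \in \Pi(\mu,\nu)} H(\pi \,|\, R),
  \qquad
  H(\pi \,|\, R) \;=\; \int \log\frac{\mathrm{d}\pi}{\mathrm{d}R}\,\mathrm{d}\pi .
\]
% If R_\varepsilon has density proportional to e^{-c(x,y)/\varepsilon} with respect to
% \mu \otimes \nu, then up to an additive constant this is entropic optimal transport:
\[
  \varepsilon\, H(\pi \,|\, R_\varepsilon)
  \;=\; \int c \,\mathrm{d}\pi \;+\; \varepsilon\, H(\pi \,|\, \mu\otimes\nu) \;+\; \mathrm{const},
\]
% and as \varepsilon \to 0 the rescaled optimal values converge to the
% Monge--Kantorovich cost \inf_{\pi\in\Pi(\mu,\nu)} \int c\,\mathrm{d}\pi.
```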

    Entropic optimal transport: applications in the context of waveform inversion

    Seismic tomography aims at inferring physical properties and quantitatively reconstructing the "model", i.e. the structures of the Earth's interior, from the mechanical waves, radiated by natural and man-made seismic sources, that are recorded at the surface by receivers in the form of seismograms. Over the past decades, full-waveform inversion (FWI) has been actively developed in both academia and industry, and has proven to be a powerful tool that dramatically improves the capability to estimate physical properties and structures of various geological targets from global to local scales.

    FWI is formulated as a nonlinear, PDE-based optimisation problem that is classically solved by iterative minimisation of an objective function measuring the misfit between synthetic and observed seismic waveforms, using adjoint-based solution methods. These methods compute the derivative of the objective function with respect to the model parameters by combining the synthetic forward wavefield with an adjoint wavefield governed by a set of adjoint equations and adjoint subsidiary conditions.

    In practice, however, solving FWI problems with local, nonlinear optimisation methods faces challenges that preclude routine use. The quality of the inversion, which must deal simultaneously with long- and short-wavelength information, is degraded by the lack of low frequencies and also depends on a good starting model. These limitations are linked to the ill-posed nature of the inverse problem, which can easily be trapped in a local minimum. One proposed research direction to reduce the dependency on the initial model is to replace the classical least-squares misfit by other objective functions, possibly involving a nonlinear transformation of the seismic signal, thereby promoting convexity and trying to enlarge the basin of attraction of the global minimum.

    Optimal transport (OT) theory has recently been used in inverse problems and machine learning. Optimal transport lifts the properties of the squared Euclidean distance to the space of probability distributions: the optimal value of the transport (squared) defines a distance called the 2-Wasserstein distance, which is again convex but now on the set of probability distributions. This trend is already active for FWI, and this thesis is part of it. The OT approach is still largely open on three fronts: seismic waveforms are not probability distributions, lacking positivity and normalised total mass; convexity with respect to the model is not guaranteed; and the actual computation of the OT distance is not cheap.

    In this work we use and combine, from an academic point of view, two recent extensions of OT in the context of FWI. First, the "unbalanced" OT distance, which rigorously defines a distance on the set of positive Radon measures, thus bypassing the data-normalisation issue (but not the positivity problem). Second, the entropic-regularisation OT framework, in particular the simple and cheap-to-compute variant called the Sinkhorn divergence, which provides a good approximation of the 2-Wasserstein distance and extends naturally to unbalanced OT. We use these tools to construct and implement an unbalanced OT misfit and discuss its use in the context of full-waveform inversion through a number of academic examples and classical benchmark problems.
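    As a concrete illustration of the kind of misfit described above, here is a minimal numerical sketch of entropy-regularised OT via Sinkhorn iterations and one common variant of the Sinkhorn divergence, applied to two toy seismic traces. The preprocessing (shift to positivity and normalisation), the parameter values and all names are illustrative assumptions, not the thesis's actual implementation; the unbalanced extension is omitted.

```python
import numpy as np

def sinkhorn_cost(a, b, C, eps=1e-2, n_iter=500):
    """Entropic-OT transport cost <P, C> for histograms a, b and ground cost C."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):            # standard Sinkhorn scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]    # approximate optimal transport plan
    return float(np.sum(P * C))

def sinkhorn_divergence(a, b, C, eps=1e-2):
    """S_eps(a, b) = OT_eps(a, b) - (OT_eps(a, a) + OT_eps(b, b)) / 2."""
    return (sinkhorn_cost(a, b, C, eps)
            - 0.5 * sinkhorn_cost(a, a, C, eps)
            - 0.5 * sinkhorn_cost(b, b, C, eps))

# Toy "observed" and "synthetic" traces: two shifted wavelets on a common time axis.
t = np.linspace(0.0, 1.0, 200)
obs = np.exp(-((t - 0.40) / 0.05) ** 2)
syn = np.exp(-((t - 0.55) / 0.05) ** 2)

def to_histogram(x):
    # Crude positivity + normalisation step (one of the open issues noted above).
    x = x - x.min() + 1e-12
    return x / x.sum()

a, b = to_histogram(obs), to_histogram(syn)
C = (t[:, None] - t[None, :]) ** 2     # squared-distance ground cost
print("Sinkhorn-divergence misfit:", sinkhorn_divergence(a, b, C))
```

    The shift-and-normalise step stands in for the positivity and mass issues the abstract raises; the unbalanced OT distance discussed in the thesis is what removes the need for the normalisation.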

    Complexity penalized methods for structured and unstructured data

    A fundamental goal of statisticians is to make inferences from the sample about characteristics of the underlying population. This is an inverse problem, since we are trying to recover a feature of the input given observations of an output. Towards this end, we consider complexity penalized methods, because they balance goodness of fit and generalizability of the solution. The data from the underlying population may come in diverse formats, structured or unstructured, such as probability distributions, text tokens, or graph characteristics. Depending on the defining features of the problem we can choose the appropriate complexity penalized approach and assess the quality of the estimate it produces. Favorable characteristics are strong theoretical guarantees of closeness to the true value and interpretability. Our work fits within this framework and spans the areas of simulation optimization, text mining and network inference.

    The first problem we consider is model calibration, under the assumption that, given a hypothesized input model, we can use stochastic simulation to obtain its corresponding output observations. We formulate it as a stochastic program by maximizing the entropy of the input distribution subject to moment matching. We then propose an iterative scheme via simulation to approximately solve it. We prove convergence of the proposed algorithm under appropriate conditions and demonstrate its performance via numerical studies.

    The second problem we consider is summarizing text documents through an inferred set of topics. We propose a frequentist reformulation of a Bayesian regularization scheme. Through our complexity-penalized perspective we lend further insight into the nature of the loss function and the regularization achieved through the priors in the Bayesian formulation.

    The third problem is concerned with the impact of sampling on the degree distribution of a network. Under many sampling designs, we have a linear inverse problem characterized by an ill-conditioned matrix. We investigate the theoretical properties of an approximate solution for the degree distribution found by regularizing the solution of the ill-conditioned least squares objective. In particular, we study the rate at which the penalized solution tends to the true value as a function of network size and sampling rate.
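    For the third problem, the flavour of regularizing an ill-conditioned least-squares objective can be illustrated with a small sketch. The sampling operator, the regularisation form (plain Tikhonov/ridge) and all names below are hypothetical stand-ins, not the thesis's actual estimator or penalty.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Hypothetical ill-conditioned sampling operator A mapping a true degree
# distribution to the expected sampled-degree distribution; we build one with
# rapidly decaying singular values (condition number ~1e8).
U, _ = np.linalg.qr(rng.normal(size=(n, n)))
V, _ = np.linalg.qr(rng.normal(size=(n, n)))
s = np.logspace(0, -8, n)
A = U @ np.diag(s) @ V.T

p_true = rng.dirichlet(np.ones(n))             # a "true" degree distribution
q = A @ p_true + 1e-6 * rng.normal(size=n)     # noisy observed distribution

# Naive least squares amplifies noise through the tiny singular values.
p_naive = np.linalg.lstsq(A, q, rcond=None)[0]

# Tikhonov (ridge) regularisation: minimise ||A p - q||^2 + lam * ||p||^2,
# solved in closed form via the normal equations.
lam = 1e-4
p_reg = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ q)

print("naive error:", np.linalg.norm(p_naive - p_true))
print("ridge error:", np.linalg.norm(p_reg - p_true))
```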

    Cosmic cartography

    The cosmic origin and evolution is encoded in the large-scale matter distribution observed in astronomical surveys. Galaxy redshift surveys have in recent years become one of the best probes of cosmic large-scale structure. They are complementary to other information sources like the cosmic microwave background, since they trace a different epoch of the Universe, the time after reionization at which the Universe became transparent, covering about the last twelve billion years. Given that the Universe is about thirteen billion years old, galaxy surveys cover a huge range of time, even if the sensitivity limitations of the detectors do not permit reaching the furthermost sources in the transparent Universe. This makes galaxy surveys extremely interesting for cosmological evolution studies. The observables (galaxy position in the sky, galaxy magnitude and redshift) nevertheless give an incomplete representation of the real structures in the Universe, not only due to the limitations and uncertainties in the measurements, but also due to their biased nature. They trace the underlying continuous dark matter field only partially, being a discrete sample of the luminous baryonic distribution. In addition, galaxy catalogues are plagued by many complications; some have a physical foundation, as mentioned before, others are due to the observation process. The problem of reconstructing the underlying density field, which permits cosmological studies, thus requires a statistical approach.

    This thesis describes a cosmic cartography project. The necessary concepts, mathematical framework, and numerical algorithms are thoroughly analyzed. On that basis a Bayesian software tool is implemented. The resulting Argo code makes it possible to investigate the characteristics of the large-scale cosmological structure with unprecedented accuracy and flexibility. This is achieved by jointly estimating the large-scale density along with a variety of other parameters (such as the cosmic flow, the small-scale peculiar velocity field, and the power spectrum) from the information provided by galaxy redshift surveys. Furthermore, Argo is capable of dealing with many observational issues like mask effects, galaxy selection criteria, blurring and noise through a very efficient implementation of an operator-based formalism carefully derived for this purpose. Thanks to the high efficiency of Argo, the application of iterative sampling algorithms based on Markov Chain Monte Carlo is now possible. This will ultimately lead to a full description of the matter distribution with all its relevant parameters, like velocities, power spectra and galaxy bias, including the associated uncertainties.

    Some applications of such techniques are shown. A rejection sampling scheme is successfully applied to correct for the observational redshift-distortion effect, which is especially severe in regimes of non-linear structure formation, causing the so-called finger-of-god effect. A Gibbs-sampling algorithm for power-spectrum determination is also presented, with preliminary results in which the correct level and shape of the power spectrum are recovered solely from the data.

    In an additional appendix we present the gravitational collapse and subsequent neutrino-driven explosion of the low-mass end of stars that undergo core-collapse supernovae. We obtain results that are, for the first time, compatible with the Crab Nebula.
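    To make the Gibbs-sampling idea concrete, below is a toy one-dimensional sketch that alternates between drawing the signal given the data and the current power spectrum (a Wiener-filter mean plus a fluctuation) and drawing the spectrum given the signal (an inverse-gamma step under a Jeffreys prior). The data model, the prior, the binning and every name are illustrative assumptions, not the Argo implementation itself.

```python
import numpy as np

rng = np.random.default_rng(1)

n_bins, modes_per_bin = 32, 8                  # spectrum bins, modes sharing each bin
P_true = 1.0 / np.arange(1, n_bins + 1) ** 2   # toy "power spectrum"
noise_var = 1e-3

# Simulate data mode by mode: d = s + n, with s ~ N(0, P) per mode (real-valued modes).
P_mode_true = np.repeat(P_true, modes_per_bin)
s_true = rng.normal(0.0, np.sqrt(P_mode_true))
d = s_true + rng.normal(0.0, np.sqrt(noise_var), size=s_true.size)

P = np.ones(n_bins)                            # initial spectrum guess
n_steps, burn_in = 3000, 1000
samples = np.zeros((n_steps, n_bins))

for step in range(n_steps):
    # 1) Signal given data and current spectrum: per-mode Wiener mean + fluctuation.
    P_mode = np.repeat(P, modes_per_bin)
    post_var = 1.0 / (1.0 / P_mode + 1.0 / noise_var)
    post_mean = post_var * d / noise_var
    s = post_mean + rng.normal(size=d.size) * np.sqrt(post_var)

    # 2) Spectrum given signal: under a Jeffreys prior each bin's conditional is
    #    inverse-gamma, sampled as (sum of s^2 in the bin) / chi^2_{modes_per_bin}.
    ssq = (s.reshape(n_bins, modes_per_bin) ** 2).sum(axis=1)
    P = ssq / rng.chisquare(df=modes_per_bin, size=n_bins)

    samples[step] = P

P_est = samples[burn_in:].mean(axis=0)         # posterior-mean spectrum after burn-in
print("estimated:", np.round(P_est[:5], 3), "true:", np.round(P_true[:5], 3))
```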