95 research outputs found

    Error-constrained filtering for a class of nonlinear time-varying delay systems with non-Gaussian noises

    Get PDF
    In this technical note, the quadratic error-constrained filtering problem is formulated and investigated for discrete time-varying nonlinear systems with state delays and non-Gaussian noises. Both Lipschitz-like and ellipsoid-bounded nonlinearities are considered. The non-Gaussian noises are assumed to be unknown, bounded, and confined to specified ellipsoidal sets. The aim of the addressed filtering problem is to develop a recursive algorithm, based on the semi-definite programming method, such that, for all admissible time delays, nonlinear parameters, and external bounded noise disturbances, the quadratic estimation error is no more than a certain optimized upper bound at every time step. The filter parameters are characterized in terms of the solution to a convex optimization problem that can be easily solved by the semi-definite programming method. A simulation example illustrates the effectiveness of the proposed design procedures. This work was supported in part by the Leverhulme Trust of the U.K., the Engineering and Physical Sciences Research Council (EPSRC) of the U.K. under Grant GR/S27658/01, the Royal Society of the U.K., the National Natural Science Foundation of China under Grant 61028008 and Grant 61074016, the Shanghai Natural Science Foundation of China under Grant 10ZR1421200, and the Alexander von Humboldt Foundation of Germany. Recommended by Associate Editor E. Fabre.
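
    For intuition, here is a minimal sketch of the ellipsoidal over-bounding that underlies such set-membership recursions: a state ellipsoid is propagated through one linear step, the Minkowski sum with the noise ellipsoid is over-bounded by a parametrized ellipsoid, and the free parameter is chosen to minimize the trace of the bound. The system and shape matrices are illustrative assumptions; the paper's full algorithm (delays, nonlinearities, SDP-optimized bounds) is not reproduced here.

    ```python
    import numpy as np
    from scipy.optimize import minimize_scalar

    def ellipsoid_prediction(A, P_prev, Q):
        """One set-membership prediction step for x_{k+1} = A x_k + w_k with
        w_k confined to the ellipsoid shaped by Q. The Minkowski sum of the
        propagated state ellipsoid and the noise ellipsoid is over-bounded by
        P(a) = A P_prev A^T / (1 - a) + Q / a, with a in (0, 1) chosen to
        minimize the trace of the bound."""
        APA = A @ P_prev @ A.T

        def cost(a):
            return np.trace(APA) / (1.0 - a) + np.trace(Q) / a

        a = minimize_scalar(cost, bounds=(1e-6, 1 - 1e-6), method="bounded").x
        return APA / (1.0 - a) + Q / a

    # Hypothetical 2-D example.
    A = np.array([[0.9, 0.1], [0.0, 0.8]])
    P_next = ellipsoid_prediction(A, np.eye(2), 0.1 * np.eye(2))
    ```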

    Computational Optimal Transport and Filtering on Riemannian manifolds

    Full text link
    In this paper we extend recent developments in computational optimal transport to the setting of Riemannian manifolds. In particular, we show how to learn optimal transport maps from samples that relate probability distributions defined on manifolds. Specializing these maps for sampling conditional probability distributions provides an ensemble approach for solving nonlinear filtering problems defined on such geometries. The proposed computational methodology is illustrated with examples of transport and nonlinear filtering on Lie groups, including the circle $S^1$, the special Euclidean group $SE(2)$, and the special orthogonal group $SO(3)$. Comment: 6 pages, 7 figures
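
    As a small concrete instance of transport on a manifold, the sketch below estimates the discrete optimal coupling between two empirical measures on the circle $S^1$, using the squared geodesic (angular) distance as cost. The sample size and rotation are illustrative assumptions, and this exact discrete solver is a baseline, not the learned-map approach of the paper.

    ```python
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    rng = np.random.default_rng(0)
    theta_src = rng.uniform(0.0, 2 * np.pi, 200)      # source samples on S^1
    theta_tgt = np.mod(theta_src + 0.5, 2 * np.pi)    # target: rotated copy

    diff = np.abs(theta_src[:, None] - theta_tgt[None, :])
    cost = np.minimum(diff, 2 * np.pi - diff) ** 2    # squared geodesic distance
    row, col = linear_sum_assignment(cost)            # optimal permutation coupling
    w2 = np.sqrt(cost[row, col].mean())               # Wasserstein-2 estimate
    print(f"W2 on the circle: {w2:.3f}")              # ~0.5 for this rotation
    ```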

    Motion-capture-based hand gesture recognition for computing and control

    Get PDF
    This dissertation focuses on the study and development of algorithms that enable the analysis and recognition of hand gestures in a motion capture environment. Central to this work is the study of unlabeled point sets in a more abstract sense. Evaluations of proposed methods focus on examining their generalization to users not encountered during system training. In an initial exploratory study, we compare various classification algorithms based upon multiple interpretations and feature transformations of point sets, including those based upon aggregate features (e.g. mean) and a pseudo-rasterization of the capture space. We find aggregate feature classifiers to be balanced across multiple users but relatively limited in maximum achievable accuracy. Certain classifiers based upon the pseudo-rasterization performed best among tested classification algorithms. We follow this study with targeted examinations of certain subproblems. For the first subproblem, we introduce the a fortiori expectation-maximization (AFEM) algorithm for computing the parameters of a distribution from which unlabeled, correlated point sets are presumed to be generated. Each unlabeled point is assumed to correspond to a target with independent probability of appearance but correlated positions. We propose replacing the expectation phase of the algorithm with a Kalman filter modified within a Bayesian framework to account for the unknown point labels, which manifest as uncertain measurement matrices. We also propose a mechanism to reorder the measurements in order to improve parameter estimates. In addition, we use a state-of-the-art Markov chain Monte Carlo sampler to efficiently sample measurement matrices. In the process, we indirectly propose a constrained k-means clustering algorithm. Simulations verify the utility of AFEM against a traditional expectation-maximization algorithm in a variety of scenarios. In the second subproblem, we consider the application of positive definite kernels and the earth mover's distance (EMD) to our work. Positive definite kernels are an important tool in machine learning that enable efficient solutions to otherwise difficult or intractable problems by implicitly linearizing the problem geometry. We develop a set-theoretic interpretation of EMD and propose earth mover's intersection (EMI), a positive definite analog to EMD. We offer proof of EMD's negative definiteness and provide necessary and sufficient conditions for EMD to be conditionally negative definite, including approximations that guarantee negative definiteness. In particular, we show that EMD is related to various min-like kernels. We also present a positive definite preserving transformation that can be applied to any kernel and can be used to derive positive definite EMD-based kernels, and we show that the Jaccard index is simply the result of this transformation applied to set intersection. We evaluate kernels based on EMI and the proposed transformation versus EMD in various computer vision tasks and show that EMD is generally inferior even with indefinite kernel techniques. Finally, we apply deep learning to our problem. We propose neural network architectures for hand posture and gesture recognition from unlabeled marker sets in a coordinate system local to the hand. As a means of ensuring data integrity, we also propose an extended Kalman filter for tracking the rigid pattern of markers on which the local coordinate system is based.
    We consider fixed- and variable-size architectures, including convolutional and recurrent neural networks, that accept unlabeled marker input. We also consider a data-driven approach to labeling markers with a neural network and a collection of Kalman filters. Experimental evaluations with posture and gesture datasets show promising results for the proposed architectures with unlabeled markers, which outperform the alternative data-driven labeling method.
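
    The dissertation's fixed- and variable-size architectures are not reproduced here, but the sketch below shows one generic way to make a classifier invariant to the ordering of unlabeled markers: embed each 3-D marker independently and pool with a symmetric function (a DeepSets-style construction). Layer sizes and the class count are illustrative assumptions.

    ```python
    import torch
    import torch.nn as nn

    class MarkerSetNet(nn.Module):
        """Permutation-invariant classifier for unlabeled 3-D marker sets."""
        def __init__(self, n_classes: int, hidden: int = 64):
            super().__init__()
            # phi embeds each marker independently; mean-pooling removes any
            # dependence on marker order; rho maps the pooled code to classes.
            self.phi = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden), nn.ReLU())
            self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                     nn.Linear(hidden, n_classes))

        def forward(self, markers: torch.Tensor) -> torch.Tensor:
            # markers: (batch, n_markers, 3) in the hand-local coordinate frame
            return self.rho(self.phi(markers).mean(dim=1))

    logits = MarkerSetNet(n_classes=10)(torch.randn(8, 12, 3))  # 12 markers
    ```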

    Convexity Conditions of Kantorovich Function and Related Semi-infinite Linear Matrix Inequalities

    Full text link
    The Kantorovich function $(x^T A x)(x^T A^{-1} x)$, where $A$ is a positive definite matrix, is not convex in general. From a matrix/convex analysis point of view, it is interesting to ask: when is this function convex? In this paper, we investigate the convexity of this function via the condition number of its matrix. In 2-dimensional space, we prove that the Kantorovich function is convex if and only if the condition number of its matrix is bounded above by $3+2\sqrt{2}$, and thus the convexity of the function with two variables can be completely characterized by the condition number. The upper bound $3+2\sqrt{2}$ turns out to be a necessary condition for the convexity of Kantorovich functions in any finite-dimensional space. We also point out that when the condition number of the matrix (which can be of any dimension) is less than or equal to $\sqrt{5+2\sqrt{6}}$, the Kantorovich function is convex. Furthermore, we prove that this general sufficient convexity condition can be remarkably improved in 3-dimensional space. Our analysis shows that the convexity of the function is closely related to some modern optimization topics such as semi-infinite linear matrix inequalities and the 'robust positive semi-definiteness' of symmetric matrices. In fact, our main result for the 3-dimensional case has been proved by finding an explicit solution range to some semi-infinite linear matrix inequalities. Comment: 24 pages
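
    The 2-dimensional characterization is easy to probe numerically. The sketch below evaluates the closed-form Hessian of $f(x) = (x^T A x)(x^T A^{-1} x)$ at random points and tests positive semi-definiteness on either side of the $3+2\sqrt{2} \approx 5.83$ threshold. The sampling check is a heuristic illustration of the theorem, not a proof.

    ```python
    import numpy as np

    def kantorovich_hessian(A, x):
        """Closed-form Hessian of f(x) = (x^T A x)(x^T A^{-1} x)."""
        B = np.linalg.inv(A)
        Ax, Bx = A @ x, B @ x
        return (2.0 * (x @ Bx) * A + 2.0 * (x @ Ax) * B
                + 4.0 * (np.outer(Ax, Bx) + np.outer(Bx, Ax)))

    def seems_convex(A, trials=5000, seed=0):
        """Heuristic: sample points and test the Hessian for semi-definiteness."""
        rng = np.random.default_rng(seed)
        for _ in range(trials):
            x = rng.standard_normal(A.shape[0])
            if np.linalg.eigvalsh(kantorovich_hessian(A, x))[0] < -1e-9:
                return False
        return True

    # cond = 5 < 3 + 2*sqrt(2) -> convex; cond = 10 > 3 + 2*sqrt(2) -> not.
    print(seems_convex(np.diag([5.0, 1.0])), seems_convex(np.diag([10.0, 1.0])))
    ```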

    Learning and inference with Wasserstein metrics

    Get PDF
    Thesis: Ph.D., Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 131-143). This thesis develops new approaches for three problems in machine learning, using tools from the study of optimal transport (or Wasserstein) distances between probability distributions. Optimal transport distances capture an intuitive notion of similarity between distributions, by incorporating the underlying geometry of the domain of the distributions. Despite their intuitive appeal, optimal transport distances are often difficult to apply in practice, as computing them requires solving a costly optimization problem. In each setting studied here, we describe a numerical method that overcomes this computational bottleneck and enables scaling to real data. In the first part, we consider the problem of multi-output learning in the presence of a metric on the output domain. We develop a loss function that measures the Wasserstein distance between the prediction and ground truth, and describe an efficient learning algorithm based on entropic regularization of the optimal transport problem. We additionally propose a novel extension of the Wasserstein distance from probability measures to unnormalized measures, which is applicable in settings where the ground truth is not naturally expressed as a probability distribution. We show statistical learning bounds for both the Wasserstein loss and its unnormalized counterpart. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data image tagging problem, outperforming a baseline that doesn't use the metric. In the second part, we consider the probabilistic inference problem for diffusion processes. Such processes model a variety of stochastic phenomena and appear often in continuous-time state space models. Exact inference for diffusion processes is generally intractable. In this work, we describe a novel approximate inference method, which is based on a characterization of the diffusion as following a gradient flow in a space of probability densities endowed with a Wasserstein metric. Existing methods for computing this Wasserstein gradient flow rely on discretizing the underlying domain of the diffusion, prohibiting their application to problems in more than several dimensions. In the current work, we propose a novel algorithm for computing a Wasserstein gradient flow that operates directly in a space of continuous functions, free of any underlying mesh. We apply our approximate gradient flow to the problem of filtering a diffusion, showing superior performance where standard filters struggle. Finally, we study the ecological inference problem, which is that of reasoning from aggregate measurements of a population to inferences about the individual behaviors of its members. This problem arises often when dealing with data from economics and political sciences, such as when attempting to infer the demographic breakdown of votes for each political party, given only the aggregate demographic and vote counts separately. Ecological inference is generally ill-posed, and requires prior information to distinguish a unique solution. We propose a novel, general framework for ecological inference that allows for a variety of priors and enables efficient computation of the most probable solution.
    Unlike previous methods, which rely on Monte Carlo estimates of the posterior, our inference procedure uses an efficient fixed point iteration that is linearly convergent. Given suitable prior information, our method can achieve more accurate inferences than existing methods. We additionally explore a sampling algorithm for estimating credible regions. By Charles Frogner. Ph.D.
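
    The entropic regularization used for the Wasserstein loss admits a very short solver. Below is a minimal Sinkhorn sketch for discrete measures; the grid, regularization strength, and iteration count are illustrative assumptions, and the thesis's learning algorithms build on this primitive rather than coincide with it.

    ```python
    import numpy as np

    def sinkhorn_cost(a, b, C, reg=0.05, n_iter=500):
        """Entropic-regularized OT between histograms a and b with cost C,
        via alternating (Sinkhorn) scalings of the Gibbs kernel exp(-C/reg)."""
        K = np.exp(-C / reg)
        u = np.ones_like(a)
        for _ in range(n_iter):
            v = b / (K.T @ u)
            u = a / (K @ v)
        P = u[:, None] * K * v[None, :]   # approximate optimal transport plan
        return float((P * C).sum())

    # Two Gaussian-like histograms on a 1-D grid, squared-distance cost.
    x = np.linspace(0.0, 1.0, 50)
    C = (x[:, None] - x[None, :]) ** 2
    a = np.exp(-((x - 0.3) ** 2) / 0.01); a /= a.sum()
    b = np.exp(-((x - 0.7) ** 2) / 0.01); b /= b.sum()
    print(sinkhorn_cost(a, b, C))         # near (0.7 - 0.3)^2 = 0.16
    ```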

    Chandrasekhar-based maximum correntropy Kalman filtering with the adaptive kernel size selection

    Full text link
    This technical note derives the Chandrasekhar-type recursion for maximum correntropy criterion (MCC) Kalman filtering (KF). For the classical KF, the first Chandrasekhar difference equation was proposed in the early 1970s. It is an alternative to the traditionally used Riccati recursion and yields the so-called fast implementations known as the Morf-Sidhu-Kailath-Sayed KF algorithms. These are computationally cheap because they propagate matrices of smaller size than the $n \times n$ error covariance matrix in the Riccati recursion. The problem of deriving the Chandrasekhar-type recursion within the MCC estimation methodology had not previously been addressed in the engineering literature. In this technical note, we take the first step and derive the Chandrasekhar MCC-KF estimators for the case of an adaptive kernel size selection strategy, which implies a constant scalar adjusting weight. Numerical examples substantiate the practical feasibility of the newly suggested MCC-KF implementations and the correctness of the presented theoretical derivations.
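
    For reference, the sketch below shows a common Riccati-form MCC-KF measurement update in which the innovation is weighted by a Gaussian correntropy kernel; as the kernel size grows, it recovers the standard Kalman update. This is a simplified illustration of the underlying filter, not the Chandrasekhar-type recursion derived in the note, and the fixed kernel size is an assumption.

    ```python
    import numpy as np

    def mcc_kf_update(x_pred, P_pred, y, H, R, sigma=2.0):
        """One MCC-KF measurement update with fixed kernel size sigma."""
        innov = y - H @ x_pred
        # Scalar correntropy weight of the R^{-1}-normalized innovation;
        # lam -> 1 as sigma -> infinity (standard Kalman filter limit).
        lam = float(np.exp(-innov @ np.linalg.solve(R, innov) / (2.0 * sigma**2)))
        S = np.linalg.inv(np.linalg.inv(P_pred) + lam * H.T @ np.linalg.solve(R, H))
        K = lam * S @ H.T @ np.linalg.inv(R)          # correntropy-weighted gain
        x_new = x_pred + K @ innov
        I_KH = np.eye(len(x_pred)) - K @ H
        P_new = I_KH @ P_pred @ I_KH.T + K @ R @ K.T  # Joseph-form update
        return x_new, P_new
    ```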

    Learning Dynamics from Data Using Optimal Transport Techniques and Applications

    Get PDF
    Optimal transport (OT) has been studied widely in recent years; the concept of Wasserstein distance has found many applications in computational mathematics, machine learning, engineering, and even finance. Meanwhile, as the amount of data and the need to use it grow rapidly, data-driven models show great potential in real-world applications. In this thesis, we apply the theory of OT and design data-driven algorithms to formulate and compute various OT problems. We also build a framework to learn the inverse OT problem. Furthermore, we develop OT and deep learning based models to solve problems related to stochastic differential equations, optimal control, and mean field games, all in data-driven settings. In Chapter 2, we provide the mathematical concepts and results that form the basis of this thesis, with brief surveys of optimal transport, stochastic differential equations, Fokker-Planck equations, deep learning, optimal control, and mean field games. Chapters 3 to 5 present several scalable algorithms that handle optimal transport problems in different settings. Specifically, Chapter 3 shows a new saddle scheme and learning strategy for computing the Wasserstein geodesic, as well as the Wasserstein distance and OT map between two probability distributions in high dimensions. We parametrize the map and Lagrange multipliers as neural networks, and we demonstrate the performance of our algorithms through a series of experiments with both synthetic and realistic data. Chapter 4 presents a scalable algorithm for computing the Monge map between two probability distributions, since computing Monge maps remains challenging in spite of the rapid development of numerical methods for optimal transport. Similarly, we formulate the problem as a mini-max problem and solve it via deep learning; the performance of our algorithm is demonstrated through a series of experiments with both synthetic and realistic data. In Chapter 5 we study the OT problem from an inverse view, which we call the inverse OT (IOT) problem: learning the cost function for OT from an observed transport plan or its samples. We derive an unconstrained convex optimization formulation of the inverse OT problem and provide a comprehensive characterization of the properties of inverse OT, including uniqueness of solutions. We also develop two numerical algorithms: a fast matrix scaling method based on the Sinkhorn-Knopp algorithm for discrete OT, and a learning-based algorithm that parametrizes the cost function as a deep neural network for continuous OT. Our numerical results demonstrate promising efficiency and accuracy advantages of the proposed algorithms over existing state-of-the-art methods. In Chapter 6 we propose a novel method that uses the weak form of the Fokker-Planck equation (FPE), a partial differential equation, to describe the density evolution of data in sampled form, which is then combined with a Wasserstein generative adversarial network (WGAN) in the training process. In this sample-based framework we are able to learn the nonlinear dynamics from aggregate data without explicitly solving the FPE. We demonstrate our approach on a series of synthetic and real-world data sets. Chapter 7 introduces the application of OT and neural networks to optimal density control. In particular, we parametrize the control strategy via neural networks and provide an algorithm to learn a strategy that can drive samples from one distribution to new locations following a target distribution. We demonstrate our method in both synthetic and realistic experiments, where we also consider perturbation fields. Finally, Chapter 8 presents applications of mean field games in generative modeling and finance. In more detail, we build a GAN framework upon mean field games to generate a desired distribution starting from white noise, and we investigate its connection to OT. Moreover, we apply mean field game theory to study the equilibrium trading price in stock markets, and we demonstrate the theoretical results by conducting experiments on real trading data. Ph.D.
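
    For intuition on the Monge map problem of Chapter 4, the 1-D case has a closed form: the optimal map for convex costs is the target quantile function composed with the source CDF. The empirical sketch below builds this map from samples; it is a baseline for intuition under that 1-D assumption, not the thesis's high-dimensional mini-max scheme.

    ```python
    import numpy as np

    def monge_map_1d(src_samples, tgt_samples):
        """Empirical 1-D Monge map T = F_tgt^{-1} o F_src built from samples."""
        src_sorted = np.sort(src_samples)
        tgt_sorted = np.sort(tgt_samples)

        def T(x):
            q = np.searchsorted(src_sorted, x) / len(src_sorted)   # source CDF
            return np.quantile(tgt_sorted, np.clip(q, 0.0, 1.0))   # target quantile
        return T

    rng = np.random.default_rng(0)
    T = monge_map_1d(rng.standard_normal(5000),
                     2.0 * rng.standard_normal(5000) + 1.0)
    print(T(0.0))  # maps the source median near the target median, ~1.0
    ```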