125 research outputs found

    Motion-capture-based hand gesture recognition for computing and control

    Get PDF
    This dissertation focuses on the study and development of algorithms that enable the analysis and recognition of hand gestures in a motion capture environment. Central to this work is the study of unlabeled point sets in a more abstract sense. Evaluations of proposed methods focus on examining their generalization to users not encountered during system training. In an initial exploratory study, we compare various classification algorithms based upon multiple interpretations and feature transformations of point sets, including those based upon aggregate features (e.g. mean) and a pseudo-rasterization of the capture space. We find aggregate feature classifiers to be balanced across multiple users but relatively limited in maximum achievable accuracy. Certain classifiers based upon the pseudo-rasterization performed best among tested classification algorithms. We follow this study with targeted examinations of certain subproblems. For the first subproblem, we introduce the a fortiori expectation-maximization (AFEM) algorithm for computing the parameters of a distribution from which unlabeled, correlated point sets are presumed to be generated. Each unlabeled point is assumed to correspond to a target with independent probability of appearance but correlated positions. We propose replacing the expectation phase of the algorithm with a Kalman filter modified within a Bayesian framework to account for the unknown point labels which manifest as uncertain measurement matrices. We also propose a mechanism to reorder the measurements in order to improve parameter estimates. In addition, we use a state-of-the-art Markov chain Monte Carlo sampler to efficiently sample measurement matrices. In the process, we indirectly propose a constrained k-means clustering algorithm. Simulations verify the utility of AFEM against a traditional expectation-maximization algorithm in a variety of scenarios. In the second subproblem, we consider the application of positive definite kernels and the earth mover\u27s distance (END) to our work. Positive definite kernels are an important tool in machine learning that enable efficient solutions to otherwise difficult or intractable problems by implicitly linearizing the problem geometry. We develop a set-theoretic interpretation of ENID and propose earth mover\u27s intersection (EMI). a positive definite analog to ENID. We offer proof of EMD\u27s negative definiteness and provide necessary and sufficient conditions for ENID to be conditionally negative definite, including approximations that guarantee negative definiteness. In particular, we show that ENID is related to various min-like kernels. We also present a positive definite preserving transformation that can be applied to any kernel and can be used to derive positive definite EMD-based kernels, and we show that the Jaccard index is simply the result of this transformation applied to set intersection. Finally, we evaluate kernels based on EMI and the proposed transformation versus ENID in various computer vision tasks and show that END is generally inferior even with indefinite kernel techniques. Finally, we apply deep learning to our problem. We propose neural network architectures for hand posture and gesture recognition from unlabeled marker sets in a coordinate system local to the hand. As a means of ensuring data integrity, we also propose an extended Kalman filter for tracking the rigid pattern of markers on which the local coordinate system is based. We consider fixed- and variable-size architectures including convolutional and recurrent neural networks that accept unlabeled marker input. We also consider a data-driven approach to labeling markers with a neural network and a collection of Kalman filters. Experimental evaluations with posture and gesture datasets show promising results for the proposed architectures with unlabeled markers, which outperform the alternative data-driven labeling method

    Privacy, Space and Time: a Survey on Privacy-Preserving Continuous Data Publishing

    Get PDF
    Sensors, portable devices, and location-based services, generate massive amounts of geo-tagged, and/or location- and user-related data on a daily basis. The manipulation of such data is useful in numerous application domains, e.g., healthcare, intelligent buildings, and traffic monitoring, to name a few. A high percentage of these data carry information of users\u27 activities and other personal details, and thus their manipulation and sharing arise concerns about the privacy of the individuals involved. To enable the secure—from the users\u27 privacy perspective—data sharing, researchers have already proposed various seminal techniques for the protection of users\u27 privacy. However, the continuous fashion in which data are generated nowadays, and the high availability of external sources of information, pose more threats and add extra challenges to the problem. In this survey, we visit the works done on data privacy for continuous data publishing, and report on the proposed solutions, with a special focus on solutions concerning location or geo-referenced data

    Computation and Consistent Estimation of Stationary Optimal Transport Plans

    Get PDF
    Informally, the optimal transport (OT) problem is to align, or couple, two distributions of interest as best as possible with respect to some prespecified cost. A coupling that achieves the minimum cost among all couplings is referred to as an OT plan; the cost of the OT plan is referred to as the OT cost. Researchers in statistics and machine learning have expended a great deal of effort to understand the properties of OT plans and costs. The motivation for this work stems partly from the fact that, unlike many other divergence measures and metrics between distributions, OT plans and costs describe relationships between distributions in a manner that respects the geometry of the underlying space (by way of the specified cost). However, this advantage does not necessarily carry over when standard OT techniques are applied to distributions with specific structure. In the case that the two distributions describe stationary stochastic processes, the OT problem may ignore the differences in the sequential dependence of either process. One must find a way to make the OT problem account for the stationary dependence of the marginal processes. In this thesis, we study OT for stationary processes, a field that we refer to as stationary optimal transport. Through example and theory, we argue that when applying OT to stationary processes, one should incorporate the stationarity into the problem directly -- constraining the set of allowed transport plans to those that are stationary themselves. In this way, we only consider transport plans that respect the dependence structure of the marginal processes. We study this constrained OT problem from statistical and computational perspectives, with an eye toward applications in machine learning and data science. In particular, we develop algorithms for computing stationary OT plans of Markov chains, extend these tools for Markov OT to the alignment and comparison of weighted graphs, and propose estimates of stationary OT plans based on finite sequences of observations. We build upon existing techniques in OT as well as draw from a variety of fields including Markov decision processes, graph theory, and ergodic theory. In doing this, we uncover new perspectives on OT and pave the way for additional applications and approaches in future work.Doctor of Philosoph

    Deep latent-variable models for neural text generation

    Get PDF
    Text generation aims to produce human-like natural language output for down-stream tasks. It covers a wide range of applications like machine translation, document summarization, dialogue generation and so on. Recently deep neural network-based end-to-end architectures are known to be data-hungry, and text generated from them usually suffer from low diversity, interpretability and controllability. As a result, it is difficult to trust the output from them in real-life applications. Deep latent-variable models, by specifying the probabilistic distribution over an intermediate latent process, provide a potential way of addressing these problems while maintaining the expressive power of deep neural networks. This presentation will explain how deep latent-variable models can improve over the standard encoder-decoder model for text generation. We will start from an introduction of encoder-decoder and deep latent-variable models, then go over popular optimization strategies, and finally elaborate on how latent variable models can help improve the diversity, interpretability and data efficiency in different applications of text generation tasks.Textgenerierung zielt darauf ab, eine menschenähnliche Textausgabe in natürlicher Sprache für Anwendungen zu erzeugen. Es deckt eine breite Palette von Anwendungen ab, wie maschinelle Übersetzung, Zusammenfassung von Dokumenten, Generierung von Dialogen usw. In letzter Zeit werden dafür hauptsächlich Endto- End-Architekturen auf der Basis von tiefen neuronalen Netzwerken verwendet. Der End-to-End-Ansatz fasst alle Submodule, die früher nach komplexen handgefertigten Regeln entworfen wurden, zu einer ganzheitlichen Codierungs- Decodierungs-Architektur zusammen. Bei ausreichenden Trainingsdaten kann eine Leistung auf dem neuesten Stand der Technik erzielt werden, ohne dass sprach- und domänenabhängiges Wissen erforderlich ist. Deep-Learning-Modelle sind jedoch als extrem datenhungrig bekannt und daraus generierter Text leidet normalerweise unter geringer Diversität, Interpretierbarkeit und Kontrollierbarkeit. Infolgedessen ist es schwierig, der Ausgabe von ihnen in realen Anwendungen zu vertrauen. Tiefe Modelle mit latenten Variablen bieten durch Angabe der Wahrscheinlichkeitsverteilung über einen latenten Zwischenprozess eine potenzielle Möglichkeit, diese Probleme zu lösen und gleichzeitig die Ausdruckskraft tiefer neuronaler Netze zu erhalten. Diese Dissertation zeigt, wie tiefe Modelle mit latenten Variablen Texterzeugung verbessern gegenüber dem üblichen Encoder-Decoder-Modell. Wir beginnen mit einer Einführung in Encoder-Decoder- und Deep Latent Variable-Modelle und gehen dann auf gängige Optimierungsstrategien wie Variationsinferenz, dynamische Programmierung, Soft Relaxation und Reinforcement Learning ein. Danach präsentieren wir Folgendes: 1. Wie latente Variablen Vielfalt der Texterzeugung verbessern können, indem ganzheitliche, latente Darstellungen auf Satzebene gelernt werden. Auf diese Weise kann zunächst eine latente Darstellung ausgewählt werden, aus der verschiedene Texte generiert werden können. Wir präsentieren effektive Algorithmen, um gleichzeitig das Lernen der Repräsentation und die Texterzeugung durch Variationsinferenz zu trainieren. Um die Einschränkungen der Variationsinferenz bezüglich Uni-Modalität und Inkonsistenz anzugehen, schlagen wir eine Wake-Sleep-Variation und ein auf Transinformation basierendes Trainingsziel vor. Experimente zeigen, dass sie sowohl die übliche Variationsinferenz als auch nicht-latente Variablenmodelle bei der Dialoggenerierung übertreffen. 2. Wie latente Variablen die Steuerbarkeit und Interpretierbarkeit der Texterzeugung verbessern können, indem feinkörnigere latente Spezifikationen zum Zwischengenerierungsprozess hinzugefügt werden. Wir veranschaulichen die Verwendung latenter Variablen für Wortausrichtung, Inhaltsauswahl, Textsegmentierung und Feldsegmentkorrespondenz. Wir leiten für sie effiziente Trainingsalgorithmen ab, damit die Texterzeugung explizit gesteuert werden kann, indem die latente Variable, die durch ihre Definition vom Menschen interpretiert werden kann, manipuliert wird. 3. Überwindung der Seltenheit von Trainingsmustern durch Behandlung von nicht parallelem Text als latente Variablen. Das Training kann wie beim Standard-EM-Algorithmus durchgeführt werden, der stabil konvergiert. Wir zeigen, dass es bei der Dialoggenerierung erfolgreich angewendet werden kann und den Generierungsraum durch die Verwendung von nicht-konversativem Text erheblich bereichert
    corecore