147 research outputs found
APPROXIMATION ALGORITHMS FOR POINT PATTERN MATCHING AND SEARCHING
Point pattern matching is a fundamental problem in computational geometry.
Given a reference set and a pattern set, the problem is to find a
geometric transformation applied to the pattern set that minimizes some
given distance measure with respect to the reference set. This problem has
been heavily researched under various distance measures and error models.
Point set similarity searching is a variation of this problem in which a
large database of point sets is given, and the task is to preprocess
this database into a data structure so that, given a query point set,
it is possible to rapidly find the nearest point set among elements of
the database. Here, the term nearest is understood in
the above sense of pattern matching, where the elements of the database may be
transformed to match the given query set. The approach presented here is
to compute a low distortion embedding of the pattern matching problem into
an (ideally) low dimensional metric space and then apply any standard
algorithm for nearest neighbor searching over this metric space.
The main focus of this dissertation is on two problems
in the area of point pattern matching and searching algorithms:
(i) improving the accuracy of alignment-based point pattern matching and
(ii) computing low-distortion embeddings of point sets into vector spaces.
For the first problem, new methods are presented for matching point sets
based on alignments of small subsets of points. It is shown that these methods
lead to better approximation bounds for alignment-based planar point pattern
matching algorithms under the Hausdorff distance. Furthermore, it is shown
that these approximation bounds are nearly the best achievable by alignment-based
methods.
For the second problem, results are presented for two different distance
measures. First, point pattern similarity search under translation for point sets
in multidimensional integer space is considered, where the distance function is
the symmetric difference. A randomized embedding into real space under the L1
metric is given. The algorithm achieves an expected distortion of O(log^2 n).
Second, an algorithm is given for embedding R^d under the Earth Mover's
Distance (EMD) into multidimensional integer space under the symmetric difference
distance. This embedding achieves a distortion of O(log D), where D is
the diameter of the point set. Combining this with the above result implies that
point pattern similarity search with translation under the EMD can be embedded
into real space in the L1 metric with an expected distortion of O(log^2 n log D).
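The Hausdorff distance named in the abstract above can be made concrete with a small sketch. This is only an illustration of the distance measure for planar point sets under a known translation, not the dissertation's alignment-based matching algorithm; the function names and the sample points are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Hypothetical pattern and reference sets in the plane.
pattern = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
reference = [(2.0, 3.0), (3.0, 3.0), (2.0, 4.5)]

# Apply a candidate translation to the pattern, then measure the match.
translated = [(x + 2.0, y + 3.0) for x, y in pattern]
distance = hausdorff(translated, reference)
```

A matching algorithm would search over transformations to minimize this value; here the translation is simply fixed by hand.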
Automated pattern analysis in gesture research : similarity measuring in 3D motion capture models of communicative action
The question of how to model similarity between gestures plays an important role in current studies in the domain of human communication. Most research into recurrent patterns in co-verbal gestures – manual communicative movements emerging spontaneously during conversation – is driven by qualitative analyses relying on observational comparisons between gestures. Due to the fact that these kinds of gestures are not bound to well-formedness conditions, however, we propose a quantitative approach consisting of a distance-based similarity model for gestures recorded and represented in motion capture data streams. To this end, we model gestures by flexible feature representations, namely gesture signatures, which are then compared via signature-based distance functions such as the Earth Mover's Distance and the Signature Quadratic Form Distance. Experiments on real conversational motion capture data evidence the appropriateness of the proposed approaches in terms of their accuracy and efficiency. Our contribution to gesture similarity research and gesture data analysis allows for new quantitative methods of identifying patterns of gestural movements in human face-to-face interaction, i.e., in complex multimodal data sets
Algorithms for multi-modal human movement and behaviour monitoring
This thesis describes investigations into improvements in the field of automated people tracking using multi-modal infrared (IR) and visible image information. The research question posed is: “To what extent can infrared image information be used to improve visible light based human tracking systems?” Automated passive tracking of human subjects is an active research area which has been approached in many ways. Typical approaches include the segmentation of the foreground, the location of humans, model initialisation and subject tracking. Sensor reliability evaluation and fusion methods are also key research areas in multi-modal systems. Shifting illumination and shadows can cause issues with visible images when attempting to extract foreground regions. Images from thermal IR cameras, which use long-wavelength infrared (LWIR) sensors, demonstrate high invariance to illumination. It is shown that thermal IR images often provide superior foreground masks using pixel level statistical extraction techniques in many scenarios. Experiments are performed to determine if cues are present at the data level that may indicate the quality of the sensor as an input. Modality specific measures are proposed as possible indicators of sensor quality (determined by foreground extraction capability). A sensor and application specific method for scene evaluation is proposed, whereby sensor quality is measured at the pixel level. A neuro-fuzzy inference system is trained using the scene quality measures to assess a series of scenes and make a modality decision
Sublinear algorithms for Earth Mover's Distance
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (p. 14-15). We study the problem of estimating the Earth Mover's Distance (EMD) between probability distributions when given access only to samples. We give closeness testers and additive-error estimators over domains in [0, [delta]]^d, with sample complexities independent of domain size - permitting the testability even of continuous distributions over infinite domains. Instead, our algorithms depend on other parameters, such as the diameter of the domain space, which may be significantly smaller. We also prove lower bounds showing our testers to be optimal in their dependence on these parameters. Additionally, we consider whether natural classes of distributions exist for which there are algorithms with better dependence on the dimension, and show that for highly clusterable data, this is indeed the case. Lastly, we consider a variant of the EMD, defined over tree metrics instead of the usual L₁ metric, and give optimal algorithms. by Khanh Do Ba. S.M.
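As a point of reference for the quantity being estimated, the sketch below computes the exact EMD between two equal-sized one-dimensional samples, where it reduces to the mean absolute difference of the sorted values. It is not the thesis's sublinear tester or estimator; the function name `emd_1d` and the sample values are assumptions made for illustration.

```python
def emd_1d(xs, ys):
    """Exact EMD between two equal-sized 1D samples, each point weighted 1/n.

    In one dimension the optimal transport plan matches the i-th smallest
    value of xs to the i-th smallest value of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)

# Hypothetical samples: the second is the first shifted by 5.
shifted = emd_1d([0.0, 1.0, 3.0], [5.0, 6.0, 8.0])
# The same multiset in a different order has distance zero.
same = emd_1d([1.0, 2.0], [2.0, 1.0])
```

This exact computation sorts every sample it is given, whereas the abstract concerns estimators whose sample complexity does not depend on the domain size.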
Model-based compressive sensing with Earth Mover's Distance constraints
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (p. 71-72). In compressive sensing, we want to recover ... from linear measurements of the form ... describes the measurement process. Standard results in compressive sensing show that it is possible to exactly recover the signal x from only m ... measurements for certain types of matrices. Model-based compressive sensing reduces the number of measurements even further by limiting the supports of x to a subset of the ... possible supports. Such a family of supports is called a structured sparsity model. In this thesis, we introduce a structured sparsity model for two-dimensional signals that have similar support in neighboring columns. We quantify the change in support between neighboring columns with the Earth Mover's Distance (EMD), which measures both how many elements of the support change and how far the supported elements move. We prove that for a reasonable limit on the EMD between adjacent columns, we can recover signals in our model from only ... measurements, where w is the width of the signal. This is an asymptotic improvement over the ... bound in standard compressive sensing. While developing the algorithmic tools for our proposed structured sparsity model, we also extend the model-based compressed sensing framework. In order to use a structured sparsity model in compressive sensing, we need a model projection algorithm that, given an arbitrary signal x, returns the best approximation in the model. We relax this constraint and develop a variant of IHT, an existing sparse recovery algorithm, that works with approximate model projection algorithms. by Ludwig Schmidt. S.M.
Piecewise Affine Registration of Biological Images for Volume Reconstruction
This manuscript tackles the reconstruction of 3D volumes via mono-modal registration of series of 2D biological images (histological sections, autoradiographs, cryosections, etc.). The process of acquiring these images typically induces composite transformations that we model as a number of rigid or affine local transformations embedded in an elastic one. We propose a registration approach closely derived from this model. Given a pair of input images, we first compute a dense similarity field between them with a block matching algorithm. We use as a similarity measure an extension of the classical correlation coefficient that improves the consistency of the field. A hierarchical clustering algorithm then automatically partitions the field into a number of classes from which we extract independent pairs of sub-images. Our clustering algorithm relies on the Earth mover’s distribution metric and is additionally guided by robust least-square estimation of the transformations associated with each cluster. Finally, the pairs of sub-images are, independently, affinely registered and a hybrid affine/non-linear interpolation scheme is used to compose the output registered image. We investigate the behavior of our approach on several batches of histological data and discuss its sensitivity to parameters and noise
Learning and inference with Wasserstein metrics
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 131-143). This thesis develops new approaches for three problems in machine learning, using tools from the study of optimal transport (or Wasserstein) distances between probability distributions. Optimal transport distances capture an intuitive notion of similarity between distributions, by incorporating the underlying geometry of the domain of the distributions. Despite their intuitive appeal, optimal transport distances are often difficult to apply in practice, as computing them requires solving a costly optimization problem. In each setting studied here, we describe a numerical method that overcomes this computational bottleneck and enables scaling to real data. In the first part, we consider the problem of multi-output learning in the presence of a metric on the output domain. We develop a loss function that measures the Wasserstein distance between the prediction and ground truth, and describe an efficient learning algorithm based on entropic regularization of the optimal transport problem. We additionally propose a novel extension of the Wasserstein distance from probability measures to unnormalized measures, which is applicable in settings where the ground truth is not naturally expressed as a probability distribution. We show statistical learning bounds for both the Wasserstein loss and its unnormalized counterpart. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data image tagging problem, outperforming a baseline that doesn't use the metric. In the second part, we consider the probabilistic inference problem for diffusion processes. Such processes model a variety of stochastic phenomena and appear often in continuous-time state space models.
Exact inference for diffusion processes is generally intractable. In this work, we describe a novel approximate inference method, which is based on a characterization of the diffusion as following a gradient flow in a space of probability densities endowed with a Wasserstein metric. Existing methods for computing this Wasserstein gradient flow rely on discretizing the underlying domain of the diffusion, prohibiting their application to problems in more than several dimensions. In the current work, we propose a novel algorithm for computing a Wasserstein gradient flow that operates directly in a space of continuous functions, free of any underlying mesh. We apply our approximate gradient flow to the problem of filtering a diffusion, showing superior performance where standard filters struggle. Finally, we study the ecological inference problem, which is that of reasoning from aggregate measurements of a population to inferences about the individual behaviors of its members. This problem arises often when dealing with data from economics and political sciences, such as when attempting to infer the demographic breakdown of votes for each political party, given only the aggregate demographic and vote counts separately. Ecological inference is generally ill-posed, and requires prior information to distinguish a unique solution. We propose a novel, general framework for ecological inference that allows for a variety of priors and enables efficient computation of the most probable solution. Unlike previous methods, which rely on Monte Carlo estimates of the posterior, our inference procedure uses an efficient fixed point iteration that is linearly convergent. Given suitable prior information, our method can achieve more accurate inferences than existing methods. We additionally explore a sampling algorithm for estimating credible regions. by Charles Frogner. Ph. D.
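The entropic regularization of optimal transport mentioned in the abstract above is commonly solved with Sinkhorn scaling iterations. The sketch below shows that standard scheme on a tiny discrete problem; it is an assumption that this resembles the thesis's learning algorithm, and the function name `sinkhorn_plan`, the regularization strength, and the example histograms are hypothetical.

```python
import math

def sinkhorn_plan(a, b, cost, reg=0.5, iters=500):
    """Approximate entropic-regularized transport plan between histograms a and b.

    Alternately rescales rows and columns of the Gibbs kernel exp(-cost / reg)
    until the plan's marginals match a and b.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical two-bin histograms with a 0/1 ground cost.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_plan(a, b, cost)

# Regularized transport cost: total mass moved times the ground cost.
transport_cost = sum(plan[i][j] * cost[i][j] for i in range(2) for j in range(2))
```

Smaller values of `reg` bring the plan closer to the unregularized optimum but slow convergence and can underflow the kernel, which is the usual trade-off with this scheme.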