170 research outputs found

    Sublinear algorithms for Earth Mover's Distance

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.Includes bibliographical references (p. 14-15).We study the problem of estimating the Earth Mover's Distance (EMD) between probability distributions when given access only to samples. We give closeness testers and additive-error estimators over domains in [0, [delta]]d, with sample complexities independent of domain size - permitting the testability even of continuous distributions over infinite domains. Instead, our algorithms depend on other parameters, such as the diameter of the domain space, which may be significantly smaller. We also prove lower bounds showing our testers to be optimal in their dependence on these parameters. Additionally, we consider whether natural classes of distributions exist for which there are algorithms with better dependence on the dimension, and show that for highly clusterable data, this is indeed the case. Lastly, we consider a variant of the EMD, defined over tree metrics instead of the usual L₁ metric, and give optimal algorithms.by Khanh Do Ba.S.M

    Average Sensitivity of Graph Algorithms

    Full text link
    In modern applications of graphs algorithms, where the graphs of interest are large and dynamic, it is unrealistic to assume that an input representation contains the full information of a graph being studied. Hence, it is desirable to use algorithms that, even when only a (large) subgraph is available, output solutions that are close to the solutions output when the whole graph is available. We formalize this idea by introducing the notion of average sensitivity of graph algorithms, which is the average earth mover's distance between the output distributions of an algorithm on a graph and its subgraph obtained by removing an edge, where the average is over the edges removed and the distance between two outputs is the Hamming distance. In this work, we initiate a systematic study of average sensitivity. After deriving basic properties of average sensitivity such as composition, we provide efficient approximation algorithms with low average sensitivities for concrete graph problems, including the minimum spanning forest problem, the global minimum cut problem, the minimum ss-tt cut problem, and the maximum matching problem. In addition, we prove that the average sensitivity of our global minimum cut algorithm is almost optimal, by showing a nearly matching lower bound. We also show that every algorithm for the 2-coloring problem has average sensitivity linear in the number of vertices. One of the main ideas involved in designing our algorithms with low average sensitivity is the following fact; if the presence of a vertex or an edge in the solution output by an algorithm can be decided locally, then the algorithm has a low average sensitivity, allowing us to reuse the analyses of known sublinear-time algorithms and local computation algorithms (LCAs). Using this connection, we show that every LCA for 2-coloring has linear query complexity, thereby answering an open question.Comment: 39 pages, 1 figur

    Automated pattern analysis in gesture research : similarity measuring in 3D motion capture models of communicative action

    Get PDF
    The question of how to model similarity between gestures plays an important role in current studies in the domain of human communication. Most research into recurrent patterns in co-verbal gestures – manual communicative movements emerging spontaneously during conversation – is driven by qualitative analyses relying on observational comparisons between gestures. Due to the fact that these kinds of gestures are not bound to well-formedness conditions, however, we propose a quantitative approach consisting of a distance-based similarity model for gestures recorded and represented in motion capture data streams. To this end, we model gestures by flexible feature representations, namely gesture signatures, which are then compared via signature-based distance functions such as the Earth Mover's Distance and the Signature Quadratic Form Distance. Experiments on real conversational motion capture data evidence the appropriateness of the proposed approaches in terms of their accuracy and efficiency. Our contribution to gesture similarity research and gesture data analysis allows for new quantitative methods of identifying patterns of gestural movements in human face-to-face interaction, i.e., in complex multimodal data sets

    Automated pattern analysis in gesture research : similarity measuring in 3D motion capture models of communicative action

    Get PDF
    The question of how to model similarity between gestures plays an important role in current studies in the domain of human communication. Most research into recurrent patterns in co-verbal gestures – manual communicative movements emerging spontaneously during conversation – is driven by qualitative analyses relying on observational comparisons between gestures. Due to the fact that these kinds of gestures are not bound to well-formedness conditions, however, we propose a quantitative approach consisting of a distance-based similarity model for gestures recorded and represented in motion capture data streams. To this end, we model gestures by flexible feature representations, namely gesture signatures, which are then compared via signature-based distance functions such as the Earth Mover's Distance and the Signature Quadratic Form Distance. Experiments on real conversational motion capture data evidence the appropriateness of the proposed approaches in terms of their accuracy and efficiency. Our contribution to gesture similarity research and gesture data analysis allows for new quantitative methods of identifying patterns of gestural movements in human face-to-face interaction, i.e., in complex multimodal data sets

    Understanding mass transfer directions via data-driven models with application to mobile phone data

    Get PDF
    The aim of this paper is to solve an inverse problem which regards a mass moving in a bounded domain. We assume that the mass moves following an unknown velocity field and that the evolution of the mass density can be described by a partial differential equation, which is also unknown. The input data of the problems are given by some snapshots of the mass distribution at certain times, while the sought output is the velocity field that drives the mass along its displacement. To this aim, we put in place an algorithm based on the combination of two methods: first, we use the dynamic mode decomposition to create a mathematical model describing the mass transfer; second, we use the notion of Wasserstein distance (also known as earth mover's distance) to reconstruct the underlying velocity field that is responsible for the displacement. Finally, we consider a real-life application: the algorithm is employed to study the travel flows of people in large populated areas using, as input data, density profiles (i.e., the spatial distribution) of people in given areas at different time instants. These kinds of data are provided by the Italian telecommunication company TIM and are derived by mobile phone usage

    Model-based compressive sensing with Earth Mover's Distance constraints

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2013.Cataloged from PDF version of thesis.Includes bibliographical references (p. 71-72).In compressive sensing, we want to recover ... from linear measurements of the form ... describes the measurement process. Standard results in compressive sensing show that it is possible to exactly recover the signal x from only m ... measurements for certain types of matrices. Model-based compressive sensing reduces the number of measurements even further by limiting the supports of x to a subset of the ... possible supports. Such a family of supports is called a structured sparsity model. In this thesis, we introduce a structured sparsity model for two-dimensional signals that have similar support in neighboring columns. We quantify the change in support between neighboring columns with the Earth Mover's Distance (EMD), which measures both how many elements of the support change and how far the supported elements move. We prove that for a reasonable limit on the EMD between adjacent columns, we can recover signals in our model from only ... measurements, where w is the width of the signal. This is an asymptotic improvement over the ... bound in standard compressive sensing. While developing the algorithmic tools for our proposed structured sparsity model, we also extend the model-based compressed sensing framework. In order to use a structured sparsity model in compressive sensing, we need a model projection algorithm that, given an arbitrary signal x, returns the best approximation in the model. We relax this constraint and develop a variant of IHT, an existing sparse recovery algorithm, that works with approximate model projection algorithms.by Ludwig Schmidt.S.M
    corecore