25 research outputs found

    Learning and inference with Wasserstein metrics

    Thesis: Ph.D., Massachusetts Institute of Technology, Department of Brain and Cognitive Sciences, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 131-143). This thesis develops new approaches for three problems in machine learning, using tools from the study of optimal transport (or Wasserstein) distances between probability distributions. Optimal transport distances capture an intuitive notion of similarity between distributions, by incorporating the underlying geometry of the domain of the distributions. Despite their intuitive appeal, optimal transport distances are often difficult to apply in practice, as computing them requires solving a costly optimization problem. In each setting studied here, we describe a numerical method that overcomes this computational bottleneck and enables scaling to real data. In the first part, we consider the problem of multi-output learning in the presence of a metric on the output domain. We develop a loss function that measures the Wasserstein distance between the prediction and ground truth, and describe an efficient learning algorithm based on entropic regularization of the optimal transport problem. We additionally propose a novel extension of the Wasserstein distance from probability measures to unnormalized measures, which is applicable in settings where the ground truth is not naturally expressed as a probability distribution. We show statistical learning bounds for both the Wasserstein loss and its unnormalized counterpart. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data image tagging problem, outperforming a baseline that doesn't use the metric. In the second part, we consider the probabilistic inference problem for diffusion processes. Such processes model a variety of stochastic phenomena and appear often in continuous-time state space models.
Exact inference for diffusion processes is generally intractable. In this work, we describe a novel approximate inference method, which is based on a characterization of the diffusion as following a gradient flow in a space of probability densities endowed with a Wasserstein metric. Existing methods for computing this Wasserstein gradient flow rely on discretizing the underlying domain of the diffusion, prohibiting their application to problems in more than several dimensions. In the current work, we propose a novel algorithm for computing a Wasserstein gradient flow that operates directly in a space of continuous functions, free of any underlying mesh. We apply our approximate gradient flow to the problem of filtering a diffusion, showing superior performance where standard filters struggle. Finally, we study the ecological inference problem, which is that of reasoning from aggregate measurements of a population to inferences about the individual behaviors of its members. This problem arises often when dealing with data from economics and political sciences, such as when attempting to infer the demographic breakdown of votes for each political party, given only the aggregate demographic and vote counts separately. Ecological inference is generally ill-posed, and requires prior information to distinguish a unique solution. We propose a novel, general framework for ecological inference that allows for a variety of priors and enables efficient computation of the most probable solution. Unlike previous methods, which rely on Monte Carlo estimates of the posterior, our inference procedure uses an efficient fixed point iteration that is linearly convergent. Given suitable prior information, our method can achieve more accurate inferences than existing methods. We additionally explore a sampling algorithm for estimating credible regions. By Charles Frogner. Ph.D.
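
The entropic regularization mentioned in the abstract above can be illustrated with a generic Sinkhorn iteration. This is a minimal sketch, not the thesis's learning algorithm: the function name `sinkhorn_distance`, the parameters `reg` and `n_iters`, and the toy histograms are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_distance(a, b, cost, reg=0.1, n_iters=500):
    """Approximate the optimal transport cost between histograms a and b.

    a, b  : non-negative vectors that each sum to 1
    cost  : matrix of ground distances, cost[i, j] between bin i and bin j
    reg   : strength of the entropic regularization (assumed value)
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cost = np.asarray(cost, dtype=float)
    kernel = np.exp(-cost / reg)          # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)            # rescale to match column marginal b
        u = a / (kernel @ v)              # rescale to match row marginal a
    plan = u[:, None] * kernel * v[None, :]
    return float(np.sum(plan * cost)), plan

# Toy example: three bins on a line, ground metric |i - j|.
bins = np.arange(3)
ground = np.abs(bins[:, None] - bins[None, :]).astype(float)
dist, plan = sinkhorn_distance([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], ground)
```

Smaller values of `reg` bring the result closer to the unregularized transport cost but make the kernel entries underflow sooner, so practical implementations often work in the log domain.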

    Vision based localization of mobile robots

    Mobile robotics is an active and exciting sub-field of Computer Science. Its importance is easily witnessed in a variety of undertakings from DARPA's Grand Challenge to NASA's Mars exploration program. The field is relatively young, and many challenges still face roboticists across the board. One important area of research is localization, which concerns itself with granting a robot the ability to discover and continually update an internal representation of its position. Vision based sensor systems have been investigated [8,22,27], but to a much lesser extent than other popular techniques [4,6,7,9,10]. A custom mobile platform has been constructed on top of which a monocular vision based localization system has been implemented. The rigorous gathering of empirical data across a large group of parameters germane to the problem has led to various findings about monocular vision based localization and the fitness of the custom robot platform. The localization component is based on a probabilistic technique called Monte-Carlo Localization (MCL) that tolerates a variety of different sensors and effectors, and has further proven to be adept at localization in diverse circumstances. Both a motion model and sensor model that drive the particle filter at the algorithm's core have been carefully derived. The sensor model employs a simple correlation process that leverages color histograms and edge detection to filter robot pose estimations via the on board vision. This algorithm relies on image matching to tune position estimates based on a priori knowledge of its environment in the form of a feature library. It is believed that leveraging different computationally inexpensive features can lead to efficient and robust localization with MCL. The central goal of this thesis is to implement and arrive at such a conclusion through the gathering of empirical data.
Section 1 presents a brief introduction to mobile robot localization and robot architectures, while section 2 covers MCL itself in more depth. Section 3 elaborates on the localization strategy, modeling and implementation that form the basis of the trials presented toward the end of that section. Section 4 presents a revised implementation that attempts to address shortcomings identified during localization trials. Finally, in section 5, conclusions are drawn about the effectiveness of the localization implementation and a path to improved localization with monocular vision is posited.
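
The particle filter at the core of MCL, as described in the abstract above, alternates a motion update, a sensor update and a resampling step. The following is a minimal one-dimensional sketch of that loop. It is not the thesis's implementation: the Gaussian range measurement stands in for the vision-based sensor model built on color histograms and edge detection, and every name and noise value here (`mcl_step`, `run_demo`, `motion_noise`, `sensor_noise`, the landmark position) is an assumption made for illustration.

```python
import numpy as np

def mcl_step(particles, control, measurement, landmark, rng,
             motion_noise=0.1, sensor_noise=0.5):
    """One Monte Carlo Localization update on a 1-D corridor."""
    # Motion model: move every particle by the commanded step plus noise.
    particles = particles + control + rng.normal(0.0, motion_noise, size=particles.shape)
    # Sensor model: weight particles by how well they explain the measured
    # distance to a landmark at a known position (a stand-in for image matching).
    expected = landmark - particles
    weights = np.exp(-0.5 * ((measurement - expected) / sensor_noise) ** 2)
    weights += 1e-300                      # guard against all-zero weights
    weights /= weights.sum()
    # Resampling: draw a new particle set in proportion to the weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

def run_demo(seed=0, n_particles=2000, n_steps=20):
    """Track a robot that starts at 2.0 and moves 0.2 per step."""
    rng = np.random.default_rng(seed)
    landmark, true_pos, step = 10.0, 2.0, 0.2
    particles = rng.uniform(0.0, 10.0, size=n_particles)  # unknown start pose
    for _ in range(n_steps):
        true_pos += step
        measurement = landmark - true_pos + rng.normal(0.0, 0.5)
        particles = mcl_step(particles, step, measurement, landmark, rng)
    return true_pos, float(particles.mean())

true_pos, estimate = run_demo()
```

Because the particles start spread uniformly over the corridor, the sketch covers global localization as well as tracking: the first few sensor updates concentrate the particle set around the true pose.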

    Automation and robotics human performance

    The scope of this report is limited to the following: (1) assessing the feasibility of the assumptions for crew productivity during the intra-vehicular activities and extra-vehicular activities; (2) estimating the appropriate level of automation and robotics to accomplish balanced man-machine, cost-effective operations in space; (3) identifying areas where conceptually different approaches to the use of people and machines can leverage the benefits of the scenarios; and (4) recommending modifications to scenarios or developing new scenarios that will improve the expected benefits. The FY89 special assessments are grouped into the five categories shown in the report. The high level system analyses for Automation & Robotics (A&R) and Human Performance (HP) were performed under the Case Studies Technology Assessment category, whereas the detailed analyses for the critical systems and high leverage development areas were performed under the appropriate operations categories (In-Space Vehicle Operations or Planetary Surface Operations). The analysis activities planned for the Science Operations technology areas were deferred to FY90 studies. The remaining activities such as analytic tool development, graphics/video demonstrations and intelligent communicating systems software architecture were performed under the Simulation & Validations category

    Morphologie, Géométrie et Statistiques en imagerie non-standard

    Digital image processing has followed the evolution of electronics and computer science. It is now common to deal with images valued not in {0,1} or in gray-scale, but in manifolds or probability distributions. This is for instance the case for color images or in diffusion tensor imaging (DTI). Each kind of image has its own algebraic, topological and geometric properties. Thus, existing image processing techniques have to be adapted when applied to new imaging modalities. When dealing with new kinds of value spaces, former operators can rarely be used as they are. Even if the underlying notion still has a meaning, work must be carried out in order to express it in the new context. The thesis is composed of two independent parts. The first one, "Mathematical morphology on non-standard images", concerns the extension of mathematical morphology to specific cases where the value space of the image does not have a canonical order structure. Chapter 2 formalizes and demonstrates the irregularity issue of total orders in metric spaces. The main result states that for any total order in a multidimensional vector space, there are images for which the morphological dilations and erosions are irregular and inconsistent. Chapter 3 is an attempt to generalize morphology to images valued in a set of unordered labels. The second part, "Probability density estimation on Riemannian spaces", concerns the adaptation of standard density estimation techniques to specific Riemannian manifolds. Chapter 5 is a work on color image histograms under perceptual metrics. The main idea of this chapter consists in computing histograms using local Euclidean approximations of the perceptual metric, and not a global Euclidean approximation as in standard perceptual color spaces. Chapter 6 addresses the problem of non-parametric density estimation when data lie in spaces of Gaussian laws.
Several techniques are studied; the main result is an expression of kernels for the Wasserstein metric.
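
For context on density estimation over spaces of Gaussian laws, the 2-Wasserstein distance between two Gaussians has a well-known closed form. It is reproduced here only as background and is not the kernel expression derived in the thesis; the symbols $m_i$ and $\Sigma_i$ (means and covariance matrices) are notation assumed for this note.

```latex
W_2^2\bigl(\mathcal{N}(m_1,\Sigma_1),\,\mathcal{N}(m_2,\Sigma_2)\bigr)
  = \lVert m_1 - m_2 \rVert_2^2
  + \operatorname{tr}\!\Bigl(\Sigma_1 + \Sigma_2
  - 2\bigl(\Sigma_2^{1/2}\,\Sigma_1\,\Sigma_2^{1/2}\bigr)^{1/2}\Bigr)
```

When the covariance matrices commute, the trace term reduces to $\lVert \Sigma_1^{1/2} - \Sigma_2^{1/2} \rVert_F^2$, so in one dimension the squared distance is $(m_1 - m_2)^2 + (\sigma_1 - \sigma_2)^2$.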

    Dissimilarity-based learning for complex data

    Mokbel B. Dissimilarity-based learning for complex data. Bielefeld: Universität Bielefeld; 2016. Rapid advances in information technology have produced an ever increasing amount of digital data, which raises the demand for powerful data mining and machine learning tools. Due to modern methods for gathering, preprocessing, and storing information, the collected data become more and more complex: a simple vectorial representation, and comparison in terms of the Euclidean distance, is often no longer appropriate to capture relevant aspects in the data. Instead, problem-adapted similarity or dissimilarity measures refer directly to the given encoding scheme, allowing information constituents to be treated in a relational manner. This thesis addresses several challenges of complex data sets and their representation in the context of machine learning. The goal is to investigate possible remedies, and propose corresponding improvements of established methods, accompanied by examples from various application domains. The main scientific contributions are the following: (I) Many well-established machine learning techniques are restricted to vectorial input data only. Therefore, we propose the extension of two popular prototype-based clustering and classification algorithms to non-negative symmetric dissimilarity matrices. (II) Some dissimilarity measures incorporate a fine-grained parameterization, which allows the comparison scheme to be configured with respect to the given data and the problem at hand. However, finding adequate parameters can be hard or even impossible for human users, due to the intricate effects of parameter changes and the lack of detailed prior knowledge. Therefore, we propose to integrate a metric learning scheme into a dissimilarity-based classifier, which can automatically adapt the parameters of a sequence alignment measure according to the given classification task.
(III) Dimensionality reduction techniques are a valuable instrument for making complex data sets accessible: they can provide an approximate low-dimensional embedding of the given data set, and, as a special case, a planar map to visualize the data's neighborhood structure. To assess the reliability of such an embedding, we propose the extension of a well-known quality measure to enable a fine-grained, tractable quantitative analysis, which can be integrated into a visualization. This tool can also help to compare different dissimilarity measures (and parameter settings) if ground truth is not available. (IV) All techniques are demonstrated on real-world examples from a variety of application domains, including bioinformatics, motion capturing, music, and education.

    Connected Attribute Filtering Based on Contour Smoothness


    On the automatic detection of otolith features for fish species identification and their age estimation

    This thesis deals with the automatic detection of features in signals, either extracted from photographs or captured by means of electronic sensors, and its possible application in the detection of morphological structures in fish otoliths so as to identify species and estimate their age at death. From a more biological perspective, otoliths, which are calcified structures located in the auditory system of all teleostean fish, constitute one of the main elements employed in the study and management of marine ecology. In this sense, the application of Fourier descriptors to otolith images, combined with component analysis, is habitually a first and key step towards characterizing their morphology and identifying fish species. However, some of the main limitations arise from the poor interpretation that can be obtained with this representation and from the use that is made of the coefficients, as they are generally selected manually for classification purposes, both in quantity and representativity. The automatic detection of irregularities in signals, and their interpretation, was first addressed in the so-called Best-Basis paradigm. In this sense, Saito's Local Discriminant Bases algorithm (LDB) uses the Discrete Wavelet Packet Transform (DWPT) as the main descriptive tool for positioning the irregularities in the time-frequency space, and an energy-based discriminant measure to guide the automatic search of relevant features in this domain. Current density-based proposals have tried to overcome the limitations of the energy-based functions with relatively little success. However, other measure strategies more consistent with the true classification capability, and which can provide generalization while reducing the dimensionality of features, are yet to be developed. The proposal of this work focuses on a new framework for one-dimensional signals.
An important conclusion drawn from it is that such generalization requires a measure system of bounded values representing the density where no classes overlap. This severely constrains the selection of features and the vector size needed for proper class identification, which must be based not only on global discriminant values but also on complementary information about the arrangement of samples in the domain. The new tools have been used in the biological study of different hake species, yielding good classification results. However, a major contribution lies in the further interpretation of features that the tool performs, including the structure of irregularities, time-frequency position, extension support and degree of importance, which are highlighted automatically on the same images or signals. As for aging applications, a new demodulation strategy for compensating the nonlinear growth effect on the intensity profile has been developed. Although the method is, in principle, able to adapt automatically to the specific growth of individual specimens, preliminary results with LDB-based techniques suggest studying the effect of lighting conditions on the otoliths in order to design more reliable techniques for reducing image contrast variation. In the meantime, a new theoretical framework for otolith-based fish age estimation has been presented. This theory suggests that if the true fish growth curve is known, the regular periodicity of age structures in the demodulated profile is related to the radial length the original intensity profile is extracted from. Therefore, if this periodicity can be measured, it is possible to infer the exact fish age without feature extractors or classifiers.
This could have important implications for the use of computational resources and for current aging approaches. Postprint (published version)