40 research outputs found

    Bayesian Nonparametric Inference of Switching Linear Dynamical Systems

    Many complex dynamical phenomena can be effectively modeled by a system that switches among a set of conditionally linear dynamical modes. We consider two such models: the switching linear dynamical system (SLDS) and the switching vector autoregressive (VAR) process. Our Bayesian nonparametric approach utilizes a hierarchical Dirichlet process prior to learn an unknown number of persistent, smooth dynamical modes. We additionally employ automatic relevance determination to infer a sparse set of dynamic dependencies, allowing us to learn SLDS with varying state dimension or switching VAR processes with varying autoregressive order. We develop a sampling algorithm that combines a truncated approximation to the Dirichlet process with efficient joint sampling of the mode and state sequences. The utility and flexibility of our model are demonstrated on synthetic data, sequences of dancing honey bees, the IBOVESPA stock index, and a maneuvering target tracking application. (50 pages, 7 figures.)
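A minimal sketch of the truncated Dirichlet-process approximation and the switching-VAR structure mentioned above; this is not the paper's sampler, the truncation level, concentration parameter and mode dynamics are hypothetical, and mode persistence (the "sticky" behaviour) is deliberately omitted.

```python
# Toy illustration: stick-breaking weights from a truncated Dirichlet process,
# then a switching VAR(1) whose mode is drawn i.i.d. from those weights.
import numpy as np

rng = np.random.default_rng(0)

def truncated_stick_breaking(alpha, L):
    """Sample mixture weights from a DP truncated to L components."""
    betas = rng.beta(1.0, alpha, size=L)
    betas[-1] = 1.0                      # absorb remaining mass at the truncation
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

L, alpha, T, d = 5, 2.0, 200, 2
weights = truncated_stick_breaking(alpha, L)

# One stable VAR(1) coefficient matrix per dynamical mode (scaled orthogonal).
A = [0.9 * np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(L)]

z = rng.choice(L, size=T, p=weights)     # mode sequence (no persistence here)
x = np.zeros((T, d))
for t in range(1, T):
    x[t] = A[z[t]] @ x[t - 1] + 0.1 * rng.standard_normal(d)

print(weights.round(3), np.bincount(z, minlength=L))
```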

    Online, offline and transfer learning for decision-making

    This thesis investigates online, offline and transfer learning for decision-making tasks using a combination of behavioural experiments, computational modelling and Electroencephalography (EEG). Our experiments used a new set of decision-making tasks, in which the appropriate response depended on the linear or nonlinear combination of multiple stimulus features, and were developed to have better ecological validity than many previous tasks in the literature. The first study, in chapter 2, outlines the contextual settings in which representations of the environment can be learnt online. We manipulated the temporal structure of trials and the nature of stimulus-response mappings, and showed their effects on performance and declarative learning. We fitted a Latent Cause Model (LCM) of participants' behaviour and derived measures that we used to gain insight into the representations formed. In chapter 3 we used EEG to identify the multiple successive stages of representation learning preceding decisions and following feedback. We used a Computational-EEG approach in which subject-specific LCM variables were used to predict a subject's EEG data, and found evidence of feature representation in sensory regions and more complex representations in frontal regions. In chapter 4 we shifted the focus to offline learning, by examining the effect of a period of quiet wakefulness on performance in the same task. We found that quiet wakefulness significantly improved the generalization of previously learnt associations. Finally, in the last study, in chapter 5, we investigated how knowledge acquired in one task can be transferred to another. We borrowed the concept of shared subspaces from the multitask learning literature and showed that this provides a useful framework for the study of human transfer learning. Taken as a whole, the thesis shows how humans form representations online and offline, and how extracted knowledge can be transferred to new tasks.
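A minimal illustration of the model-based (Computational-EEG) regression idea mentioned above: trial-wise variables derived from a fitted model are used as regressors for the EEG signal at each channel and time point. The regressor names, dimensions and random data are hypothetical placeholders, not the thesis analysis pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_channels, n_times = 200, 32, 100

# Hypothetical trial-wise variables derived from a fitted latent-cause model.
model_vars = rng.standard_normal((n_trials, 3))
design = np.column_stack([np.ones(n_trials), model_vars])   # intercept + regressors

eeg = rng.standard_normal((n_trials, n_channels, n_times))  # stand-in EEG epochs

# One least-squares fit per channel/time point: betas[k] gives the
# time-resolved contribution of regressor k to the EEG at each channel.
betas, *_ = np.linalg.lstsq(design, eeg.reshape(n_trials, -1), rcond=None)
betas = betas.reshape(design.shape[1], n_channels, n_times)
print(betas.shape)   # (4, 32, 100)
```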

    Modeling, Clustering, and Segmenting Video with Mixtures of Dynamic Textures

    Sparse Representations and Feature Learning for Image Set Classification and Correspondence Estimation

    The use of effective features is a key component in solving many computer vision tasks including, but not limited to, image (set) classification and correspondence estimation. Many research directions have focused on finding good features for the task under consideration, traditionally by hand-crafting and recently by machine learning. In our work, we present algorithms for feature extraction and sparse representation for the classification of image sets. In addition, we present an approach for deep metric learning for correspondence estimation.

    We start by benchmarking various image set classification methods on a mobile video dataset that we have collected and made public. The videos were acquired under three different ambient conditions to capture the type of variations caused by the 'mobility' of the devices. An inspection of these videos reveals a combination of favorable and challenging properties unique to smartphone face videos. Besides mobility, the dataset has other challenges including partial faces, occasional pose changes, blur and fiducial point localization errors. Based on the evaluation, the recognition rates drop dramatically when enrollment and test videos come from different sessions.

    We then present Bayesian Representation-based Classification (BRC), an approach based on sparse Bayesian regression and subspace clustering for image set classification. A Bayesian statistical framework is used to compare BRC with similar existing approaches such as Collaborative Representation-based Classification (CRC) and Sparse Representation-based Classification (SRC), where it is shown that BRC employs precision hyperpriors that are more non-informative than those of CRC/SRC. Furthermore, we present a robust probe image set handling strategy that balances the trade-off between efficiency and accuracy. Experiments on three datasets illustrate the effectiveness of our algorithm compared to state-of-the-art set-based methods.

    We then propose to represent image sets as dictionaries of hand-crafted descriptors based on Symmetric Positive Definite (SPD) matrices that are more robust to local deformations and fiducial point location errors. We then learn a tangent map for transforming the SPD matrix logarithms into a lower-dimensional Log-Euclidean space (sketched below) such that the transformed gallery atoms adhere to a more discriminative subspace structure. A query image set is then classified by first mapping its SPD descriptors into the computed Log-Euclidean tangent space and then using the sparse representation over the tangent space to decide a label for the image set. Experiments on four public datasets show that representation-based classification based on the proposed features outperforms many state-of-the-art methods.

    We then present Nonlinear Subspace Feature Enhancement (NSFE), an approach for nonlinearly embedding image sets into a space where they adhere to a more discriminative subspace structure. We describe how the structured loss function of NSFE can be optimized in a batch-by-batch fashion by a two-step alternating algorithm. The algorithm makes very few assumptions about the form of the embedding to be learned and is compatible with stochastic gradient descent and back-propagation. We evaluate NSFE with different types of input features and nonlinear embeddings and show that NSFE compares favorably to state-of-the-art image set classification methods.
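A minimal sketch of the Log-Euclidean mapping of SPD descriptors referred to above, assuming covariance-type descriptors and SciPy's matrix logarithm; the descriptor dimensions are hypothetical, the usual sqrt(2) off-diagonal scaling of the half-vectorization is omitted for brevity, and the learned tangent map itself is not shown.

```python
import numpy as np
from scipy.linalg import logm

def spd_descriptor(features, eps=1e-3):
    """Covariance descriptor of local features (rows = observations)."""
    cov = np.cov(features, rowvar=False)
    return cov + eps * np.eye(cov.shape[0])   # regularize to keep it SPD

def log_euclidean_vector(spd):
    """Matrix log followed by half-vectorization of the symmetric result."""
    log_spd = logm(spd).real
    iu = np.triu_indices_from(log_spd)
    return log_spd[iu]

rng = np.random.default_rng(0)
local_features = rng.standard_normal((500, 8))   # e.g. 8-dim local descriptors
vec = log_euclidean_vector(spd_descriptor(local_features))
print(vec.shape)   # (36,) = 8 * 9 / 2 entries in the tangent-space vector
```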
Finally, we propose a hierarchical approach for deep metric learning and descriptor matching for the task of point correspondence estimation. Our idea is motivated by the observation that existing metric learning approaches, which supervise and match with only the deepest layer, produce features that are in some respects inferior to shallower features. Instead, the best matching performance, as we empirically show, is obtained by combining the high invariance of deeper features with the geometric sensitivity and higher precision of shallower features. We compare our method to state-of-the-art networks as well as fusion baselines inspired by existing semantic segmentation networks, and empirically show that our method is more accurate and better suited to correspondence estimation.
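A hedged sketch of the deep/shallow fusion idea described above: a coarse, invariant deep feature map is upsampled to the resolution of a precise shallow one and the L2-normalized maps are concatenated before descriptor matching. The layer shapes are hypothetical; this is not the thesis architecture.

```python
import torch
import torch.nn.functional as F

shallow = torch.randn(1, 64, 120, 160)    # early-layer features: fine geometric detail
deep = torch.randn(1, 256, 15, 20)        # late-layer features: high invariance

deep_up = F.interpolate(deep, size=shallow.shape[-2:], mode="bilinear",
                        align_corners=False)
fused = torch.cat([F.normalize(shallow, dim=1),
                   F.normalize(deep_up, dim=1)], dim=1)
print(fused.shape)   # torch.Size([1, 320, 120, 160])
```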

    Robust spatio-temporal latent variable models

    Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are widely used mathematical models for decomposing multivariate data. They capture spatial relationships between variables, but ignore any temporal relationships that might exist between observations. Probabilistic PCA (PPCA) and Probabilistic CCA (ProbCCA) are versions of these two models that explain the statistical properties of the observed variables as linear mixtures of an alternative, hypothetical set of hidden, or latent, variables and explicitly model noise. Both the noise and the latent variables are assumed to be Gaussian distributed. This thesis introduces two new models, named PPCA-AR and ProbCCA-AR, that augment PPCA and ProbCCA respectively with autoregressive processes over the latent variables to additionally capture temporal relationships between the observations. To make PPCA-AR and ProbCCA-AR robust to outliers and able to model leptokurtic data, the Gaussian assumptions are replaced with infinite scale mixtures of Gaussians, using the Student-t distribution. Bayesian inference calculates posterior probability distributions for each of the parameter variables, from which we obtain a measure of confidence in the inference. It avoids the pitfalls associated with the maximum likelihood method: integrating over all possible values of the parameter variables guards against overfitting. For these new models the integrals required for exact Bayesian inference are intractable; instead a method of approximation, the variational Bayesian approach, is used. This enables the use of automatic relevance determination to estimate the model orders. PPCA-AR and ProbCCA-AR can be viewed as linear dynamical systems, so the forward-backward algorithm, the inference routine at the core of the Baum-Welch procedure, is used as an efficient method for inferring the posterior distributions of the latent variables. The exact algorithm is tractable because Gaussian assumptions are made regarding the distribution of the latent variables. This thesis introduces a variational Bayesian forward-backward algorithm based on Student-t assumptions. The new models are demonstrated on synthetic datasets and on real remote sensing and EEG data.
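An illustrative generative sketch of the PPCA-AR structure described above: latent variables follow a first-order autoregressive process and the observations are a linear, PPCA-style mixture of them plus noise. The dimensions, AR coefficient and noise scales are hypothetical; inference (variational Bayes, forward-backward) is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)
T, q, d = 300, 2, 10            # time steps, latent dimension, observed dimension

A = 0.95 * np.eye(q)            # AR(1) dynamics on the latent variables
W = rng.standard_normal((d, q)) # PPCA-style loading (mixing) matrix

z = np.zeros((T, q))
x = np.zeros((T, d))
for t in range(1, T):
    z[t] = A @ z[t - 1] + 0.3 * rng.standard_normal(q)   # latent AR process
    x[t] = W @ z[t] + 0.5 * rng.standard_normal(d)       # noisy linear observation

# Swapping the Gaussian innovations for Student-t draws (rng.standard_t)
# gives the heavy-tailed, outlier-robust variant discussed in the abstract.
print(x.shape)
```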

    Separation Principles in Independent Process Analysis

    Bayesian inference by active sampling

    Digging out the debris of the Milky Way past accretion events with Machine Learning

    Our privileged position in the Universe makes the Milky Way the perfect laboratory for understanding the physical mechanisms that lead to the formation of its different structures. In recent decades, these studies have been boosted by improvements in data quality, thanks to projects such as the Sloan Digital Sky Survey, which makes it possible to measure spectroscopic redshifts for a large number of stars and to take multispectral images, or missions such as Gaia, which provides a catalogue of astronomical data of unprecedented precision. Equally relevant is the growth in computational power, which has driven the development of cosmological simulations. In addition, the standard cosmological paradigm, Lambda cold dark matter (ΛCDM), indicates that smaller galaxies are the first to form and that larger galaxies, such as the Milky Way, are the result of accretion and merger processes involving smaller galaxies, together with gas accretion. These accretion and merger processes leave marks that are observable today and are expected to be found, essentially, in the space of the integrals of motion of the halo stars, in the form of clumps. However, several processes, such as dynamical friction or the growth of the Milky Way's mass over time, mean that these quantities are not fully conserved. The aim of the present work is to unravel the history of the Milky Way's stellar halo by identifying these clumps in phase space, using unsupervised Machine Learning techniques. Specifically, a Gaussian Mixture model has been used, after verifying that, among the methods considered, it is the one that leads to the best identification of the different over-densities as independent groups. This model is based on the probability that a given point belongs to a multi-dimensional Gaussian distribution and yields the characteristic parameters of each component (weights, means and covariance matrices), which are initialized using the Machine Learning method known as K-Means. In particular, the Bayesian Gaussian Mixture method is used, which employs Bayes' rule to find the appropriate number of clumps given an upper limit on the number of components it can determine. In turn, this model assigns prior probabilities to each of the components by means of the so-called Dirichlet process. The optimal model is then found via the expectation-maximization algorithm. Quantities such as the Bayesian Information Criterion (BIC) or the log-likelihood are used to compare the different models. The first step in developing this method was to become familiar with the technique using controlled data sets, specifically data generated from Gaussians whose parameters are known. In this way, the effect of varying the different input parameters required by the method, as well as its limitations, can be appreciated. This allowed us to conclude that it is indeed possible to recover the points generated by the different Gaussians as independent clumps using the Bayesian Gaussian Mixture model.
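The controlled test just described can be reproduced schematically with scikit-learn's BayesianGaussianMixture, which combines a Dirichlet-process prior on the component weights, K-Means initialization and expectation-maximization under an upper limit on the number of components. The use of this particular library, and the synthetic Gaussian parameters below, are assumptions for illustration only.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Controlled data set: three known Gaussians in a 2-D "integrals of motion"-like space.
X = np.vstack([rng.multivariate_normal(m, 0.05 * np.eye(2), 300)
               for m in ([-1, 0], [1, 0], [0, 1.5])])

bgm = BayesianGaussianMixture(
    n_components=10,                                # upper limit on the number of clumps
    weight_concentration_prior_type="dirichlet_process",
    init_params="kmeans",                           # K-Means initialization
    max_iter=500, random_state=0)
labels = bgm.fit_predict(X)

# Components with non-negligible weight correspond to recovered clumps;
# bgm.score(X) gives the average log-likelihood for comparing models.
print(np.round(bgm.weights_, 3))
```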
The next step was to apply these methods to haloes simulated within the ΛCDM paradigm by the Auriga collaboration, corresponding to high-resolution magneto-hydrodynamical simulations of Milky Way analogues. In this case, the star particles carry a label indicating their origin (for instance a small accreted galaxy, which we refer to as their progenitor), so that the results obtained with the Bayesian Gaussian Mixture models can be compared with what would be expected. Next, in order to become familiar with the simulation data, we began with a visual inspection of the space formed by the total energy and the angular momentum about the axis perpendicular to the disc plane for the particles belonging to the stellar halo, quantities widely used to study accretion/merger processes in the Milky Way, for different ranges of radius around the Galactic centre and different ranges of total metallicity. We then visualized different spaces of the quantities in which over-densities associated with each progenitor are expected, that is, with each satellite galaxy to which the stars belonged before the accretion processes took place. This demonstrated the intrinsic difficulty of the task at hand, owing to the overlap between satellite galaxies in the spaces considered and to the fact that a progenitor is not associated with a single over-density. The latter also means that it is not possible to recover each progenitor as a single Gaussian, although this effect is more important for the more massive progenitors. Subsequently, we selected the set of stars composed of the 4 most massive progenitors in the range of radii and metallicities in which they are most easily distinguished in the space formed by the total energy and the angular momentum along the axis perpendicular to the disc plane. The Bayesian Gaussian Mixture method was then applied to this subset in the different spaces in which stars belonging to the same progenitor are expected to appear as clumps, yielding very similar results. Consequently, we decided to focus on the space formed by the total energy and the vertical angular momentum, together with the perpendicular angular momentum, as these are easier to interpret. Once this was done, the Bayesian Gaussian Mixture was applied over the full range of radii and metallicities to the data corresponding to the 4 most massive progenitors, as well as to another 4 whose masses lie in an intermediate range. In this way, the different over-densities were identified as multiple independent Gaussians, although no link had yet been established between them and the original satellite galaxies. For this reason, we then attempted to relate the different Gaussians using Mahalanobis distances and a hierarchical weighted-linkage method. This showed that, although it was not possible to determine the origin of the over-densities in the space of integrals of motion by linking them for the more massive progenitors, this is a viable option at lower masses.
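A hedged sketch of the linking step just described, assuming a symmetrized Mahalanobis-type distance between the fitted Gaussian components and SciPy's "weighted" (WPGMA) hierarchical linkage; the component means and covariances below are hypothetical stand-ins, not fitted values.

```python
import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

means = np.array([[0.0, 0.0], [0.2, 0.1], [2.0, 2.0], [2.1, 1.9]])
covs = np.array([np.eye(2) * 0.05] * 4)

def mahalanobis_sym(mu_i, cov_i, mu_j, cov_j):
    """Symmetrized Mahalanobis distance between two Gaussian components."""
    diff = mu_i - mu_j
    d_ij = np.sqrt(diff @ np.linalg.inv(cov_j) @ diff)
    d_ji = np.sqrt(diff @ np.linalg.inv(cov_i) @ diff)
    return 0.5 * (d_ij + d_ji)

n = len(means)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = mahalanobis_sym(means[i], covs[i], means[j], covs[j])

Z = linkage(squareform(D), method="weighted")     # weighted (WPGMA) linkage
groups = fcluster(Z, t=2, criterion="maxclust")   # e.g. merge into 2 groups
print(groups)                                     # the two nearby pairs share a label
```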
From this, we conclude that an alternative method would need to be developed to handle the most massive progenitors, so that only the lower-mass ones need then be studied with Machine Learning clustering methods, combined with linkage methods, in order to identify the remaining clumps. It would also be of interest to carry out further tests with simulations providing more realistic data, i.e. data more closely resembling observations, as well as a deeper study of the information that can be extracted from the metallicities, in order to fulfil the stated objective.

    Domain Adaptation and Privileged Information for Visual Recognition

    The automatic identification of entities like objects, people or their actions in visual data, such as images or video, has significantly improved and is now being deployed in access control, social media, online retail, autonomous vehicles, and several other applications. This visual recognition capability leverages supervised learning techniques, which require large amounts of labeled training data from the target distribution representative of the particular task at hand. However, collecting such training data might be expensive, require too much time, or even be impossible. In this work, we introduce several novel approaches aiming at compensating for the lack of target training data. Rather than leveraging prior knowledge for building task-specific models, typically easier to train, we focus on developing general visual recognition techniques, where the notion of prior knowledge is better identified by additional information available during training. Depending on the nature of such information, the learning problem may turn into domain adaptation (DA), domain generalization (DG), learning using privileged information (LUPI), or domain adaptation with privileged information (DAPI).

    When some target data samples are available and additional information in the form of labeled data from a different source is also available, the learning problem becomes domain adaptation. Unlike previous DA work, we introduce two novel approaches for the few-shot learning scenario, which require only very few labeled target samples and can be effective with even a single one. The first method exploits a Siamese deep neural network architecture for learning an embedding where visual categories from the source and target distributions are semantically aligned and yet maximally separated (a toy sketch of such a loss follows this abstract). The second approach instead extends adversarial learning to simultaneously maximize the confusion between source and target domains while achieving semantic alignment.

    In the complete absence of target data, several cheaply available source datasets related to the target distribution can be leveraged as additional information for learning a task. This is the domain generalization setting. We introduce the first deep learning approach to address the DG problem, by extending a Siamese network architecture for learning a representation of visual categories that is invariant with respect to the sources, while imposing semantic alignment and class separation to maximize generalization performance on unseen target domains.

    There are situations in which target data for training might come equipped with additional information that can be modeled as an auxiliary view of the data, and that unfortunately is not available during testing. This is the LUPI scenario. We introduce a novel framework based on the information bottleneck that leverages the auxiliary view to improve the performance of visual classifiers. We do so by introducing a formulation that is general, in the sense that it can be used with any visual classifier.

    Finally, when the available target data is unlabeled, and there is closely related labeled source data, which is also equipped with an auxiliary view as additional information, we pose the question of how to leverage the source data views to train visual classifiers for unseen target data. This is the DAPI scenario. We extend the LUPI framework based on the information bottleneck to learn visual classifiers in DAPI settings and show that privileged information can be leveraged to improve learning on new domains. The novel DAPI framework is also general and can be used with any visual classifier.

    Every use of auxiliary information has been validated extensively using publicly available benchmark datasets, and several new state-of-the-art accuracy results have been obtained. Examples of application domains include visual object recognition from RGB images and from depth data, handwritten digit recognition, and gesture recognition from video.
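A hedged sketch of the few-shot alignment idea described above: a Siamese-style contrastive loss that pulls together source/target embeddings of the same class and pushes apart embeddings of different classes. The margin, dimensions and the (omitted) embedding network are hypothetical, not the thesis architecture.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(f_src, f_tgt, same_class, margin=1.0):
    """f_src, f_tgt: (N, d) embeddings; same_class: (N,) bool tensor."""
    dist = F.pairwise_distance(f_src, f_tgt)
    pull = same_class.float() * dist.pow(2)                       # semantic alignment
    push = (~same_class).float() * F.relu(margin - dist).pow(2)   # maximal separation
    return (pull + push).mean()

f_src = torch.randn(8, 64, requires_grad=True)   # source-domain embeddings
f_tgt = torch.randn(8, 64)                       # few labeled target embeddings
same_class = torch.tensor([1, 1, 0, 0, 1, 0, 1, 0], dtype=torch.bool)
loss = contrastive_alignment_loss(f_src, f_tgt, same_class)
loss.backward()
print(float(loss))
```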