
    Unsupervised Domain Adaptation with Copula Models

    We study the task of unsupervised domain adaptation, where no labeled data from the target domain is provided at training time. To deal with the potential discrepancy between the source and target distributions, in both features and labels, we exploit a copula-based regression framework. The benefits of this approach are two-fold: (a) it allows us to model a broader range of conditional predictive densities beyond the common exponential family, and (b) we show how to leverage Sklar's theorem, the essence of the copula formulation relating the joint density to the copula dependency functions, to find effective feature mappings that mitigate the domain mismatch. By transforming the data to a copula domain, we show on a number of benchmark datasets (including human emotion estimation), and using different regression models for prediction, that we can achieve a more robust and accurate estimation of target labels than with recently proposed feature transformation (adaptation) methods.
    Comment: IEEE International Workshop On Machine Learning for Signal Processing 201
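
    Sklar's theorem states that any joint CDF F(x_1, ..., x_d) can be written as C(F_1(x_1), ..., F_d(x_d)), where C is the copula and the F_j are the marginal CDFs. A minimal sketch of the transformation this suggests -- mapping each feature through its empirical CDF so that source and target share uniform marginals, leaving any remaining mismatch in the dependence structure -- is shown below; the function and variable names are illustrative, not the authors' code.

        import numpy as np

        def to_copula_domain(X):
            """Map each column of X through its empirical CDF, giving
            approximately Uniform(0, 1) marginals (the copula domain)."""
            n, d = X.shape
            U = np.empty_like(X, dtype=float)
            for j in range(d):
                ranks = np.argsort(np.argsort(X[:, j]))  # ranks 0..n-1
                U[:, j] = (ranks + 1) / (n + 1)          # avoid exact 0 and 1
            return U

        rng = np.random.default_rng(0)
        X_src = rng.normal(0.0, 1.0, size=(500, 3))  # source features
        X_tgt = rng.normal(2.0, 3.0, size=(500, 3))  # shifted target features

        U_src, U_tgt = to_copula_domain(X_src), to_copula_domain(X_tgt)
        # Marginal shift is removed: both transformed sets have mean ~0.5
        # per column, so a regressor fit in this domain sees matched inputs.
        print(U_src.mean(axis=0), U_tgt.mean(axis=0))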

    Learning Invariant Representations for Deep Latent Variable Models

    Deep latent variable models introduce a new class of generative models which are able to handle unstructured data and encode non-linear dependencies. Despite their known flexibility, these models are frequently not invariant against target-specific transformations. Therefore, they suffer from model mismatches and are challenging to interpret or control. We employ the concept of symmetry transformations from physics to formally describe these invariances. In this thesis, we investigate how we can model invariances when a symmetry transformation is either known or unknown. As a consequence, we make contributions in the domain of variable compression under side information and generative modelling. In our first contribution, we investigate the problem where a symmetry transformation is known yet not implicitly learned by the model. Specifically, we consider the task of estimating mutual information in the context of the deep information bottleneck, which is not invariant against monotone transformations. To address this limitation, we extend the deep information bottleneck with a copula construction. In our second contribution, we address the problem of learning target-invariant subspaces for generative models. In this case, the symmetry transformation is unknown and has to be learned from data. We achieve this by formulating a deep information bottleneck with a target and a target-invariant subspace. To ensure invariance, we provide a continuous mutual information regulariser based on adversarial training. In our last contribution, we introduce an improved method for learning unknown symmetry transformations with cycle-consistency. To do so, we employ the same deep information bottleneck method with a partitioned latent space, but ensure target-invariance by utilizing a cycle-consistency loss in the latent space. As a result, we overcome potential convergence issues introduced by adversarial training and are able to deal with mixed data. In summary, each of our presented models provides an attempt to better control and understand deep latent variable models by learning symmetry transformations. We demonstrate the effectiveness of our contributions with extensive evaluations on both artificial and real-world experiments.
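
    The cycle-consistency idea from the last contribution can be illustrated with a partitioned latent space: an encoder splits the code into a target part z_t and a target-invariant part z_i, and z_i is required to survive a decode/re-encode round trip in which the target part has been swapped. The PyTorch sketch below is a simplified illustration under assumed architectures and loss weights, not the thesis implementation.

        import torch
        import torch.nn as nn

        class Encoder(nn.Module):
            def __init__(self, x_dim=10, t_dim=2, i_dim=6):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(),
                                         nn.Linear(64, t_dim + i_dim))
                self.t_dim = t_dim
            def forward(self, x):
                z = self.net(x)
                return z[:, :self.t_dim], z[:, self.t_dim:]  # (z_t, z_i)

        class Decoder(nn.Module):
            def __init__(self, x_dim=10, t_dim=2, i_dim=6):
                super().__init__()
                self.net = nn.Sequential(nn.Linear(t_dim + i_dim, 64), nn.ReLU(),
                                         nn.Linear(64, x_dim))
            def forward(self, z_t, z_i):
                return self.net(torch.cat([z_t, z_i], dim=1))

        enc, dec = Encoder(), Decoder()
        x = torch.randn(32, 10)
        z_t, z_i = enc(x)

        # Swap target codes within the batch, decode, re-encode: the
        # invariant code should come back unchanged if z_i carries no
        # target information.
        x_swapped = dec(z_t[torch.randperm(32)], z_i)
        _, z_i_cycled = enc(x_swapped)

        recon_loss = nn.functional.mse_loss(dec(z_t, z_i), x)
        cycle_loss = nn.functional.mse_loss(z_i_cycled, z_i)
        loss = recon_loss + 1.0 * cycle_loss  # weight is a placeholder
        loss.backward()

    Unlike an adversarial regulariser, this penalty needs no separately trained critic, which is the stability advantage the abstract refers to.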

    Learning patterns from sequential and network data using probabilistic models

    The focus of this thesis is on developing probabilistic models for data observed over temporal and graph domains, and the corresponding variational inference algorithms. In many real-world phenomena, sequential data points that are observed closer in time often exhibit higher degrees of dependency. Similarly, data points observed over a graph domain (e.g., user interests in a social network) may exhibit higher dependencies at lower degrees of separation over the graph. Furthermore, the connectivity structures that define the graph domain can also evolve temporally (i.e., temporal networks) and exhibit dependencies over time. Data sets observed over temporal and graph domains often (but not always) violate the independent and identically distributed (i.i.d.) assumption made by many mathematical models. The works presented in this dissertation address various challenges in modelling data sets that exhibit dependencies over temporal and graph domains. In Chapter 3, I present a stochastic variational inference algorithm that enables factorial hidden Markov models for sequential data to scale up to extremely long sequences. In Chapter 4, I propose a simple but powerful Gaussian process model that captures the dependencies of data points observed on a graph domain, and demonstrate its viability in graph-based semi-supervised learning problems. In Chapter 5, I present a dynamical model for graphs that captures the temporal evolution of the connectivity structures as well as the sparse connectivity structures often observed in real temporal network data sets. Finally, in Chapter 6, I summarise the contributions of the thesis and propose several directions for future work that build on the proposed methods.
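
    For the Chapter 4 idea, a common way to build a Gaussian process over a graph is to derive the covariance from the graph Laplacian, so that the prior favours functions that vary smoothly along edges. The sketch below uses one standard such construction (a regularised Laplacian inverse) on a toy graph for semi-supervised label propagation; it illustrates the general technique, not the thesis's exact kernel.

        import numpy as np

        # Toy graph: two 3-node clusters joined by a single edge (2, 3).
        A = np.zeros((6, 6))
        for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
            A[i, j] = A[j, i] = 1.0
        L = np.diag(A.sum(axis=1)) - A              # combinatorial Laplacian

        # GP covariance: smooth functions on the graph get high prior mass.
        K = np.linalg.inv(L + 1e-2 * np.eye(6))

        # Observe a label on one node per cluster; infer the rest via the
        # GP posterior mean.
        obs = np.array([0, 5])
        y_obs = np.array([+1.0, -1.0])
        K_oo = K[np.ix_(obs, obs)] + 1e-2 * np.eye(2)  # + observation noise
        mean = K[:, obs] @ np.linalg.solve(K_oo, y_obs)
        print(np.round(mean, 2))  # nodes 0-2 lean positive, 3-5 negative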

    Learning Domain Invariant Representations by Joint Wasserstein Distance Minimization

    Domain shifts in the training data are common in practical applications of machine learning; they occur, for instance, when the data come from different sources. Ideally, an ML model should work well independently of these shifts, for example by learning a domain-invariant representation. Moreover, privacy concerns regarding the source also call for a domain-invariant representation. In this work, we provide theoretical results that link domain-invariant representations -- measured by the Wasserstein distance on the joint distributions -- to a practical semi-supervised learning objective based on a cross-entropy classifier and a novel domain critic. Quantitative experiments demonstrate that the proposed approach is indeed able to learn such an invariant representation (between two domains) in practice, and that the latter also supports models with higher predictive accuracy on both domains, comparing favorably to existing techniques.
    Comment: 20 pages including appendix. Under Review
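
    The role of a domain critic can be sketched via the Kantorovich-Rubinstein dual: a 1-Lipschitz critic is trained to separate the two domains' representations, its objective value approximates their Wasserstein distance, and the feature extractor then minimises that estimate. The simplified PyTorch sketch below uses weight clipping for the Lipschitz constraint and omits the paper's classifier and joint-distribution coupling; all architectures and hyperparameters are illustrative assumptions.

        import torch
        import torch.nn as nn

        feat = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 8))
        critic = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
        opt_f = torch.optim.Adam(feat.parameters(), lr=1e-4)
        opt_c = torch.optim.Adam(critic.parameters(), lr=1e-4)

        x_src = torch.randn(64, 10)        # source batch
        x_tgt = torch.randn(64, 10) + 1.0  # shifted target batch

        for step in range(200):
            # Critic ascends the dual objective (a Wasserstein estimate).
            w = (critic(feat(x_src).detach()).mean()
                 - critic(feat(x_tgt).detach()).mean())
            opt_c.zero_grad(); (-w).backward(); opt_c.step()
            for p in critic.parameters():  # crude 1-Lipschitz constraint
                p.data.clamp_(-0.1, 0.1)

            # Feature extractor descends it, aligning the two domains.
            w = critic(feat(x_src)).mean() - critic(feat(x_tgt)).mean()
            opt_f.zero_grad(); w.backward(); opt_f.step()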