The Incomplete Rosetta Stone Problem: Identifiability Results for Multi-View Nonlinear ICA
We consider the problem of recovering a common latent source with independent
components from multiple views. This applies to settings in which a variable is
measured with multiple experimental modalities, and where the goal is to
synthesize the disparate measurements into a single unified representation. We
consider the case that the observed views are a nonlinear mixing of
component-wise corruptions of the sources. When the views are considered
separately, this reduces to nonlinear Independent Component Analysis (ICA) for
which it is provably impossible to undo the mixing. We present novel
identifiability proofs that the mixing can theoretically be undone when the
multiple views are considered jointly, using function approximators such as
deep neural networks. In contrast to known
identifiability results for nonlinear ICA, we prove that independent latent
sources with arbitrary mixing can be recovered as long as multiple,
sufficiently different noisy views are available.
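The generative model described above can be sketched as follows. This is a minimal illustrative simulation, not the authors' code: the latent dimension, the Laplace source distribution, the noise scale, and the toy two-layer mixing network are all assumptions chosen for concreteness.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 3

# Common latent source with independent, non-Gaussian components.
s = rng.laplace(size=(n, d))

def toy_mixing(x, seed):
    """A hypothetical smooth nonlinear mixing: linear map, tanh, linear map."""
    r = np.random.default_rng(seed)
    W1, W2 = r.normal(size=(d, d)), r.normal(size=(d, d))
    return np.tanh(x @ W1) @ W2

# Each view is a (different) nonlinear mixing of a component-wise
# corrupted copy of the same sources.
views = [toy_mixing(s + rng.normal(scale=0.1, size=s.shape), seed=v)
         for v in range(2)]
```

Considered separately, each `views[v]` is an instance of unidentifiable nonlinear ICA; the identifiability result concerns recovering `s` from the two views jointly.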
Nonlinear Independent Component Analysis for Principled Disentanglement in Unsupervised Deep Learning
A central problem in unsupervised deep learning is how to find useful
representations of high-dimensional data, sometimes called "disentanglement".
Most approaches are heuristic and lack a proper theoretical foundation. In
linear representation learning, independent component analysis (ICA) has been
successful in many application areas, and it is principled, i.e. based on a
well-defined probabilistic model. However, extension of ICA to the nonlinear
case has been problematic due to the lack of identifiability, i.e. uniqueness
of the representation. Recently, nonlinear extensions that utilize temporal
structure or some auxiliary information have been proposed. Such models are in
fact identifiable, and consequently, an increasing number of algorithms have
been developed. In particular, some self-supervised algorithms can be shown to
estimate nonlinear ICA, even though they have initially been proposed from
heuristic perspectives. This paper reviews the state-of-the-art of nonlinear
ICA theory and algorithms.
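The auxiliary-variable mechanism mentioned above can be illustrated with a small simulation in the style of time-contrastive learning: an observed segment label modulates the source variances, which is the kind of structure that restores identifiability. All names and parameters here are illustrative assumptions, not code from the review.

```python
import numpy as np

rng = np.random.default_rng(0)
segments, per_seg, d = 5, 200, 3

# Auxiliary variable u: the segment index for each observation.
u = np.repeat(np.arange(segments), per_seg)

# u modulates the variance of each independent source component,
# making the sources nonstationary across segments.
scales = rng.uniform(0.5, 2.0, size=(segments, d))
s = rng.normal(size=(segments * per_seg, d)) * scales[u]

# An arbitrary smooth nonlinear mixing stands in for the unknown mixing.
W = rng.normal(size=(d, d))
x = np.tanh(s @ W)
```

A self-supervised classifier trained to predict `u` from `x` can, under the theory reviewed here, recover the independent components in its hidden layer.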
Advances in identifiability of nonlinear probabilistic models
Identifiability is a highly prized property of statistical models. This thesis investigates this property in nonlinear models encountered in two fields of statistics: representation learning and causal discovery. In representation learning, identifiability leads to learning interpretable and reproducible representations, while in causal discovery, it is necessary for the estimation of correct causal directions.
We begin by leveraging recent advances in nonlinear ICA to show that the latent space of a VAE is identifiable up to a permutation and pointwise nonlinear transformations of its components. A factorized prior distribution over the latent variables conditioned on an auxiliary observed variable, such as a class label or nearly any other observation, is required for our result. We also extend previous identifiability results in nonlinear ICA to the case of noisy or undercomplete observations, and incorporate them into a maximum likelihood framework.
Our second contribution is to develop the Independently Modulated Component Analysis (IMCA) framework, a generalization of nonlinear ICA to non-independent latent variables. We show that we can drop the independence assumption in ICA while maintaining identifiability, resulting in a very flexible and generic framework for principled disentangled representation learning. This finding is predicated on the existence of an auxiliary variable that modulates the joint distribution of the latent variables in a factorizable manner.
As a third contribution, we extend the identifiability theory to a broad family of conditional energy-based models (EBMs). This novel model generalizes earlier results by removing any distributional assumptions on the representations, which are ubiquitous in the latent variable setting. The conditional EBM can learn identifiable overcomplete representations and has universal approximation capabilities.
Finally, we investigate a connection between the framework of autoregressive normalizing flow models and causal discovery. Causal models derived from affine autoregressive flows are shown to be identifiable, generalizing the well-known additive noise model. Using normalizing flows, we can compute the exact likelihood of the causal model, which is subsequently used to derive a likelihood ratio measure for causal discovery. The flows are also invertible, making them well suited to causal inference tasks such as interventions and counterfactuals.
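The affine autoregressive causal model and its exact likelihood can be sketched as follows. The particular location and scale functions are hypothetical stand-ins chosen for illustration; the point is the change-of-variables likelihood that the thesis uses for the likelihood ratio test.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Affine autoregressive causal model x1 -> x2, generalizing the
# additive noise model: x2 = t(x1) + s(x1) * z2, with z2 ~ N(0, 1).
x1 = rng.normal(size=n)
t = np.sin(x1)              # hypothetical location function t(x1)
scale = 0.5 + 0.1 * x1**2   # hypothetical positive scale function s(x1)
z2 = rng.normal(size=n)
x2 = t + scale * z2

# Exact conditional log-likelihood via change of variables:
# log p(x2 | x1) = log N((x2 - t)/s; 0, 1) - log s.
loglik = np.sum(-0.5 * ((x2 - t) / scale) ** 2
                - 0.5 * np.log(2 * np.pi) - np.log(scale))
```

Computing the analogous quantity for the reverse direction x2 -> x1 and comparing the two exact likelihoods gives the likelihood ratio measure for causal discovery; setting `scale` to a constant recovers the additive noise model.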
Learning Linear Causal Representations from Interventions under General Nonlinear Mixing
We study the problem of learning causal representations from unknown, latent
interventions in a general setting, where the latent distribution is Gaussian
but the mixing function is completely general. We prove strong identifiability
results given unknown single-node interventions, i.e., without having access to
the intervention targets. This generalizes prior works which have focused on
weaker classes, such as linear maps or paired counterfactual data. This is also
the first instance of causal identifiability from non-paired interventions for
deep neural network embeddings. Our proof relies on carefully uncovering the
high-dimensional geometric structure present in the data distribution after a
non-linear density transformation, which we capture by analyzing quadratic
forms of precision matrices of the latent distributions. Finally, we propose a
contrastive algorithm to identify the latent variables in practice and evaluate
its performance on various tasks.