
    Nonparametric Identifiability of Causal Representations from Unknown Interventions

    We study causal representation learning, the task of inferring latent causal variables and their causal relations from high-dimensional functions ("mixtures") of the variables. Prior work relies on weak supervision, in the form of counterfactual pre- and post-intervention views or temporal structure; places restrictive assumptions, such as linearity, on the mixing function or latent causal model; or requires partial knowledge of the generative process, such as the causal graph or the intervention targets. We instead consider the general setting in which both the causal model and the mixing function are nonparametric. The learning signal takes the form of multiple datasets, or environments, arising from unknown interventions in the underlying causal model. Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data. We study the fundamental setting of two causal variables and prove that the observational distribution and one perfect intervention per node suffice for identifiability, subject to a genericity condition. This condition rules out spurious solutions that involve fine-tuning of the intervened and observational distributions, mirroring similar conditions for nonlinear cause-effect inference. For an arbitrary number of variables, we show that two distinct paired perfect interventions per node guarantee identifiability. Further, we demonstrate that the strengths of causal influences among the latent variables are preserved by all equivalent solutions, rendering the inferred representation appropriate for drawing causal conclusions from new data. Our study provides the first identifiability results for the general nonparametric setting with unknown interventions, and elucidates what is possible and impossible for causal representation learning without more direct supervision.

    Combining experiments to discover linear cyclic models with latent variables

    Volume 9, AISTATS 2010. Host publication title: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics. Peer reviewed.

    Learning Linear Causal Representations from Interventions under General Nonlinear Mixing

    We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker classes, such as linear maps or paired counterfactual data. This is also the first instance of causal identifiability from non-paired interventions for deep neural network embeddings. Our proof relies on carefully uncovering the high-dimensional geometric structure present in the data distribution after a non-linear density transformation, which we capture by analyzing quadratic forms of precision matrices of the latent distributions. Finally, we propose a contrastive algorithm to identify the latent variables in practice and evaluate its performance on various tasks. Comment: 38 pages

    Learning nonparametric latent causal graphs with unknown interventions

    We establish conditions under which latent causal graphs are nonparametrically identifiable and can be reconstructed from unknown interventions in the latent space. Our primary focus is the identification of the latent structure in measurement models without parametric assumptions such as linearity or Gaussianity. Moreover, we do not assume the number of hidden variables is known, and we show that at most one unknown intervention per hidden variable is needed. This extends a recent line of work on learning causal representations from observations and interventions. The proofs are constructive and introduce two new graphical concepts -- imaginary subsets and isolated edges -- that may be useful in their own right. As a matter of independent interest, the proofs also involve a novel characterization of the limits of edge orientations within the equivalence class of DAGs induced by unknown interventions. These are the first results to characterize the conditions under which causal representations are identifiable without making any parametric assumptions in a general setting with unknown interventions and without faithfulness. Comment: To appear at NeurIPS 202

    Advancing probabilistic and causal deep learning in medical image analysis

    The power and flexibility of deep learning have made it an indispensable tool for tackling modern machine learning problems. However, this flexibility comes at the cost of robustness and interpretability, which can lead to undesirable or even harmful outcomes. Deep learning models often fail to generalise to real-world conditions and produce unforeseen errors that hinder wide adoption in safety-critical domains such as healthcare. This thesis presents multiple works that address the reliability problems of deep learning in safety-critical domains by being aware of its vulnerabilities and incorporating more domain knowledge when designing and evaluating our algorithms. We start by showing how close collaboration with domain experts is necessary to achieve good results in a real-world clinical task: the multiclass semantic segmentation of traumatic brain injury (TBI) lesions in head CT. We continue by proposing an algorithm that models spatially coherent aleatoric uncertainty in segmentation tasks by considering the dependencies between pixels. The lack of proper uncertainty quantification is a robustness issue which is ubiquitous in deep learning. Tackling this issue is of the utmost importance if we want to deploy these systems in the real world. Lastly, we present a general framework for evaluating image counterfactual inference models in the absence of ground-truth counterfactuals. Counterfactuals are extremely useful to reason about models and data and to probe models for explanations or mistakes. As a result, their evaluation is critical for improving the interpretability of deep learning models. Open Access