Disentangling causal webs in the brain using functional Magnetic Resonance Imaging: A review of current approaches
In the past two decades, functional Magnetic Resonance Imaging has been used
to relate neuronal network activity to cognitive processing and behaviour.
Recently, this approach has been augmented by algorithms that allow us to
infer causal links between component populations of neuronal networks. Multiple
inference procedures have been proposed to address this research question, but
so far each method has limitations when it comes to establishing whole-brain
connectivity patterns. In this work, we discuss eight ways to infer causality
in fMRI research: Bayesian Nets, Dynamical Causal Modelling, Granger Causality,
Likelihood Ratios, LiNGAM, Patel's Tau, Structural Equation Modelling, and
Transfer Entropy. We conclude by formulating recommendations for future
directions in this area.
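Of the eight methods listed, Granger causality is perhaps the simplest to illustrate: a series x is said to Granger-cause y if past values of x improve the prediction of y beyond what y's own past provides. The sketch below is not from the review; it is a minimal illustration on synthetic data, with a hand-rolled least-squares solver so it stays self-contained. It compares the residual sum of squares of an autoregressive model of y with and without a lagged x term:

```python
import random

random.seed(0)

def ols_ssr(X, y):
    """Ordinary least squares via the normal equations (Gaussian elimination
    with partial pivoting); returns the residual sum of squares."""
    n, k = len(X), len(X[0])
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    b = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for r in range(k - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    return sum((y[i] - sum(X[i][c] * beta[c] for c in range(k))) ** 2 for i in range(n))

# Synthetic ground truth: x Granger-causes y with a one-step lag.
T = 2000
x, y = [random.gauss(0, 1)], [random.gauss(0, 1)]
for t in range(1, T):
    x.append(0.5 * x[t - 1] + random.gauss(0, 1))
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + random.gauss(0, 1))

target = [y[t] for t in range(1, T)]
ssr_r = ols_ssr([[1.0, y[t - 1]] for t in range(1, T)], target)            # y's own past only
ssr_u = ols_ssr([[1.0, y[t - 1], x[t - 1]] for t in range(1, T)], target)  # plus lagged x
# F statistic for the single restriction (drop the lagged-x term).
f_stat = (ssr_r - ssr_u) / (ssr_u / (T - 1 - 3))
print(f_stat > 10.0)  # lagged x sharply improves the prediction of y
```

In a whole-brain setting the same test is run over many region pairs and lags, which is where the limitations discussed in the review (haemodynamic confounds, unobserved common drivers) become important.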
Identification and Estimation of Causal Effects Using non-Gaussianity and Auxiliary Covariates
Assessing causal effects in the presence of unmeasured confounding is a
challenging problem. Although auxiliary variables, such as instrumental
variables, are commonly used to identify causal effects, they are often
unavailable in practice due to stringent and untestable conditions. To address
this issue, previous research has utilized linear structural equation models
to show that the causal effect can be identifiable when noise variables of the
treatment and outcome are both non-Gaussian. In this paper, we investigate the
problem of identifying the causal effect using auxiliary covariates and
non-Gaussianity from the treatment. Our key idea is to characterize the impact
of unmeasured confounders using an observed covariate, assuming they are all
Gaussian. The auxiliary covariate can be an invalid instrument or an invalid
proxy variable. We demonstrate that the causal effect can be identified using
this measured covariate, even when the only source of non-Gaussianity comes
from the treatment. We then extend the identification results to the
multi-treatment setting and provide sufficient conditions for identification.
Based on our identification results, we propose a simple and efficient
procedure for calculating causal effects and show the √n-consistency of
the proposed estimator. Finally, we evaluate the performance of our estimator
through simulation studies and an application.
Comment: 16 pages, 7 figures
Generalized Independent Noise Condition for Estimating Causal Structure with Latent Variables
We investigate the challenging task of learning causal structure in the
presence of latent variables, including locating latent variables and
determining their quantity, and identifying causal relationships among both
latent and observed variables. To address this, we propose a Generalized
Independent Noise (GIN) condition for linear non-Gaussian acyclic causal models
that incorporate latent variables, which establishes the independence between a
linear combination of certain measured variables and some other measured
variables. Specifically, for two observed random vectors Y and Z,
GIN holds if and only if ω⊺Y and Z are
independent, where ω is a non-zero parameter vector determined by the
cross-covariance between Y and Z. We then give necessary
and sufficient graphical criteria of the GIN condition in linear non-Gaussian
acyclic causal models. Roughly speaking, GIN implies the existence of an
exogenous set S relative to the parent set of Y (w.r.t.
the causal ordering), such that S d-separates Y from
Z. Interestingly, we find that the independent noise condition
(i.e., if there is no confounder, causes are independent of the residual
derived from regressing the effect on the causes) can be seen as a special case
of GIN. With such a connection between GIN and latent causal structures, we
further leverage the proposed GIN condition, together with a well-designed
search procedure, to efficiently estimate Linear, Non-Gaussian Latent
Hierarchical Models (LiNGLaHs), where latent confounders may also be causally
related and may even follow a hierarchical structure. We show that the
underlying causal structure of a LiNGLaH is identifiable in light of GIN
conditions under mild assumptions. Experimental results show the effectiveness
of the proposed approach.
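The special case mentioned above (with no confounder, causes are independent of the residual of regressing the effect on the causes) is easy to check numerically. The following sketch is our illustration, not code from the paper: it generates a linear non-Gaussian pair, regresses in both directions, and uses the correlation between squared values as a crude proxy for dependence. The residual is (approximately) independent of the regressor only in the true causal direction:

```python
import random

random.seed(1)

def corr(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cab / (va * vb) ** 0.5

def residual(y, x):
    """Residual of regressing y on x (simple OLS with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def dep(a, b):
    # Crude dependence proxy: correlation between squared values.
    # Zero here is necessary, not sufficient, for independence.
    return abs(corr([v * v for v in a], [v * v for v in b]))

n = 20000
cause = [random.uniform(-1, 1) for _ in range(n)]    # non-Gaussian cause
effect = [c + random.uniform(-1, 1) for c in cause]  # linear effect, non-Gaussian noise

dep_fwd = dep(residual(effect, cause), cause)   # true direction: near zero
dep_bwd = dep(residual(cause, effect), effect)  # wrong direction: clearly nonzero
print(dep_fwd < 0.05, dep_bwd > 0.2)
```

The asymmetry exists only because the variables are non-Gaussian; with Gaussian data both residuals would be fully independent of their regressors and the direction would be unidentifiable, which is the setting GIN generalizes to latent variables.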
On the Importance of Transition Matrix for Learning with Noisy Labels
To improve the generalization ability of deep learning models when training data contains noisy labels, a noise transition matrix T(x) has been widely employed to reveal the transition relationship from clean labels to noisy labels of instances. It acts as a crucial building block in designing statistically consistent methods for learning with noisy labels (T-based methods). However, for real-world datasets, the transition matrix is usually unknown and needs to be estimated, which can be a challenging task. This has motivated recent work to design label-noise-robust methods that incorporate heuristics instead of requiring an estimate of the transition matrix (heuristic-based methods). Heuristic-based methods have demonstrated state-of-the-art (SOTA) performance on many benchmark datasets and seem more practical than T-based methods. This raises the question of whether the transition matrix is still important for learning with noisy labels.
In this thesis, we answer that the transition matrix still plays an important role in learning with noisy labels. We show that the transition matrix not only can be used to design statistically consistent methods but can also help boost the performance of heuristic-based methods. We also show that, given the transition matrix, the performance of T-based methods is not influenced by different data generative processes, whereas the performance of SOTA heuristic-based methods can be. Since the label-noise transition matrix is important but hard to estimate, we propose two new transition-matrix estimation methods that reduce its estimation error. The first effectively estimates an instance-independent transition matrix by exploiting the divide-and-conquer paradigm. The second focuses on estimating instance-dependent transition matrices by leveraging a structural causal model.
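To make the role of the transition matrix concrete, here is a minimal sketch of one classic T-based construction, forward loss correction (our illustration, not the thesis's code, and with an assumed instance-independent T): the model's clean-class posterior is pushed through T, and the cross-entropy is computed against the observed noisy label:

```python
import math

# Assumed (instance-independent) transition matrix for 3 classes:
# T[i][j] = P(noisy label = j | clean label = i). Rows sum to 1.
T = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.1, 0.9]]

def noisy_posterior(clean_probs, T):
    """Push the model's clean-class posterior through T:
    P(noisy = j | x) = sum_i P(clean = i | x) * T[i][j]."""
    k = len(T)
    return [sum(clean_probs[i] * T[i][j] for i in range(k)) for j in range(k)]

def forward_corrected_loss(clean_probs, noisy_label, T):
    """Cross-entropy of the observed noisy label against the noise-adjusted
    prediction; with a correct T, minimising this in expectation is
    statistically consistent for the clean labels."""
    return -math.log(noisy_posterior(clean_probs, T)[noisy_label])

p = [0.7, 0.2, 0.1]              # model's predicted clean posterior for one instance
q = noisy_posterior(p, T)        # implied distribution over noisy labels
print([round(v, 2) for v in q])  # → [0.65, 0.24, 0.11]
```

Since the corrected loss is only as good as T itself, an estimation error in T propagates directly into the training signal, which is what motivates the two estimation methods proposed in the thesis.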
Learning Identifiable Representations: Independent Influences and Multiple Views
Intelligent systems, whether biological or artificial, perceive unstructured information from the world around them: deep neural networks designed for object recognition receive collections of pixels as inputs; living beings capture visual stimuli through photoreceptors that convert incoming light into electrical signals. Sophisticated signal processing is required to extract meaningful features (e.g., the position, dimension, and colour of objects in an image) from these inputs: this motivates the field of representation learning. But which features should be deemed meaningful, and how should they be learned?
We will approach these questions based on two metaphors. The first one is the cocktail-party problem, where a number of conversations happen in parallel in a room, and the task is to recover (or separate) the voices of the individual speakers from recorded mixtures—also termed blind source separation. The second one is what we call the independent-listeners problem: given two listeners in front of some loudspeakers, the question is whether, when processing what they hear, they will make the same information explicit, identifying similar constitutive elements. The notion of identifiability is crucial when studying these problems, as it specifies suitable technical assumptions under which representations are uniquely determined, up to tolerable ambiguities like latent source reordering. A key result of this theory is that, when the mixing is nonlinear, the model is provably non-identifiable. A first question is, therefore, under what additional assumptions (ideally as mild as possible) the problem becomes identifiable; a second one is, what algorithms can be used to estimate the model.
The contributions presented in this thesis address these questions and revolve around two main principles. The first principle is to learn representations where the latent components influence the observations independently. Here the term “independently” is used in a non-statistical sense, which can be loosely thought of as the absence of fine-tuning between distinct elements of a generative process. The second principle is that representations can be learned from paired observations or views, where mixtures of the same latent variables are observed and they (or a subset thereof) are perturbed in one of the views, also termed the multi-view setting. I will present work characterizing these two problem settings, studying their identifiability and proposing suitable estimation algorithms. Moreover, I will discuss how the success of popular representation learning methods may be explained in terms of the principles above, and describe an application of the second principle to the statistical analysis of group studies in neuroimaging.
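The cocktail-party metaphor can be made concrete with a toy linear blind-source-separation example. The sketch below is an illustration under the classic linear ICA assumptions, not code from the thesis: it whitens two mixtures of two independent uniform sources, then searches over rotations for the one whose outputs are maximally non-Gaussian (by excess kurtosis), recovering the sources up to order and sign, i.e. exactly the "tolerable ambiguities" mentioned above:

```python
import math
import random

random.seed(2)
n = 5000
# Two independent, non-Gaussian (uniform) sources: the "speakers".
s1 = [random.uniform(-1, 1) for _ in range(n)]
s2 = [random.uniform(-1, 1) for _ in range(n)]
# Fixed linear mixing: the "microphone" recordings.
x1 = [0.6 * a + 0.8 * b for a, b in zip(s1, s2)]
x2 = [0.9 * a - 0.3 * b for a, b in zip(s1, s2)]

def mean(v): return sum(v) / len(v)
def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

# Whiten using the eigendecomposition of the 2x2 covariance matrix.
c11, c12, c22 = cov(x1, x1), cov(x1, x2), cov(x2, x2)
h = math.sqrt((c11 - c22) ** 2 / 4 + c12 ** 2)
l1, l2 = (c11 + c22) / 2 + h, (c11 + c22) / 2 - h

def unit(v):
    s = math.hypot(v[0], v[1]); return (v[0] / s, v[1] / s)

e1, e2 = unit((c12, l1 - c11)), unit((c12, l2 - c11))
m1, m2 = mean(x1), mean(x2)
z1 = [(e1[0] * (a - m1) + e1[1] * (b - m2)) / math.sqrt(l1) for a, b in zip(x1, x2)]
z2 = [(e2[0] * (a - m1) + e2[1] * (b - m2)) / math.sqrt(l2) for a, b in zip(x1, x2)]

def kurt(v):
    m2v = mean([u * u for u in v])
    return mean([u ** 4 for u in v]) / (m2v * m2v) - 3.0  # excess kurtosis

def rotate(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ([c * a + s * b for a, b in zip(z1, z2)],
            [-s * a + c * b for a, b in zip(z1, z2)])

# After whitening, the remaining indeterminacy is a rotation; pick the one
# whose outputs are most non-Gaussian (largest total |excess kurtosis|).
best = max((i * math.pi / 180 for i in range(90)),
           key=lambda t: sum(abs(kurt(u)) for u in rotate(t)))
u1, u2 = rotate(best)

def abscorr(a, b):
    return abs(cov(a, b) / math.sqrt(cov(a, a) * cov(b, b)))

# Each recovered component matches one true source, up to sign and order.
match1 = max(abscorr(u1, s1), abscorr(u1, s2))
match2 = max(abscorr(u2, s1), abscorr(u2, s2))
print(match1 > 0.95, match2 > 0.95)
```

The example also shows why the Gaussian case fails: with Gaussian sources every rotation of the whitened data has identical (zero) excess kurtosis, so no rotation is preferred and the model is non-identifiable, which is precisely the obstruction the additional assumptions in the thesis are designed to remove.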