Online Causal Structure Learning in the Presence of Latent Variables
We present two online causal structure learning algorithms which can track
changes in a causal structure and process data in a dynamic real-time manner.
Standard causal structure learning algorithms assume that causal structure does
not change during the data collection process, but in real-world scenarios, it
does often change. Therefore, it is inappropriate to handle such changes with
existing batch-learning approaches, and instead, a structure should be learned
in an online manner. The online causal structure learning algorithms we present
here can revise correlation values without reprocessing the entire dataset and
use an existing model to avoid relearning those causal links in the prior model
that still fit the data. The proposed algorithms are tested on synthetic and
real-world datasets, the latter being a seasonally adjusted commodity price
index dataset for the U.S. The online causal structure learning algorithms
outperformed standard FCI by a large margin in learning the changed causal
structure correctly and efficiently when latent variables were present.
Comment: 16 pages, 9 figures, 2 tables
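The abstract's remark about revising correlation values without reprocessing the entire dataset can be illustrated with a running Pearson correlation. This is only a minimal sketch under stated assumptions, not the paper's algorithm: the class name `OnlineCorrelation` and the helper `batch_correlation` are hypothetical, every sample is weighted equally, and nothing here down-weights old data or detects a structural change.

```python
import math


class OnlineCorrelation:
    """Running Pearson correlation of two streams, updated one sample at a time.

    Hypothetical illustration: equal weighting, no forgetting of old samples.
    """

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.m2_x = 0.0  # running sum of squared deviations of x
        self.m2_y = 0.0  # running sum of squared deviations of y
        self.c_xy = 0.0  # running sum of co-deviations of x and y

    def update(self, x, y):
        """Fold one new (x, y) pair into the running statistics in O(1)."""
        self.n += 1
        dx = x - self.mean_x  # deviation from the mean before this sample
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        # Welford-style updates: old-mean deviation times new-mean deviation.
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def correlation(self):
        """Return the current correlation, or NaN if it is undefined."""
        if self.n < 2 or self.m2_x == 0.0 or self.m2_y == 0.0:
            return float("nan")
        return self.c_xy / math.sqrt(self.m2_x * self.m2_y)


def batch_correlation(xs, ys):
    """Reference implementation that reprocesses the whole dataset."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


# Usage: feed samples one by one, then compare with the batch result.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]
tracker = OnlineCorrelation()
for x, y in zip(xs, ys):
    tracker.update(x, y)
print(tracker.correlation(), batch_correlation(xs, ys))
```

Each update touches only six running totals, so the cost per new sample does not grow with the amount of data already seen.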
Learning Adjustment Sets from Observational and Limited Experimental Data
Estimating causal effects from observational data is not always possible due
to confounding. Identifying a set of appropriate covariates (adjustment set)
and adjusting for their influence can remove confounding bias; however, such a
set is typically not identifiable from observational data alone. Experimental
data do not have confounding bias, but are typically limited in sample size and
can therefore yield imprecise estimates. Furthermore, experimental data often
include a limited set of covariates, and therefore provide limited insight into
the causal structure of the underlying system. In this work we introduce a
method that combines large observational and limited experimental data to
identify adjustment sets and improve the estimation of causal effects. The
method identifies an adjustment set (if possible) by calculating the marginal
likelihood for the experimental data given observationally-derived prior
probabilities of potential adjustment sets. In this way, the method can make
inferences that are not possible using only the conditional dependencies and
independencies in all the observational and experimental data. We show that the
method successfully identifies adjustment sets and improves causal effect
estimation in simulated data, and it can sometimes make additional inferences
when compared to state-of-the-art methods for combining experimental and
observational data.
Comment: 10 pages, 5 figures
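The claim that adjusting for an appropriate covariate set removes confounding bias can be seen in a small simulation. This is a sketch under strong assumptions and does not implement the paper's method: the model is linear and Gaussian, the adjustment set {Z} is assumed known rather than learned, and the coefficients (a true effect of 2.0 and a confounder effect of 3.0) and all function names are hypothetical. In this model the unadjusted slope converges to 2 + 3 * cov(X, Z) / var(X) = 3.5, while the adjusted slope converges to the true effect 2.0.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def residuals(ys, zs):
    """Residuals of ys after removing its linear dependence on zs."""
    n = len(ys)
    mz, my = sum(zs) / n, sum(ys) / n
    b = slope(zs, ys)
    return [y - my - b * (z - mz) for y, z in zip(ys, zs)]


def simulate(n, seed):
    """Draw n samples from the hypothetical model Z -> X, Z -> Y, X -> Y."""
    rng = random.Random(seed)
    zs, xs, ys = [], [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                       # confounder
        x = z + rng.gauss(0.0, 1.0)                   # treatment depends on Z
        y = 2.0 * x + 3.0 * z + rng.gauss(0.0, 1.0)   # true effect of X is 2.0
        zs.append(z)
        xs.append(x)
        ys.append(y)
    return zs, xs, ys


zs, xs, ys = simulate(n=20000, seed=0)

# Unadjusted estimate: biased because Z drives both X and Y.
naive = slope(xs, ys)

# Adjusted estimate: remove Z from both X and Y, then regress the residuals
# (equivalent to the coefficient on X in a regression of Y on X and Z).
adjusted = slope(residuals(xs, zs), residuals(ys, zs))

print("naive:", naive, "adjusted:", adjusted)
```

The residual-on-residual regression is used here only to keep the example free of third-party dependencies; a multiple regression of Y on X and Z gives the same coefficient for X.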
Joint estimation of multiple related biological networks
Graphical models are widely used to make inferences concerning interplay in
multivariate systems. In many applications, data are collected from multiple
related but nonidentical units whose underlying networks may differ but are
likely to share features. Here we present a hierarchical Bayesian formulation
for joint estimation of multiple networks in this nonidentically distributed
setting. The approach is general: given a suitable class of graphical models,
it uses an exchangeability assumption on networks to provide a corresponding
joint formulation. Motivated by emerging experimental designs in molecular
biology, we focus on time-course data with interventions, using dynamic
Bayesian networks as the graphical models. We introduce a computationally
efficient, deterministic algorithm for exact joint inference in this setting.
We provide an upper bound on the gains that joint estimation offers relative to
separate estimation for each network and empirical results that support and
extend the theory, including an extensive simulation study and an application
to proteomic data from human cancer cell lines. Finally, we describe
approximations that are still more computationally efficient than the exact
algorithm and that also demonstrate good empirical performance.
Comment: Published at http://dx.doi.org/10.1214/14-AOAS761 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Non-Parametric Causality Detection: An Application to Social Media and Financial Data
According to behavioral finance, stock market returns are influenced by
emotional, social and psychological factors. Several recent works support this
theory by providing evidence of correlation between stock market prices and
collective sentiment indexes measured using social media data. However, a pure
correlation analysis is not sufficient to prove that stock market returns are
influenced by such emotional factors since both stock market prices and
collective sentiment may be driven by a third unmeasured factor. Controlling
for factors that could influence the study by applying multivariate regression
models is challenging given the complexity of stock market data. False
assumptions about the linearity or non-linearity of the model and inaccuracies
in model specification may result in misleading conclusions.
In this work, we propose a novel framework for causal inference that does not
require any assumption about the statistical relationships among the variables
of the study and can effectively control a large number of factors. We apply
our method in order to estimate the causal impact that information posted in
social media may have on stock market returns of four big companies. Our
results indicate that social media data not only correlate with stock market
returns but also influence them.
Comment: Physica A: Statistical Mechanics and its Applications 201