False Discovery Rate Controlled Heterogeneous Treatment Effect Detection for Online Controlled Experiments
Online controlled experiments (a.k.a. A/B testing) have become the
mantra for data-driven decision making on feature changes and product shipping
in many Internet companies. However, it is still a great challenge to
systematically measure how every code or feature change impacts millions of
users with great heterogeneity (e.g. countries, ages, devices). The most
commonly used A/B testing framework in many companies is based on Average
Treatment Effect (ATE), which cannot detect the heterogeneity of treatment
effect on users with different characteristics. In this paper, we propose
statistical methods that can systematically and accurately identify
Heterogeneous Treatment Effect (HTE) of any user cohort of interest (e.g.
mobile device type, country), and determine which factors (e.g. age, gender) of
users contribute to the heterogeneity of the treatment effect in an A/B test.
By applying these methods to both simulated data and real-world
experimentation data, we show that they work robustly with a controlled low False
Discovery Rate (FDR) and, at the same time, provide useful insights
about the heterogeneity of identified user groups. We have deployed a toolkit
based on these methods, and have used it to measure the Heterogeneous Treatment
Effect of many A/B tests at Snap.
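The abstract above does not say which procedure is used to keep the FDR under control. As an illustrative sketch only, assuming the Benjamini-Hochberg step-up procedure applied to one p-value per user cohort (the function name and the p-values are hypothetical and not taken from the paper), FDR control across cohorts could look like this:

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    by the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    # Indices of the hypotheses sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= alpha * k / m.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_rank = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical per-cohort p-values (e.g. one test per country or device type).
cohort_p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
flags = benjamini_hochberg(cohort_p_values, alpha=0.05)
print(flags)
```

With these made-up inputs, only the two smallest p-values fall below their step-up thresholds, so only those two cohorts would be flagged as showing a heterogeneous effect.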
Distributional Robustness of K-class Estimators and the PULSE
Recently, in causal discovery, invariance properties such as the moment
criterion that the two-stage least squares estimator leverages have been exploited
for causal structure learning: e.g., in cases where the causal parameter is
not identifiable, some structure of the non-zero components may be identified,
and coverage guarantees are available. Subsequently, anchor regression has been
proposed to trade off invariance and predictability. The resulting estimator is
shown to have optimal predictive performance under bounded shift interventions.
In this paper, we show that the concepts of anchor regression and K-class
estimators are closely related. Establishing this connection comes with two
benefits: (1) it enables us to prove robustness properties for existing K-class
estimators when considering distributional shifts, and (2) we propose a novel
estimator in instrumental variable settings by minimizing the mean squared
prediction error subject to the constraint that the estimator lies in an
asymptotically valid confidence region of the causal parameter. We call this
estimator PULSE (p-uncorrelated least squares estimator) and show that it can
be computed efficiently, even though the underlying optimization problem is
non-convex. We further prove that it is consistent. We perform simulation
experiments illustrating that there are several settings, including weak
instrument settings, where PULSE outperforms other estimators and suffers from
less variability. Comment: 85 pages, 15 figures
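For readers unfamiliar with K-class estimators, the following is a minimal sketch of the textbook definition in the simplest case: one regressor x, one instrument z, and no intercept. It is not the PULSE estimator and is not taken from the paper; the function names and data are hypothetical. The parameter kappa interpolates between ordinary least squares (kappa = 0) and two-stage least squares (kappa = 1).

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def k_class_scalar(x, y, z, kappa):
    """K-class estimate for y = beta * x + noise with a single instrument z.

    Scalar form of beta = (x'(I - kappa*M_z)x)^{-1} x'(I - kappa*M_z)y,
    where M_z = I - P_z and P_z is the projection onto z.
    """
    # x'P_z x and x'P_z y written out for a single instrument.
    x_proj_x = dot(x, z) ** 2 / dot(z, z)
    x_proj_y = dot(x, z) * dot(z, y) / dot(z, z)
    numerator = (1 - kappa) * dot(x, y) + kappa * x_proj_y
    denominator = (1 - kappa) * dot(x, x) + kappa * x_proj_x
    return numerator / denominator


# Hypothetical toy data.
z = [1.0, 2.0, 3.0, 4.0]
x = [1.5, 1.8, 3.2, 4.1]
y = [2.0, 2.5, 4.0, 5.5]

beta_ols = k_class_scalar(x, y, z, kappa=0.0)   # equals x'y / x'x
beta_tsls = k_class_scalar(x, y, z, kappa=1.0)  # equals z'y / z'x
beta_mid = k_class_scalar(x, y, z, kappa=0.5)   # an intermediate K-class estimate
print(beta_ols, beta_tsls, beta_mid)
```

Intermediate values of kappa trade off the two extremes, which is the family of estimators whose robustness properties the abstract refers to.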
Causal Relations via Econometrics
Applied econometric work takes a superficial approach to causality. Understanding economic affairs, making good policy decisions, and progress in the economic discipline depend on our ability to infer causal relations from data. We review the dominant approaches to causality in econometrics, and suggest why they fail to give good results. We feel the problem cannot be solved by traditional tools, and requires some out-of-the-box thinking. Potentially promising approaches to solutions are discussed. Keywords: causality, regression, Granger Causality, Exogeneity, Cowles Commission, Hendry Methodology, Natural Experiments
Causal Discovery with Continuous Additive Noise Models
We consider the problem of learning causal directed acyclic graphs from an
observational joint distribution. One can use these graphs to predict the
outcome of interventional experiments, from which data are often not available.
We show that if the observational distribution follows a structural equation
model with an additive noise structure, the directed acyclic graph becomes
identifiable from the distribution under mild conditions. This constitutes an
interesting alternative to traditional methods that assume faithfulness and
identify only the Markov equivalence class of the graph, thus leaving some
edges undirected. We provide practical algorithms for finitely many samples,
RESIT (Regression with Subsequent Independence Test) and two methods based on
an independence score. We prove that RESIT is correct in the population setting
and provide an empirical evaluation.
Causal Relations via Econometrics
Applied econometric work takes a superficial approach to causality. Understanding economic affairs, making good policy decisions, and progress in the economic discipline depend on our ability to infer causal relations from data. We review the dominant approaches to causality in econometrics, and suggest why they fail to give good results. We feel the problem cannot be solved by traditional tools, and requires some out-of-the-box thinking. Potentially promising approaches to solutions are discussed. Keywords: Causality, Regression, Exogeneity, Hendry Methodology, Natural Experiments
Discovering Causal Relations and Equations from Data
Physics is a field of science that has traditionally used the scientific
method to answer questions about why natural phenomena occur and to make
testable models that explain the phenomena. Discovering equations, laws and
principles that are invariant, robust and causal explanations of the world has
been fundamental in physical sciences throughout the centuries. Discoveries
emerge from observing the world and, when possible, performing interventional
studies in the system under study. With the advent of big data and the use of
data-driven methods, causal and equation discovery fields have grown and made
progress in computer science, physics, statistics, philosophy, and many applied
fields. All these domains are intertwined and can be used to discover causal
relations, physical laws, and equations from observational data. This paper
reviews the concepts, methods, and relevant works on causal and equation
discovery in the broad field of Physics and outlines the most important
challenges and promising future lines of research. We also provide a taxonomy
for observational causal and equation discovery, point out connections, and
showcase a complete set of case studies in Earth and climate sciences, fluid
dynamics and mechanics, and the neurosciences. This review demonstrates that
discovering fundamental laws and causal relations by observing natural
phenomena is being revolutionised with the efficient exploitation of
observational data, modern machine learning algorithms and the interaction with
domain knowledge. Exciting times are ahead with many challenges and
opportunities to improve our understanding of complex systems. Comment: 137 pages