The Augmented Synthetic Control Method
The synthetic control method (SCM) is a popular approach for estimating the
impact of a treatment on a single unit in panel data settings. The "synthetic
control" is a weighted average of control units that balances the treated
unit's pre-treatment outcomes as closely as possible. A critical feature of the
original proposal is to use SCM only when the fit on pre-treatment outcomes is
excellent. We propose Augmented SCM as an extension of SCM to settings where
such pre-treatment fit is infeasible. Analogous to bias correction for inexact
matching, Augmented SCM uses an outcome model to estimate the bias due to
imperfect pre-treatment fit and then de-biases the original SCM estimate. Our
main proposal, which uses ridge regression as the outcome model, directly
controls pre-treatment fit while minimizing extrapolation from the convex hull.
This estimator can also be expressed as a solution to a modified synthetic
controls problem that allows negative weights on some donor units. We bound the
estimation error of this approach under different data generating processes,
including a linear factor model, and show how regularization helps to avoid
over-fitting to noise. We demonstrate gains from Augmented SCM with extensive
simulation studies and apply this framework to estimate the impact of the 2012
Kansas tax cuts on economic growth. We implement the proposed method in the new
augsynth R package.
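To make the augmentation step concrete, here is a minimal numpy/scipy sketch of a ridge-augmented SCM estimate: classic simplex-constrained weights plus a ridge outcome-model correction for the residual pre-treatment imbalance. It illustrates the idea only and is not the augsynth implementation; all names (scm_weights, augmented_scm, Y0_pre, etc.) are invented, and intercepts, auxiliary covariates, and cross-validation of the penalty are omitted.

    # Illustrative sketch of ridge-augmented SCM (not the augsynth package).
    # Y0_pre:  (N0, T0) pre-treatment outcomes for the donor units
    # y0_post: (N0,) donor outcomes in the post-treatment period of interest
    # y1_pre:  (T0,) pre-treatment outcomes for the treated unit
    import numpy as np
    from scipy.optimize import minimize

    def scm_weights(Y0_pre, y1_pre):
        """Classic SCM: simplex-constrained least squares on pre-treatment fit."""
        n = Y0_pre.shape[0]
        loss = lambda w: np.sum((y1_pre - Y0_pre.T @ w) ** 2)
        res = minimize(loss, np.full(n, 1.0 / n), method="SLSQP",
                       bounds=[(0.0, 1.0)] * n,
                       constraints=({"type": "eq",
                                     "fun": lambda w: np.sum(w) - 1.0},))
        return res.x

    def augmented_scm(Y0_pre, y0_post, y1_pre, lam=1.0):
        """SCM counterfactual plus a ridge correction for imperfect pre-fit."""
        w = scm_weights(Y0_pre, y1_pre)
        T0 = Y0_pre.shape[1]
        # Ridge regression of donors' post-period outcome on pre-period outcomes
        eta = np.linalg.solve(Y0_pre.T @ Y0_pre + lam * np.eye(T0),
                              Y0_pre.T @ y0_post)
        # De-bias: add the outcome model's prediction of the remaining imbalance
        return w @ y0_post + (y1_pre - Y0_pre.T @ w) @ eta

The treatment-effect estimate is then the treated unit's observed post-period outcome minus this de-biased counterfactual; the correction term is what can implicitly place negative weight on some donors.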
Comments on the cybernetics of stability and regulation in social systems
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The methods and principles of cybernetics are applied to a discussion of stability and regulation in social systems, taking a global viewpoint. The fundamental but still classical notion of stability as applied to homeostatic and ultrastable systems is discussed, with particular reference to a specific well-studied example of a closed social group (the Tsembaga studied by Roy Rappaport in New Guinea).
The discussion extends to the problem of evolution in large systems and the question of regulating evolution is addressed without special qualifications. A more comprehensive idea of stability is introduced as the argument turns to the problem of evolution for viability in general.
Concepts pertaining to the problem of evolution are exemplified by a computer simulation model of an abstractly defined ecosystem in which various dynamic processes occur allowing the study of adaptive and evolutionary behaviour. In particular, the role of coalition formation and cooperative behaviour is stressed as a key factor in the evolution of complexity.
The model consists of a population of several species of dimensionless automata inhabiting a geometrically defined environment in which a commodity essential for metabolic requirements (food) appears. Automata can sense properties of their environment, move about it, compete for food, reproduce, or combine into coalitions, thus forming new and more complex species. Each species is associated with a specific genotype from which the species’ behavioural characteristics (its phenotype) are derived. Complexity and survival efficiency of species increase through coalition formation, an event which occurs when automata are faced with an “undecidable” situation that is resolvable only by forming a new and more complex organization.
Exogenous manipulation of the food distribution pattern and other critical factors produces different environmental conditions resulting in different behaviour patterns of automata and in different evolutionary “pathways.”
Eve-1, the computer program developed to implement this model, accepts a high-level command language which allows for the setting of parameters, definition of initial configurations, and control of output formats. Results of simulation are produced graphically and include various pertinent tables. The program was given a modular hierarchical structure which allows easy generation of new versions incorporating different sets of rules.
The model strives to capture the essence of the evolution of complexity viewed as a general process rather than to describe the evolution of a particular “real” system. In this respect it is not context-specific, and the behaviours observable in different runs can receive various interpretations depending on specific identifications. Of these, biological, ecological, and sociological interpretations are the most obvious, and the last, in particular, is stressed.
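The thesis describes Eve-1 only at this level of detail, so the following Python fragment is purely a hypothetical illustration of the genotype-to-phenotype mapping and the coalition-formation event sketched above; every name and rule in it is invented.

    # Hypothetical rendering of the automaton/coalition structure described
    # above; Eve-1's actual rules and representation are not given here.
    from dataclasses import dataclass

    @dataclass
    class Species:
        genotype: tuple  # fixed code from which behaviour is derived

        def phenotype(self):
            # Behavioural traits derived from the genotype (invented rule)
            return {"speed": self.genotype[0], "appetite": self.genotype[1]}

    @dataclass
    class Automaton:
        species: Species
        position: int
        energy: float = 1.0

    def form_coalition(a, b):
        """Resolve an 'undecidable' encounter by merging two automata into a
        new, more complex species (genotypes concatenated, energy pooled)."""
        merged = Species(a.species.genotype + b.species.genotype)
        return Automaton(merged, a.position, a.energy + b.energy)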
Using Balancing Weights to Target the Treatment Effect on the Treated when Overlap is Poor
Inverse probability weights are commonly used in epidemiology to estimate
causal effects in observational studies. Researchers can typically focus on
either the average treatment effect or the average treatment effect on the
treated with inverse probability weighting estimators. However, when overlap
between the treated and control groups is poor, this can produce extreme
weights that can result in biased estimates and large variances. One
alternative to inverse probability weights is overlap weights, which target
the population with the most overlap on observed characteristics. While
estimates based on overlap weights produce less bias in such contexts, the
causal estimand can be difficult to interpret. Another alternative is
balancing weights, which directly target imbalances
during the estimation process. Here, we explore whether balancing weights allow
analysts to target the average treatment effect on the treated in cases where
inverse probability weights are biased due to poor overlap. We conduct three
simulation studies and an empirical application. We find that in many cases,
balancing weights allow the analyst to still target the average treatment
effect on the treated even when overlap is poor. We show that while overlap
weights remain a key tool for estimating causal effects, more familiar
estimands can be targeted by using balancing weights instead of inverse
probability weights.
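As a rough illustration of the balancing-weights idea for the ATT, the sketch below solves for non-negative control-unit weights that directly minimize imbalance in covariate means against the treated group, with a small ridge penalty for stability. It is a generic sketch of this family of estimators under invented names, not the specific estimator studied in the paper.

    # Generic ATT balancing-weights sketch: weight controls so their weighted
    # covariate means match the treated means, penalizing unstable weights.
    import numpy as np
    from scipy.optimize import minimize

    def att_balancing_weights(X_control, X_treated, ridge=1e-3):
        n = X_control.shape[0]
        target = X_treated.mean(axis=0)           # treated covariate means
        def objective(w):
            imbalance = X_control.T @ w - target  # weighted control means - target
            return np.sum(imbalance ** 2) + ridge * np.sum(w ** 2)
        res = minimize(objective, np.full(n, 1.0 / n), method="SLSQP",
                       bounds=[(0.0, None)] * n,
                       constraints=({"type": "eq",
                                     "fun": lambda w: np.sum(w) - 1.0},))
        return res.x

    # ATT estimate: mean treated outcome minus the weighted control mean,
    # e.g. y_treated.mean() - att_balancing_weights(Xc, Xt) @ y_control

Because the weights are chosen to reduce imbalance directly rather than through an estimated propensity score, they remain well behaved even when fitted propensities approach 0 or 1.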
Locally Testable Codes and Cayley Graphs
We give two new characterizations of (\F_2-linear) locally testable
error-correcting codes in terms of Cayley graphs over \F_2^h:
\begin{enumerate}
\item A locally testable code is equivalent to a Cayley graph over \F_2^h
whose set of generators is significantly larger than h and has no short
linear dependencies, but yields a shortest-path metric that embeds into
\ell_1 with constant distortion. This extends and gives a converse to a
result of Khot and Naor (2006), which showed that codes with large dual
distance imply Cayley graphs that have no low-distortion embeddings into
\ell_1.
\item A locally testable code is equivalent to a Cayley graph over \F_2^h
that has significantly more than h eigenvalues near 1, which have no short
linear dependencies among them and which "explain" all of the large
eigenvalues. This extends and gives a converse to a recent construction of
Barak et al. (2012), which showed that locally testable codes imply Cayley
graphs that are small-set expanders but have many large eigenvalues.
\end{enumerate}
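For readers outside coding theory, one common formalization of local testability is the following (the paper's exact parameterization may differ): a code $C \subseteq \F_2^n$ is $(q,\epsilon)$-locally testable if there is a randomized tester that queries at most $q$ coordinates of a word $x \in \F_2^n$, accepts every $x \in C$ with probability 1, and rejects every $x \notin C$ with probability at least $\epsilon \cdot \delta(x,C)$, where $\delta(x,C)$ is the relative Hamming distance from $x$ to the nearest codeword.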
Using Multiple Outcomes to Improve the Synthetic Control Method
When there are multiple outcome series of interest, Synthetic Control
analyses typically proceed by estimating separate weights for each outcome. In
this paper, we instead propose estimating a common set of weights across
outcomes, by balancing either a vector of all outcomes or an index or average
of them. Under a low-rank factor model, we show that these approaches lead to
lower bias bounds than separate weights, and that averaging leads to further
gains when the number of outcomes grows. We illustrate this via simulation and
in a re-analysis of the impact of the Flint water crisis on educational
outcomes.
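The mechanical change relative to standard SCM is small: pre-treatment series from all outcomes are stacked (or averaged) into a single balance target before solving the usual weight problem. A schematic sketch with invented names, under the assumption that each series has been standardized so scales are comparable:

    # Common weights across K outcomes: build one balance target from all
    # series, then solve the usual SCM weight problem once. Illustrative only;
    # assumes each outcome series is standardized to a comparable scale.
    import numpy as np

    def stack_outcomes(Y0_list, y1_list, average=False):
        """Y0_list: K donor matrices of shape (N0, T0);
        y1_list: K treated vectors of shape (T0,)."""
        if average:
            # balance the across-outcome average of the pre-treatment series
            return np.mean(Y0_list, axis=0), np.mean(y1_list, axis=0)
        # balance the full vector of all outcomes (concatenated over time)
        return np.concatenate(Y0_list, axis=1), np.concatenate(y1_list)

    # The returned pair feeds any standard simplex-constrained SCM solver.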
Policy Learning with Asymmetric Counterfactual Utilities
Data-driven decision making plays an important role even in high-stakes
settings like medicine and public policy. Learning optimal policies from
observed data requires a careful formulation of the utility function whose
expected value is maximized across a population. Although researchers typically
use utilities that depend on observed outcomes alone, in many settings the
decision maker's utility function is more properly characterized by the joint
set of potential outcomes under all actions. For example, the Hippocratic
principle to "do no harm" implies that the cost of causing death to a patient
who would otherwise survive without treatment is greater than the cost of
forgoing life-saving treatment. We consider optimal policy learning with
asymmetric counterfactual utility functions of this form that consider the
joint set of potential outcomes. We show that asymmetric counterfactual
utilities lead to an unidentifiable expected utility function, and so we first
partially identify it. Drawing on statistical decision theory, we then derive
minimax decision rules by minimizing the maximum expected utility loss relative
to different alternative policies. We show that one can learn minimax loss
decision rules from observed data by solving intermediate classification
problems, and establish that the finite sample excess expected utility loss of
this procedure is bounded by the regret of these intermediate classifiers. We
apply this conceptual framework and methodology to the decision about whether
or not to use right heart catheterization for patients with possible pulmonary
hypertension.
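The unidentifiability arises because the utility depends on joint probabilities such as P(Y(1)=1, Y(0)=1), while data reveal only the marginals. A standard device for the partial-identification step is the Frechet-Hoeffding bounds; the snippet below illustrates that device in generic form and is not the paper's specific derivation.

    # Marginals identify P(Y(1)=1) and P(Y(0)=1); the joint is only bounded.
    def frechet_bounds(p1, p0):
        """Bounds on P(Y(1)=1, Y(0)=1) given the two marginals."""
        return max(0.0, p1 + p0 - 1.0), min(p1, p0)

    # Example: survival rates of 0.7 (treated) and 0.5 (untreated) put
    # P(survives either way) in [0.2, 0.5], hence the "harmed" probability
    # P(Y(1)=0, Y(0)=1) = p0 - joint lies in [0.0, 0.3].
    print(frechet_bounds(0.7, 0.5))  # (0.2, 0.5)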
Aggregation-fragmentation-diffusion model for trail dynamics
We investigate statistical properties of trails formed by a random process incorporating aggregation, fragmentation, and diffusion. In this stochastic process, which takes place in one spatial dimension, two neighboring trails may combine to form a larger one, and a single trail may split into two. In addition, trails move diffusively. The model is defined by two parameters that quantify the fragmentation rate and the fragment size. In the long-time limit, the system reaches a steady state, and our focus is the limiting distribution of trail weights. We find that the density of trail weights has a power-law tail, P(w) ~ w^{-γ}, for small weight w. We obtain the exponent γ analytically and find that it varies continuously with the two model parameters. The exponent γ can be positive or negative, so that in one range of parameters small-weight trails are abundant, while in the complementary range they are rare.
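To make the dynamics concrete, here is a minimal Monte Carlo sketch of one aggregation-fragmentation-diffusion update scheme on a periodic 1D lattice; the rates, the fragment rule, and all names are invented simplifications, not the paper's exact model.

    # Toy 1D aggregation-fragmentation-diffusion dynamics (illustrative only).
    import random

    def simulate(L=200, steps=100_000, frag_rate=0.1, frag_frac=0.5):
        w = [1.0] * L                              # trail weight at each site
        for _ in range(steps):
            i = random.randrange(L)
            j = (i + random.choice((-1, 1))) % L   # random neighbor
            if random.random() < frag_rate and w[i] > 0:
                # fragmentation: split off a fixed fraction onto a neighbor
                piece = frag_frac * w[i]
                w[i] -= piece
                w[j] += piece
            else:
                # diffusion + aggregation: hop and merge with the neighbor
                w[j] += w[i]
                w[i] = 0.0
        return w  # inspect the small-weight tail of the empirical distribution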
Bayesian Safe Policy Learning with Chance Constrained Optimization: Application to Military Security Assessment during the Vietnam War
Algorithmic and data-driven decisions and recommendations are commonly used
in high-stakes decision-making settings such as criminal justice, medicine, and
public policy. We investigate whether it would have been possible to improve a
security assessment algorithm employed during the Vietnam War, using outcomes
measured immediately after its introduction in late 1969. This empirical
application raises several methodological challenges that frequently arise in
high-stakes algorithmic decision-making. First, before implementing a new
algorithm, it is essential to characterize and control the risk of yielding
worse outcomes than the existing algorithm. Second, the existing algorithm is
deterministic, and learning a new algorithm requires transparent extrapolation.
Third, the existing algorithm involves discrete decision tables that are common
but difficult to optimize over.
To address these challenges, we introduce the Average Conditional Risk
(ACRisk), which first quantifies the risk that a new algorithmic policy leads
to worse outcomes for subgroups of individual units and then averages this over
the distribution of subgroups. We also propose a Bayesian policy learning
framework that maximizes the posterior expected value while controlling the
posterior expected ACRisk. This framework separates the estimation of
heterogeneous treatment effects from policy optimization, enabling flexible
estimation of effects and optimization over complex policy classes. We
characterize the resulting chance-constrained optimization problem as a
constrained linear programming problem. Our analysis shows that compared to the
actual algorithm used during the Vietnam War, the learned algorithm assesses
most regions as more secure and emphasizes economic and political factors over
military factors.
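The reduction to a constrained linear program can be seen in miniature below: a stochastic policy over subgroups maximizes posterior expected value subject to a cap on posterior expected risk. The inputs value, risk, and sizes stand in for quantities the paper derives from posterior draws of heterogeneous effects; the ACRisk construction itself is not reproduced here, and all names are illustrative.

    # Miniature "value under a risk cap" LP in the spirit of the paper's
    # chance-constrained formulation. All inputs and names are illustrative.
    import numpy as np
    from scipy.optimize import linprog

    def safe_policy(value, risk, sizes, risk_cap):
        value, risk, sizes = map(np.asarray, (value, risk, sizes))
        share = sizes / sizes.sum()
        # p[g] in [0, 1] = probability subgroup g switches to the new action;
        # linprog minimizes, so negate the expected-value objective.
        res = linprog(c=-(share * value),
                      A_ub=[share * risk], b_ub=[risk_cap],
                      bounds=[(0.0, 1.0)] * len(value))
        return res.x

    # Example: three subgroups, average risk capped at 5%.
    p = safe_policy(value=[0.20, 0.05, -0.10], risk=[0.10, 0.02, 0.60],
                    sizes=[100, 300, 50], risk_cap=0.05)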