
    Some Lessons From 50 Years of Multiarm Public Policy Experiments

    In this article, we explore the reasons why multiarm trials have been conducted and the design and analysis issues they involve. We point to three fundamental reasons for such designs: (1) Multiarm designs allow the estimation of “response surfaces”—that is, the variation in response to an intervention across a range of one or more continuous policy parameters. (2) Multiarm designs are an efficient way to test multiple policy approaches to the same social problem simultaneously, either to compare the effects of the different approaches or to estimate the effect of each separately. (3) Multiarm designs may allow for the estimation of the separate and combined effects of discrete program components. We illustrate each of these objectives with examples from the history of public policy experimentation over the past 50 years and discuss some design and analysis issues raised by each, including sample allocation, statistical power, multiple comparisons, and alignment of analysis with goals of the evaluation.
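    As a rough illustration of the statistical power and multiple-comparisons issues the abstract raises, the sketch below computes the minimum detectable effect (MDE) for each treatment arm of a multiarm trial compared against a shared control arm. The formula, the Bonferroni adjustment, and all sample sizes are standard textbook assumptions chosen for illustration; none are taken from the article.

```python
# Minimal sketch: per-arm minimum detectable effect (MDE) in a multiarm
# trial with a shared control arm, Bonferroni-adjusted for multiple
# comparisons. All parameter values are illustrative assumptions.
from scipy.stats import norm

def mde(n_treat, n_control, sigma=1.0, alpha=0.05, power=0.80, n_comparisons=1):
    """Two-sided MDE for a difference in means, in outcome units."""
    z_alpha = norm.ppf(1 - alpha / (2 * n_comparisons))  # adjusted critical value
    z_beta = norm.ppf(power)                             # quantile for target power
    se = sigma * (1 / n_treat + 1 / n_control) ** 0.5    # standard error of the difference
    return (z_alpha + z_beta) * se

# Three treatment arms sharing one control: correcting for three
# comparisons inflates each arm's MDE relative to a two-arm trial.
print(round(mde(1000, 1000), 3))                   # ~0.125 sd, one comparison
print(round(mde(1000, 1000, n_comparisons=3), 3))  # ~0.145 sd, three comparisons
```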

    Embedding a Proof-of-Concept Test in an At-Scale National Policy Experiment: Greater Policy Learning But at What Cost to Statistical Power? The Social Security Administration’s Benefit Offset National Demonstration (BOND)

    A randomized experiment that measures the impact of a social policy in a sample of the population reveals whether the policy will work on average with universal application. An experiment that includes only the subset of the population that volunteers for the intervention generates narrower “proof-of-concept” evidence of whether the policy can work for motivated individuals. Both forms of learning carry value, yet evaluations rarely combine the two designs. The U.S. Social Security Administration conducted an exception, the Benefit Offset National Demonstration (BOND). This article uses BOND to examine the statistical power implications and potential gains in policy learning—relative to costs—from combining volunteer and population-representative experiments. It finds that minimum detectable effects of volunteer experiments rise little when one adds a population-representative experiment, but those of a population-representative experiment double or quadruple with the addition of a volunteer experiment. </jats:p
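    The headline result, that the population-representative experiment's minimum detectable effect doubles or quadruples once sample is diverted to a volunteer experiment, follows from the usual 1/sqrt(n) scaling of detectable effects. The sketch below illustrates that scaling only; the sample sizes are invented and this is not BOND's actual power calculation.

```python
# Minimal sketch of the sample-size intuition: the MDE of a balanced
# two-arm experiment scales with 1/sqrt(n), so cutting the population-
# representative sample to a quarter doubles its MDE. Numbers are
# illustrative, not BOND's design parameters.
from scipy.stats import norm

def mde(n_per_arm, sigma=1.0, alpha=0.05, power=0.80):
    """Two-sided MDE for a balanced two-arm comparison of means."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z * sigma * (2 / n_per_arm) ** 0.5

full = mde(8000)       # entire sample in the population-representative stage
quartered = mde(2000)  # three quarters of the sample diverted to a volunteer stage
print(round(quartered / full, 2))  # -> 2.0: the MDE doubles
```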