Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique
Freedman [Adv. in Appl. Math. 40 (2008) 180-193; Ann. Appl. Stat. 2 (2008)
176-196] critiqued ordinary least squares regression adjustment of estimated
treatment effects in randomized experiments, using Neyman's model for
randomization inference. Contrary to conventional wisdom, he argued that
adjustment can lead to worsened asymptotic precision, invalid measures of
precision, and small-sample bias. This paper shows that in sufficiently large
samples, those problems are either minor or easily fixed. OLS adjustment cannot
hurt asymptotic precision when a full set of treatment-covariate interactions
is included. Asymptotically valid confidence intervals can be constructed with
the Huber-White sandwich standard error estimator. Checks on the asymptotic
approximations are illustrated with data from Angrist, Lang, and Oreopoulos's
[Am. Econ. J.: Appl. Econ. 1:1 (2009) 136--163] evaluation of strategies to
improve college students' achievement. The strongest reasons to support
Freedman's preference for unadjusted estimates are transparency and the dangers
of specification search.
Comment: Published at http://dx.doi.org/10.1214/12-AOAS583 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
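As a concrete illustration of the estimator this abstract describes, the sketch below fits OLS with a full set of treatment-covariate interactions and reports a Huber-White (sandwich) standard error; the simulated data, the coefficient values, and the particular HC2 variant are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch: OLS adjustment with treatment-covariate interactions and a
# sandwich standard error, on simulated data (not the paper's data or code).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=(n, 2))                       # baseline covariates
t = rng.binomial(1, 0.5, size=n)                  # randomized treatment
y = 1.0 + 0.5 * t + x @ np.array([0.8, -0.3]) + 0.4 * t * x[:, 0] + rng.normal(size=n)

xc = x - x.mean(axis=0)                           # centering makes the coefficient
design = sm.add_constant(                         # on t the average treatment effect
    np.column_stack([t, xc, t[:, None] * xc]))

fit = sm.OLS(y, design).fit(cov_type="HC2")       # Huber-White sandwich variance
print("adjusted ATE estimate:", fit.params[1])
print("robust standard error:", fit.bse[1])
```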
Analysis of repeated measures data
Data with repeated measures occur frequently in agricultural research. This paper is a brief overview of statistical methods for repeated measures data. Statistical analysis of repeated measures data requires special attention due to the correlation structure, which may render standard analysis of variance techniques invalid. For balanced data, multivariate analysis of variance methods can be employed and adjustments can be applied to univariate methods as a means of accounting for the correlation structure. But these analysis of variance methods do not apply readily to unbalanced data, and they overlook the regression on time. Regression curves for treatment groups can be obtained by fitting a curve to each experimental unit and then averaging the coefficients over the units. Treatment groups can be compared by applying univariate and multivariate methods to the group means of the coefficients. This approach does not require knowledge of the correlation structure of the repeated measures, and an approximate version of it can be applied with unbalanced data.
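A minimal sketch of the two-stage approach described above, under assumed simulated data and an assumed quadratic growth curve: fit a curve to each experimental unit, then compare the treatment groups on the per-unit coefficients.

```python
# Hedged sketch: per-unit curve fitting followed by a group comparison of the
# fitted coefficients; the data, curve form, and test are illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
times = np.arange(6, dtype=float)                 # repeated measurement occasions
units_per_group = 10

def unit_coeffs(y, t):
    """Least-squares coefficients of y ~ 1 + t + t^2 for a single unit."""
    X = np.column_stack([np.ones_like(t), t, t ** 2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

groups = {}
for name, slope in [("control", 0.5), ("treated", 1.0)]:
    coefs = [unit_coeffs(2.0 + slope * times + rng.normal(scale=0.5, size=times.size),
                         times)
             for _ in range(units_per_group)]
    groups[name] = np.array(coefs)

# Univariate comparison of the per-unit slope coefficients between groups.
t_stat, p_val = stats.ttest_ind(groups["treated"][:, 1], groups["control"][:, 1])
print(f"slope difference: t = {t_stat:.2f}, p = {p_val:.3f}")
```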
High-dimensional regression adjustments in randomized experiments
We study the problem of treatment effect estimation in randomized experiments
with high-dimensional covariate information, and show that essentially any
risk-consistent regression adjustment can be used to obtain efficient estimates
of the average treatment effect. Our results considerably extend the range of
settings where high-dimensional regression adjustments are guaranteed to
provide valid inference about the population average treatment effect. We then
propose cross-estimation, a simple method for obtaining finite-sample-unbiased
treatment effect estimates that leverages high-dimensional regression
adjustments. Our method can be used when the regression model is estimated
using the lasso, the elastic net, subset selection, etc. Finally, we extend our
analysis to allow for adaptive specification search via cross-validation, and
flexible non-parametric regression adjustments with machine learning methods
such as random forests or neural networks.
Comment: To appear in the Proceedings of the National Academy of Sciences. The present draft does not reflect final copyediting by the PNAS staff.
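The sketch below illustrates a cross-fitted, lasso-adjusted estimate of the average treatment effect in the spirit of this abstract; it uses an AIPW-style combination with the known randomization probability rather than the paper's exact cross-estimation formula, and all data are simulated.

```python
# Hedged sketch: cross-fitting with lasso outcome models; the AIPW-style form
# used here is not necessarily the paper's estimator.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(2)
n, p, prob = 400, 200, 0.5
X = rng.normal(size=(n, p))
T = rng.binomial(1, prob, size=n)
y = X[:, 0] - 0.5 * X[:, 1] + 1.0 * T + rng.normal(size=n)    # true ATE = 1

scores = np.empty(n)
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    mu1 = LassoCV(cv=5).fit(X[train][T[train] == 1], y[train][T[train] == 1])
    mu0 = LassoCV(cv=5).fit(X[train][T[train] == 0], y[train][T[train] == 0])
    m1, m0 = mu1.predict(X[test]), mu0.predict(X[test])
    scores[test] = (m1 - m0
                    + T[test] / prob * (y[test] - m1)
                    - (1 - T[test]) / (1 - prob) * (y[test] - m0))

print("cross-fitted ATE estimate:", scores.mean())
print("standard error:", scores.std(ddof=1) / np.sqrt(n))
```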
When Should You Adjust Standard Errors for Clustering?
In empirical work in economics it is common to report standard errors that
account for clustering of units. Typically, the motivation given for the
clustering adjustments is that unobserved components in outcomes for units
within clusters are correlated. However, because correlation may occur across
more than one dimension, this motivation makes it difficult to justify why
researchers use clustering in some dimensions, such as geographic, but not
others, such as age cohorts or gender. It also makes it difficult to explain
why one should not cluster with data from a randomized experiment. In this
paper, we argue that clustering is in essence a design problem, either a
sampling design or an experimental design issue. It is a sampling design issue if sampling follows a two-stage process in which, in the first stage, a subset of clusters is sampled randomly from a population of clusters, and, in the second stage, units are sampled randomly from the sampled clusters. In this
case the clustering adjustment is justified by the fact that there are clusters
in the population that we do not see in the sample. Clustering is an
experimental design issue if the assignment is correlated within the clusters.
We take the view that this second perspective best fits the typical setting in
economics where clustering adjustments are used. This perspective allows us to
shed new light on three questions: (i) when should one adjust the standard errors for clustering, (ii) when is the conventional adjustment for clustering appropriate, and (iii) when does the conventional adjustment of the standard errors matter?
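To make the conventional adjustment concrete, the sketch below compares heteroskedasticity-robust and cluster-robust standard errors for the same OLS fit; the simulated cluster-level assignment and within-cluster shock are illustrative assumptions.

```python
# Hedged sketch: with a cluster-level regressor and a shared within-cluster
# shock, the cluster-robust SE exceeds the heteroskedasticity-robust SE.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n_clusters, per_cluster = 50, 20
cluster = np.repeat(np.arange(n_clusters), per_cluster)
cluster_shock = rng.normal(size=n_clusters)[cluster]       # shared within-cluster component
treat = rng.binomial(1, 0.5, size=n_clusters)[cluster]     # assignment varies at cluster level
y = 1.0 + 0.5 * treat + cluster_shock + rng.normal(size=cluster.size)

X = sm.add_constant(treat)
robust = sm.OLS(y, X).fit(cov_type="HC1")
clustered = sm.OLS(y, X).fit(cov_type="cluster", cov_kwds={"groups": cluster})
print("heteroskedasticity-robust SE:", robust.bse[1])
print("cluster-robust SE:           ", clustered.bse[1])
```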
Guidelines for physical weed control research: flame weeding, weed harrowing and intra-row cultivation
A prerequisite for good research is the use of appropriate methodology. In order to promote sound research methodology, this paper presents some tentative guidelines for physical weed control research in general, and flame weeding, weed harrowing and intra-row cultivation in particular. Issues include the adjustment and use of mechanical weeders and other equipment, the recording of impact factors that affect weeding performance, methods to assess effectiveness, the layout of treatment plots, and the conceptual models underlying the experimental designs (e.g. factorial comparison, dose response).
First of all, the research aims need to be clearly defined, an appropriate experimental design produced and statistical methods chosen accordingly. Suggestions on how to do this are given. For assessments, quantitative measures would be ideal, but as they require more resources, visual classification may in some cases be more feasible. The timing of assessment affects the results and their interpretation.
When describing the weeds and crops, one should list the crops and the most abundant weed species present, giving their density and growth stages at the time of treatment. The location of the experimental field, soil type, soil moisture and amount of fertilization should be given, as well as weather conditions at the time of treatment.
The researcher should describe the weed control equipment and its adjustments accurately, preferably according to the prevailing practice within the discipline. Parameters to record include, for example, gas pressure, burner properties, burner cover dimensions and LPG consumption in flame weeding; and speed, angle of tines, number of passes and direction in weed harrowing.
The authors hope this paper will increase comparability among experiments, help less experienced scientists avoid mistakes and essential omissions, and foster the advance of knowledge on non-chemical weed management.
Regression adjustments for estimating the global treatment effect in experiments with interference
Standard estimators of the global average treatment effect can be biased in
the presence of interference. This paper proposes regression adjustment
estimators for removing bias due to interference in Bernoulli randomized
experiments. We use a fitted model to predict the counterfactual outcomes of
global control and global treatment. Our work differs from standard regression
adjustments in that the adjustment variables are constructed from functions of
the treatment assignment vector, and that we allow the researcher to use a
collection of any functions correlated with the response, turning the problem
of detecting interference into a feature engineering problem. We characterize
the distribution of the proposed estimator in a linear model setting and
connect the results to the standard theory of regression adjustments under
SUTVA. We then propose an estimator that allows for flexible machine learning
estimators to be used for fitting a nonlinear interference functional form. We
propose conducting statistical inference via bootstrap and resampling methods,
which allow us to sidestep the complicated dependences implied by interference
and instead rely on empirical covariance structures. Such variance estimation
relies on an exogeneity assumption akin to the standard unconfoundedness
assumption invoked in observational studies. In simulation experiments, our
methods are better at debiasing estimates than existing inverse propensity
weighted estimators based on neighborhood exposure modeling. We use our method
to reanalyze an experiment concerning weather insurance adoption conducted on a
collection of villages in rural China.
Comment: 38 pages, 7 figures.
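As a toy illustration of the adjustment idea described above: construct a covariate from the assignment vector (here, each unit's fraction of treated neighbors on an assumed ring network), fit an outcome model, and use it to predict the global-treatment and global-control counterfactuals. The network, the linear outcome model, and the single exposure feature are illustrative assumptions, not the paper's specification.

```python
# Hedged sketch: regression adjustment with an assignment-derived covariate
# (fraction of treated neighbors) on a toy ring network with simulated outcomes.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 500
A = np.zeros((n, n))
for i in range(n):                        # ring network: each unit has two neighbors
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1

z = rng.binomial(1, 0.5, size=n)          # Bernoulli randomized assignment
exposure = A @ z / A.sum(axis=1)          # fraction of treated neighbors
y = 1.0 + 0.8 * z + 0.5 * exposure + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(np.column_stack([z, exposure]))).fit()

def predicted_mean(z_all):
    """Predicted average outcome if every unit received assignment z_all."""
    expo = A @ z_all / A.sum(axis=1)
    X = sm.add_constant(np.column_stack([z_all, expo]), has_constant="add")
    return fit.predict(X).mean()

gte = predicted_mean(np.ones(n)) - predicted_mean(np.zeros(n))
print("adjusted global treatment effect estimate:", gte)
```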
Development and Validation of a Rule-based Time Series Complexity Scoring Technique to Support Design of Adaptive Forecasting DSS
Evidence from forecasting research gives reason to believe that understanding time series complexity can enable design of adaptive forecasting decision support systems (FDSSs) to positively support forecasting behaviors and accuracy of outcomes. Yet, such FDSS design capabilities have not been formally explored because there exists no systematic approach to identifying series complexity. This study describes the development and validation of a rule-based complexity scoring technique (CST) that generates a complexity score for time series using 12 rules that rely on 14 features of series. The rule-based schema was developed on 74 series and validated on 52 holdback series using well-accepted forecasting methods as benchmarks. A supporting experimental validation was conducted with 14 participants who generated 336 structured judgmental forecasts for sets of series classified as simple or complex by the CST. Benchmark comparisons validated the CST by confirming, as hypothesized, that forecasting accuracy was lower for series scored by the technique as complex when compared to the accuracy of those scored as simple. The study concludes with a comprehensive framework for design of FDSS that can integrate the CST to adaptively support forecasters under varied conditions of series complexity. The framework is founded on the concepts of restrictiveness and guidance and offers specific recommendations on how these elements can be built into FDSS to support complexity.
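Purely to illustrate what a rule-based scoring schema of this kind looks like, the sketch below adds a point for each rule a series triggers; the two rules and their thresholds are hypothetical stand-ins, not the 12 rules or 14 features of the CST.

```python
# Hedged sketch: a toy rule-based complexity score with two invented rules;
# the real CST uses 12 rules over 14 series features.
import numpy as np

def complexity_score(y: np.ndarray) -> int:
    score = 0
    cv = y.std(ddof=1) / abs(y.mean())                 # coefficient of variation
    if cv > 0.3:                                       # rule 1: high relative volatility
        score += 1
    t = np.arange(y.size)
    slope, intercept = np.polyfit(t, y, 1)
    detrended = y - (intercept + slope * t)
    if detrended.std(ddof=1) > 0.5 * y.std(ddof=1):    # rule 2: weak trend signal
        score += 1
    return score

rng = np.random.default_rng(5)
trended = 100 + 2 * np.arange(24) + rng.normal(scale=1, size=24)
noisy = 100 + rng.normal(scale=40, size=24)
print("trended series score:", complexity_score(trended))
print("noisy series score:  ", complexity_score(noisy))
```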
Effective forecasting for supply-chain planning: an empirical evaluation and strategies for improvement
Demand forecasting is a crucial aspect of the planning process in supply-chain companies. The most common approach to forecasting demand in these companies involves the use of a simple univariate statistical method to produce a forecast and the subsequent judgmental adjustment of this forecast by the company's demand planners to take into account market intelligence relating to any exceptional circumstances expected over the planning horizon. Based on four company case studies, which included collecting more than 12,000 forecasts and outcomes, this paper examines: i) the extent to which the judgmental adjustments led to improvements in accuracy, ii) the extent to which the adjustments were biased and inefficient, iii) the circumstances where adjustments were detrimental or beneficial, and iv) methods that could lead to greater levels of accuracy. It was found that the judgmentally adjusted forecasts were both biased and inefficient. In particular, market intelligence that was expected to have a positive impact on demand was used far less effectively than intelligence suggesting a negative impact. The paper goes on to propose a set of improvements that could be applied to the forecasting processes in the companies and to the forecasting software that is used in these processes.
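As a minimal sketch of the kind of comparison described above: accuracy of the statistical forecast versus the judgmentally adjusted forecast, split by the direction of the adjustment. The toy demand series, adjustments, and the MAPE metric are illustrative assumptions rather than the companies' data.

```python
# Hedged sketch: compare forecast accuracy before and after judgmental
# adjustment, separately for positive and negative adjustments (toy data).
import numpy as np

rng = np.random.default_rng(6)
n = 200
actual = rng.gamma(shape=5.0, scale=20.0, size=n)             # realized demand
statistical = actual * rng.normal(1.0, 0.15, size=n)          # system forecast
adjustment = rng.normal(0.0, 10.0, size=n)                    # planner adjustment
adjusted = statistical + adjustment

def mape(forecast, outcome):
    return np.mean(np.abs(forecast - outcome) / outcome) * 100

for label, mask in [("positive adjustments", adjustment > 0),
                    ("negative adjustments", adjustment < 0)]:
    print(f"{label}: statistical MAPE {mape(statistical[mask], actual[mask]):.1f}%, "
          f"adjusted MAPE {mape(adjusted[mask], actual[mask]):.1f}%")
```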