A central limit theorem for the Benjamini-Hochberg false discovery proportion under a factor model
The Benjamini-Hochberg (BH) procedure remains widely popular despite having
limited theoretical guarantees in the commonly encountered scenario of
correlated test statistics. Of particular concern is the possibility that the
method could exhibit bursty behavior, meaning that it might typically yield no
false discoveries while occasionally yielding both a large number of false
discoveries and a false discovery proportion (FDP) that far exceeds its own
well controlled mean. In this paper, we investigate which test statistic
correlation structures lead to bursty behavior and which ones lead to well
controlled FDPs. To this end, we develop a central limit theorem for the FDP in
a multiple testing setup where the test statistic correlations can be either
short-range or long-range as well as either weak or strong. The theorem and our
simulations from a data-driven factor model suggest that the BH procedure
exhibits severe burstiness when the test statistics have many strong,
long-range correlations, but does not otherwise.
Comment: Main changes in version 2: i) restated Corollary 1 in a way that is clearer and easier to use, ii) removed a regularity condition for our theorems (in particular, we removed Condition 2 from version 1), and iii) added a couple of remarks (namely, Remarks 1 and 6 in version 2). Throughout the text we also fixed typos, improved clarity, and added some additional commentary and references.
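As a rough illustration of the burstiness phenomenon described above, the following minimal Python sketch (not taken from the paper; the one-factor correlation structure, effect size, BH level, and all variable names are illustrative assumptions) simulates z-statistics sharing a single strong common factor and tracks the BH false discovery proportion across repetitions:

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def bh_fdp(pvals, null_mask, q=0.1):
    # Benjamini-Hochberg step-up at level q; returns the false discovery proportion.
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= q * np.arange(1, m + 1) / m
    k = np.nonzero(passed)[0].max() + 1 if passed.any() else 0
    rejected = order[:k]
    return null_mask[rejected].mean() if k > 0 else 0.0

m, n_signal, rho = 1000, 50, 0.8          # hypothetical sizes and factor strength
mu = np.concatenate([np.full(n_signal, 3.0), np.zeros(m - n_signal)])
null_mask = (mu == 0.0)

fdps = []
for _ in range(2000):
    f = rng.standard_normal()             # one shared factor: strong, long-range correlation
    z = mu + np.sqrt(rho) * f + np.sqrt(1.0 - rho) * rng.standard_normal(m)
    pvals = 2.0 * norm.sf(np.abs(z))      # two-sided p-values
    fdps.append(bh_fdp(pvals, null_mask))

fdps = np.asarray(fdps)
print(f"mean FDP = {fdps.mean():.3f}; P(FDP > 0.5) = {(fdps > 0.5).mean():.3f}")

With a single shared factor, every pair of null statistics is strongly correlated (a strong, long-range regime), so the FDP distribution is heavy-tailed: most repetitions yield few false discoveries, while occasional draws of the factor produce an FDP far above its mean.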
Tie-breaker designs provide more efficient kernel estimates than regression discontinuity designs
Tie-breaker experimental designs are hybrids of Randomized Controlled Trials
(RCTs) and Regression Discontinuity Designs (RDDs) in which subjects with
moderate scores are placed in an RCT while subjects with extreme scores are
deterministically assigned to the treatment or control group. The tie-breaker
design (TBD) has practical advantages over the RCT in settings where it is
unfair or uneconomical to deny the treatment to the most deserving recipients.
Meanwhile, the TBD has statistical benefits over the RDD due to its randomization.
In this paper we discuss and quantify the statistical benefits of the TBD
compared to the RDD. If the goal is estimation of the average treatment effect
or of the treatment effect at more than one score value, the statistical
benefits of using a TBD over an RDD are apparent. If the goal is estimation of
the average treatment effect at a single score value, which is typically done
by fitting local linear regressions, an RDD needs about 2.8 times more subjects
to achieve the same asymptotic mean squared error. We further demonstrate,
using both theoretical results and simulations based on the Angrist and Lavy
(1999) class size dataset, that larger experimental radii for the TBD lead to
greater statistical efficiency.
Comment: This version is quite different from version 1. We have added an analysis in which the bandwidth shrinks with the sample size, as well as a discussion of other statistical advantages of a TBD compared to an RDD.
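The following Python sketch (not the paper's analysis; it uses a global linear outcome model rather than local linear regression, and the sample sizes, cutoff, randomization band delta, and coefficients are all made-up) illustrates why randomizing mid-range scores improves efficiency: the treatment indicator is far less collinear with the running score under a TBD than under an RDD:

import numpy as np

rng = np.random.default_rng(1)

def tau_variance(design, n=1000, delta=0.5, tau=1.0, reps=2000):
    # Monte Carlo variance of the OLS treatment-effect estimate under a design.
    estimates = []
    for _ in range(reps):
        x = rng.uniform(-1.0, 1.0, n)                    # running score
        if design == "RDD":
            z = (x >= 0).astype(float)                   # deterministic cutoff at 0
        else:                                            # TBD: randomize the middle band
            z = np.where(np.abs(x) > delta,
                         (x > 0).astype(float),          # extreme scores assigned by score
                         rng.integers(0, 2, n).astype(float))
        y = 1.0 + 2.0 * x + tau * z + rng.standard_normal(n)
        X = np.column_stack([np.ones(n), x, z])
        estimates.append(np.linalg.lstsq(X, y, rcond=None)[0][2])
    return np.var(estimates)

v_rdd, v_tbd = tau_variance("RDD"), tau_variance("TBD")
print(f"var(tau_hat): RDD = {v_rdd:.4f}, TBD = {v_tbd:.4f}, ratio = {v_rdd / v_tbd:.2f}")

The printed ratio depends on this illustrative model and will not reproduce the paper's 2.8 constant, which is specific to local linear regression at a single score value; the sketch only shows the direction of the efficiency gain.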
The UN War Crimes Commission and International Law: Revisiting World War II Precedents and Practice
The history of international legal institutions has largely ignored the early activities of the United Nations, specifically those of the UN War Crimes Commission (UNWCC). Based on an assessment of the commission's work and on newly accessible archival evidence, this article argues that contemporary international legal institutional design could benefit significantly from revisiting the commission's achievements, particularly the principle of complementarity identified in the Rome Statute of the International Criminal Court and its support for domestic tribunals for war crimes and crimes against humanity. The article begins by examining the history, multilateral basis, and practical activities of the commission. It then assesses the commission's contemporary relevance. Finally, it analyses, with reference to the modern literature on complementarity, the degree to which the commission's wartime model provides positive examples of implementing the principle that could be replicated today, with particular reference to domestic capacity-building and international coordination.
Comparing Difficulty Sequence of KeyMath-Revised Test Items for the Norming Sample and for Students Referred for Learning Disability Assessment
Applied Behavioral Studies
Biases in estimates of air pollution impacts: the role of omitted variables and measurement errors
Observational studies often use linear regression to assess the effect of
ambient air pollution on outcomes of interest, such as human health outcomes or
crop yields. Yet pollution datasets are typically noisy and include only a
subset of potentially relevant pollutants, giving rise to both measurement
error bias (MEB) and omitted variable bias (OVB). While it is well understood
that these biases exist, less is understood about whether these biases tend to
be positive or negative, even though it is sometimes falsely claimed that
measurement error simply biases regression coefficient estimates towards zero.
In this paper, we show that more can be said about the direction of these
biases under the realistic assumptions that the concentrations of different
types of air pollutants are positively correlated with each other and that each
type of pollutant has a nonpositive association with the outcome variable. In
particular, we demonstrate both theoretically and using simulations that under
these two assumptions, the OVB will typically be negative and that more often
than not the MEB for null pollutants or for pollutants that are perfectly
measured will be negative. We also provide precise conditions, which are
consistent with the assumptions, under which we prove that the biases are
guaranteed to be negative. While the discussion in this paper is motivated by
studies assessing the effect of air pollutants on crop yields, the findings are
also relevant to regression-based studies assessing the effect of air
pollutants on human health outcomes.
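A small simulation along the lines sketched in the abstract (the correlation of 0.6, the effect sizes, and the variable names are illustrative assumptions, not values from the paper) shows both bias directions in Python:

import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Two positively correlated pollutants, each with a nonpositive effect on the outcome.
cov = np.array([[1.0, 0.6], [0.6, 1.0]])
p1, p2 = rng.multivariate_normal([0.0, 0.0], cov, n).T
y = -1.0 * p1 - 0.5 * p2 + rng.standard_normal(n)

def slope(x, y):
    # Simple-regression slope of y on x (intercept handled by centering).
    xc = x - x.mean()
    return xc @ (y - y.mean()) / (xc @ xc)

# Omitted variable bias: regress y on p1 alone, omitting the correlated p2.
b1_hat = slope(p1, y)
print(f"true effect of p1 = -1.0, estimate omitting p2 = {b1_hat:.3f} (negative bias)")

# Measurement error bias for a null pollutant: here p2 has no true effect,
# but p1 is observed with classical measurement error, so p2 proxies for it.
y_null = -1.0 * p1 + rng.standard_normal(n)
p1_noisy = p1 + rng.standard_normal(n)
X = np.column_stack([np.ones(n), p1_noisy, p2])
b_null = np.linalg.lstsq(X, y_null, rcond=None)[0][2]
print(f"true effect of p2 = 0.0, estimate = {b_null:.3f} (biased below zero, not toward it)")

Omitting a positively correlated, harmful pollutant pushes the included pollutant's coefficient further below its true value, and a perfectly measured null pollutant absorbs part of the effect of a noisily measured harmful one, so its estimate is biased below zero rather than attenuated toward it.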
Addressing contingency in algorithmic (mis)information classification: Toward a responsible machine learning agenda
Machine learning (ML) enabled classification models are becoming increasingly
popular for tackling the sheer volume and speed of online misinformation and
other content that could be identified as harmful. In building these models,
data scientists need to take a stance on the legitimacy, authoritativeness and
objectivity of the sources of "truth" used for model training and testing.
This has political, ethical and epistemic implications which are rarely
addressed in technical papers. Despite (and due to) their reported high
accuracy and performance, ML-driven moderation systems have the potential to
shape online public debate and create downstream negative impacts such as undue
censorship and the reinforcing of false beliefs. Using collaborative
ethnography and theoretical insights from social studies of science and
expertise, we offer a critical analysis of the process of building ML models
for (mis)information classification: we identify a series of algorithmic
contingencies--key moments during model development that could lead to
different future outcomes, uncertainty and harmful effects as these tools are
deployed by social media platforms. We conclude by offering a tentative path
toward reflexive and responsible development of ML tools for moderating
misinformation and other harmful content online.
Comment: Andrés Domínguez Hernández, Richard Owen, Dan Saattrup Nielsen and Ryan McConville. 2023. Addressing contingency in algorithmic (mis)information classification: Toward a responsible machine learning agenda. Accepted at the 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT '23), June 12-15, 2023, Chicago, United States of America. ACM, New York, NY, USA, 16 pages.