
### On the history and use of some standard statistical models

This paper tries to tell the story of the general linear model, which saw the
light of day 200 years ago, and the assumptions underlying it. We distinguish
three principal stages (ignoring earlier more isolated instances). The model
was first proposed in the context of astronomical and geodesic observations,
where the main source of variation was observational error. This was the main
use of the model during the 19th century. In the 1920s it was developed in a
new direction by R.A. Fisher, whose principal applications were in agriculture
and biology. Finally, beginning in the 1930s and 1940s, it became an important
tool for the social sciences. As new areas of applications were added, the
assumptions underlying the model tended to become more questionable, and the
resulting statistical techniques more prone to misuse.

Comment: Published at http://dx.doi.org/10.1214/193940307000000419 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org

### Generalizations of the Familywise Error Rate

Consider the problem of simultaneously testing null hypotheses $H_1,\ldots,H_s$.
The usual approach to dealing with the multiplicity problem is to restrict
attention to procedures that control the familywise error rate (FWER), the
probability of even one false rejection. In many applications, particularly if
s is large, one might be willing to tolerate more than one false rejection
provided the number of such cases is controlled, thereby increasing the ability
of the procedure to detect false null hypotheses. This suggests replacing
control of the FWER by controlling the probability of k or more false
rejections, which we call the k-FWER. We derive both single-step and stepdown
procedures that control the k-FWER, without making any assumptions concerning
the dependence structure of the p-values of the individual tests. In
particular, we derive a stepdown procedure that is quite simple to apply, and
prove that it cannot be improved without violation of control of the k-FWER. We
also consider the false discovery proportion (FDP) defined by the number of
false rejections divided by the total number of rejections (defined to be 0 if
there are no rejections). The false discovery rate proposed by Benjamini and
Hochberg [J. Roy. Statist. Soc. Ser. B 57 (1995) 289-300] controls E(FDP).
Here, we construct methods such that, for any $\gamma$ and $\alpha$,
$P\{\mathrm{FDP}>\gamma\}\le\alpha$. Two stepdown methods are proposed. The first holds
under mild conditions on the dependence structure of p-values, while the second
is more conservative but holds without any dependence assumptions.

Comment: Published at http://dx.doi.org/10.1214/009053605000000084 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
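
The two quantities defined in this abstract can be illustrated with a minimal Python sketch. The FDP function follows the definition given above (false rejections divided by total rejections, 0 if nothing is rejected). The abstract does not state the critical values of the stepdown procedure, so the constants used here ($k\alpha/s$ for the first $k$ steps, then $k\alpha/(s+k-i)$ at step $i$) are an assumption, taken to be the generalized Holm constants; with $k=1$ they reduce to Holm's procedure. The function names are hypothetical.

```python
from typing import List, Sequence


def false_discovery_proportion(rejected: Sequence[bool],
                               null_is_true: Sequence[bool]) -> float:
    """FDP as defined in the abstract: false rejections / total rejections,
    defined to be 0 when there are no rejections."""
    total = sum(rejected)
    if total == 0:
        return 0.0
    false_rejections = sum(r and t for r, t in zip(rejected, null_is_true))
    return false_rejections / total


def k_fwer_stepdown(p_values: Sequence[float], k: int, alpha: float) -> List[bool]:
    """Stepdown test with ASSUMED generalized Holm critical values:
    k*alpha/s for steps i <= k, and k*alpha/(s + k - i) for steps i > k."""
    s = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(s), key=lambda i: p_values[i])
    rejected = [False] * s
    for step, idx in enumerate(order, start=1):
        if step <= k:
            critical = k * alpha / s
        else:
            critical = k * alpha / (s + k - step)
        if p_values[idx] > critical:
            break  # stepdown: stop at the first p-value above its critical value
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p = [0.001, 0.02, 0.03, 0.5]
    print(k_fwer_stepdown(p, k=1, alpha=0.05))  # Holm's procedure
    print(k_fwer_stepdown(p, k=2, alpha=0.05))  # tolerates one false rejection
```

In this toy input, allowing $k=2$ rejects more hypotheses than $k=1$, which is the gain in detection ability the abstract describes.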

### Fluid thrust control system

A pure fluid thrust control system is described for a pump-fed, regeneratively cooled liquid propellant rocket engine. A proportional fluid amplifier and a bistable fluid amplifier control overshoot in the starting of the engine and take it to a predetermined thrust. An ejector type pump is provided in the line between the liquid hydrogen rocket nozzle heat exchanger and the turbine driving the fuel pump to aid in bringing the fluid at this point back into the regular system when it is not bypassed. The thrust control system is intended to function in environments too severe for mechanical controls

### On Optimality of Stepdown and Stepup Multiple Test Procedures

Consider the multiple testing problem of testing k null hypotheses, where the
unknown family of distributions is assumed to satisfy a certain monotonicity
assumption. Attention is restricted to procedures that control the familywise
error rate in the strong sense and which satisfy a monotonicity condition.
Under these assumptions, we prove certain maximin optimality results for some
well-known stepdown and stepup procedures.

Comment: Published at http://dx.doi.org/10.1214/009053605000000066 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org

### A simple minimax estimator for quantum states

Quantum tomography requires repeated measurements of many copies of the
physical system, all prepared by a source in the unknown state. In the limit of
very many copies measured, the often-used maximum-likelihood (ML) method for
converting the gathered data into an estimate of the state works very well. For
smaller data sets, however, it often suffers from problems of rank deficiency
in the estimated state. For many systems of relevance for quantum information
processing, the preparation of a very large number of copies of the same
quantum state is still a technological challenge, which motivates us to look
for estimation strategies that perform well even when there is not much data.
In this article, we review the concept of minimax state estimation, and use
minimax ideas to construct a simple estimator for quantum states. We
demonstrate that, for the case of tomography of a single qubit, our estimator
significantly outperforms the ML estimator when only a small number of copies of the
state is measured. Our estimator is always full-rank and, furthermore, has a
natural dependence on the number of copies measured, which is missing in the ML
estimator.

Comment: 26 pages, 3 figures. v2 contains minor improvements to the text, and
an additional appendix on symmetric measurement

### Gibrat's law for cities: uniformly most powerful unbiased test of the Pareto against the lognormal

We address the general problem of testing a power law distribution versus a
log-normal distribution in statistical data. This general problem is
illustrated on the distribution of the 2000 US census of city sizes. We provide
definitive results to close the debate between Eeckhout (2004, 2009) and Levy
(2009) on the validity of Zipf's law, which is the special Pareto law with tail
exponent 1, to describe the tail of the distribution of U.S. city sizes.
Because the origin of the disagreement between Eeckhout and Levy stems from the
limited power of their tests, we perform the *uniformly most powerful
unbiased test* for the null hypothesis of the Pareto distribution against the
lognormal. The $p$-value and Hill's estimator as a function of city size lower
threshold confirm indubitably that the size distribution of the 1000 largest
cities or so, which include more than half of the total U.S. population, is
Pareto, but we rule out that the tail exponent, estimated to be $1.4 \pm 0.1$,
is equal to 1. For larger ranks, the $p$-value becomes very small and Hill's
estimator decays systematically with decreasing ranks, qualifying the lognormal
distribution as the better model for the set of smaller cities. These two
results reconcile the opposite views of Eeckhout (2004, 2009) and Levy (2009).
We explain how Gibrat's law of proportional growth underpins both the Pareto
and lognormal distributions and stress the key ingredient at the origin of
their difference in standard stochastic growth models of cities
\cite{Gabaix99,Eeckhout2004}.

Comment: 7 pages + 2 figure
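
Hill's estimator as a function of the lower threshold, mentioned in this abstract, can be sketched as follows. This is the textbook form of the estimator, not code from the paper: for the `n_tail` largest observations, with the next-largest value as threshold, the tail exponent estimate is the reciprocal of the mean log-excess over that threshold. The function name, the parameter `n_tail`, and the synthetic data are assumptions made for illustration.

```python
import math
import random
from typing import Sequence


def hill_tail_exponent(sizes: Sequence[float], n_tail: int) -> float:
    """Textbook Hill estimate of the Pareto tail exponent from the
    n_tail largest values, using the (n_tail + 1)-th largest as threshold."""
    ordered = sorted(sizes, reverse=True)
    if not 0 < n_tail < len(ordered):
        raise ValueError("n_tail must satisfy 0 < n_tail < len(sizes)")
    threshold = ordered[n_tail]
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:n_tail]) / n_tail
    return 1.0 / mean_log_excess


if __name__ == "__main__":
    # Synthetic Pareto sample (NOT census data), used only to show the sweep
    # over the lower threshold that the abstract refers to.
    rng = random.Random(0)
    sample = [rng.paretovariate(1.4) for _ in range(5000)]
    for n_tail in (100, 500, 1000, 2500):
        print(n_tail, round(hill_tail_exponent(sample, n_tail), 3))
```

For data with an exact Pareto tail the estimate should stay roughly flat as `n_tail` varies; a systematic drift as smaller observations are included is the kind of pattern the abstract attributes to the lognormal body of the distribution.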
