
### A Question of Empowerment: Information Technology and Civic Engagement in New Haven, Connecticut

Extravagant claims have been made for the capacity of IT (information technology) to empower citizens and to enhance the capacity of civic organizations. This study of IT use by organizations and agencies in New Haven, Connecticut, 1998-2004, tests these claims, finding that the use of IT by nonprofits is selective, tending to serve agencies patronized by community elites rather than populations in need. In addition, the study finds that single-interest groups are far more effective in using IT than more diverse civic and neighborhood groups.

This publication is Hauser Center Working Paper No. 30. The Hauser Center Working Paper Series, launched during the summer of 2000, enables the Hauser Center to share with a broad audience important works-in-progress written by Hauser Center scholars and researchers.

### Strong approximations of level exceedences related to multiple hypothesis testing

Particularly in genomics, but also in other fields, it has become commonplace
to undertake highly multiple Student's $t$-tests based on relatively small
sample sizes. The literature on this topic is continually expanding, but the
main approaches used to control the family-wise error rate and false discovery
rate are still based on the assumption that the tests are independent. The
independence condition is known to be false at the level of the joint
distributions of the test statistics, but that does not necessarily mean, for
the small significance levels involved in highly multiple hypothesis testing,
that the assumption leads to major errors. In this paper, we give conditions
under which the assumption of independence is valid. Specifically, we derive a
strong approximation that closely links the level exceedences of a dependent
``studentized process'' to those of a process of independent random variables.
Via this connection, it can be seen that in high-dimensional, low sample-size
cases, provided the sample size diverges faster than the logarithm of the
number of tests, the assumption of independent $t$-tests is often justified.Comment: Published in at http://dx.doi.org/10.3150/09-BEJ220 the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
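The regime described in this abstract can be sketched with a small simulation. Everything below (the one-sample design, the sample sizes, and Bonferroni as the independence-based family-wise correction) is an illustrative choice, not the paper's construction:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
m, n = 5000, 50          # number of tests, small sample size per test
alpha = 0.05

# Simulate m small-sample datasets, all under the null (mean zero).
data = rng.normal(size=(m, n))

# One-sample Student's t-statistic and two-sided p-value for each test.
t = data.mean(axis=1) / (data.std(axis=1, ddof=1) / np.sqrt(n))
p = 2 * stats.t.sf(np.abs(t), df=n - 1)

# Family-wise error control via Bonferroni, which is exact only under
# independence; the paper's result suggests such corrections remain
# approximately valid when n diverges faster than log m.
rejected = p < alpha / m
print(rejected.sum())
```

With all hypotheses null, the number of Bonferroni rejections should almost always be zero at this level.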

### Modeling the variability of rankings

For better or for worse, rankings of institutions, such as universities,
schools and hospitals, play an important role today in conveying information
about relative performance. They inform policy decisions and budgets, and are
often reported in the media. While overall rankings can vary markedly over
relatively short time periods, it is not unusual to find that the ranks of a
small number of "highly performing" institutions remain fixed, even when the
data on which the rankings are based are extensively revised, and even when a
large number of new institutions are added to the competition. In the present
paper, we endeavor to model this phenomenon. In particular, we interpret as a
random variable the value of the attribute on which the ranking should ideally
be based. More precisely, if $p$ items are to be ranked then the true, but
unobserved, attributes are taken to be values of $p$ independent and
identically distributed variates. However, each attribute value is observed
only with noise, and via a sample of size roughly equal to $n$, say. These
noisy approximations to the true attributes are the quantities that are
actually ranked. We show that, if the distribution of the true attributes is
light-tailed (e.g., normal or exponential) then the number of institutions
whose ranking is correct, even after recalculation using new data and even
after many new institutions are added, is essentially fixed. Formally, $p$ is
taken to be of order $n^C$ for any fixed $C>0$, and the number of institutions
whose ranking is reliable depends very little on $p$.

Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/10-AOS794.
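The model in this abstract is easy to mimic numerically: draw $p$ i.i.d. light-tailed attributes, observe each with noise of size roughly $n^{-1/2}$, and compare the rankings produced by two independent noisy observations. All concrete values below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 200, 100   # items to rank, effective sample size per item

# True attributes: i.i.d. light-tailed (here standard normal) values.
true_attr = rng.normal(size=p)

# Two independent noisy observations of each attribute, with noise of
# order n^{-1/2}, mimicking a recalculation of the ranking from revised data.
obs1 = true_attr + rng.normal(scale=1 / np.sqrt(n), size=p)
obs2 = true_attr + rng.normal(scale=1 / np.sqrt(n), size=p)

rank1 = np.argsort(-obs1)   # item indices ordered best-to-worst
rank2 = np.argsort(-obs2)

# Length of the agreeing prefix: how far down the two rankings coincide.
agree = 0
while agree < p and rank1[agree] == rank2[agree]:
    agree += 1
print(agree)
```

Re-running with larger `p` (with `n` fixed) illustrates the paper's point qualitatively: the depth of stable agreement at the top is governed by the noise level and the tail of the attribute distribution, and depends very little on `p`.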

### Nonparametric estimation of mean-squared prediction error in nested-error regression models

Nested-error regression models are widely used for analyzing clustered data.
For example, they are often applied to two-stage sample surveys, and in biology
and econometrics. Prediction is usually the main goal of such analyses, and
mean-squared prediction error is the main way in which prediction performance
is measured. In this paper we suggest a new approach to estimating mean-squared
prediction error. We introduce a matched-moment, double-bootstrap algorithm,
enabling the notorious underestimation of the naive mean-squared error
estimator to be substantially reduced. Our approach does not require specific
assumptions about the distributions of errors. Additionally, it is simple and
easy to apply. This is achieved through using Monte Carlo simulation to
implicitly develop formulae which, in a more conventional approach, would be
derived laboriously by mathematical arguments.

Supported in part by NSF Grant SES-03-18184.
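A rough sense of the bootstrap idea can be conveyed with a stripped-down nested-error model (intercept only, balanced clusters) and a single-level parametric bootstrap; this is a simplification for illustration, not the paper's matched-moment double-bootstrap algorithm, which adds a second resampling level to reduce the bias of exactly this kind of estimate:

```python
import numpy as np

rng = np.random.default_rng(2)

# Nested-error model: y_ij = mu + u_i + e_ij, with cluster effect u_i
# and individual error e_ij.  (Intercept-only toy case; the paper
# treats the general regression setting.)
k, m = 20, 10                       # clusters, observations per cluster
u = rng.normal(scale=1.0, size=k)
y = 5.0 + u[:, None] + rng.normal(scale=2.0, size=(k, m))

def fit_and_predict(y):
    """Moment estimates of the variance components and a shrinkage
    (EBLUP-style) predictor of the cluster-level means mu + u_i."""
    ni = y.shape[1]
    mu_hat = y.mean()
    between = y.mean(axis=1).var(ddof=1)
    within = y.var(axis=1, ddof=1).mean()
    su2 = max(between - within / ni, 1e-8)   # sigma_u^2 moment estimate
    gamma = su2 / (su2 + within / ni)        # shrinkage weight
    pred = mu_hat + gamma * (y.mean(axis=1) - mu_hat)
    return pred, mu_hat, su2, within

pred, mu_hat, su2, se2 = fit_and_predict(y)

# Parametric bootstrap estimate of mean-squared prediction error:
# regenerate data from the fitted model, refit, and compare predictors
# to the known bootstrap cluster effects.
B = 200
mspe = np.zeros(k)
for _ in range(B):
    u_b = rng.normal(scale=np.sqrt(su2), size=k)
    y_b = mu_hat + u_b[:, None] + rng.normal(scale=np.sqrt(se2), size=(k, m))
    pred_b, _, _, _ = fit_and_predict(y_b)
    mspe += (pred_b - (mu_hat + u_b)) ** 2
mspe /= B
print(mspe.mean())
```

The single-level bootstrap above is known to underestimate the true MSPE; the paper's contribution is precisely a second, matched-moment bootstrap level that corrects this without distributional assumptions on the errors.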

### Nonparametric estimation of a point-spread function in multivariate problems

The removal of blur from a signal, in the presence of noise, is readily
accomplished if the blur can be described in precise mathematical terms.
However, there is growing interest in problems where the extent of blur is
known only approximately, for example in terms of a blur function which depends
on unknown parameters that must be computed from data. More challenging still
is the case where no parametric assumptions are made about the blur function.
There has been a limited amount of work in this setting, but it invariably
relies on iterative methods, sometimes under assumptions that are
mathematically convenient but physically unrealistic (e.g., that the operator
defined by the blur function has an integrable inverse). In this paper we
suggest a direct, noniterative approach to nonparametric, blind restoration of
a signal. Our method is based on a new, ridge-based method for deconvolution,
and requires only mild restrictions on the blur function. We show that the
convergence rate of the method is close to optimal, from some viewpoints, and
demonstrate its practical performance by applying it to real images.Comment: Published in at http://dx.doi.org/10.1214/009053606000001442 the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org
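The ridge idea underlying such deconvolution methods is simple to sketch in one dimension: divide in the Fourier domain, but damp frequencies where the blur kernel's transform is small rather than inverting it. The sketch below assumes the kernel is known, whereas the paper's point is to estimate it nonparametrically; the signal, kernel, and ridge parameter are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 256
x = np.linspace(0, 1, n, endpoint=False)

# True signal: two smooth bumps.  Blur: a narrow Gaussian kernel.
signal = np.exp(-((x - 0.3) ** 2) / 0.002) + 0.6 * np.exp(-((x - 0.7) ** 2) / 0.004)
kernel = np.exp(-((x - 0.5) ** 2) / 0.0005)
kernel /= kernel.sum()

# Circular convolution via the FFT (ifftshift centers the kernel at 0),
# plus additive observation noise.
K = np.fft.fft(np.fft.ifftshift(kernel))
blurred = np.real(np.fft.ifft(np.fft.fft(signal) * K))
observed = blurred + rng.normal(scale=0.002, size=n)

# Ridge-regularized deconvolution: a Wiener-like filter that damps
# frequencies where |K| is small instead of dividing by it.
ridge = 1e-2
estimate = np.real(np.fft.ifft(np.fft.fft(observed) * np.conj(K) / (np.abs(K) ** 2 + ridge)))

print(np.mean((estimate - signal) ** 2), np.mean((observed - signal) ** 2))
```

The ridge term trades a small bias at low frequencies for bounded noise amplification at high frequencies, which is why no integrable inverse of the blur operator is needed.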

### Assessing extrema of empirical principal component functions

The difficulties of estimating and representing the distributions of
functional data mean that principal component methods play a substantially
greater role in functional data analysis than in more conventional
finite-dimensional settings. Local maxima and minima in principal component
functions are of direct importance; they indicate places in the domain of a
random function where influence on the function value tends to be relatively
strong but of opposite sign. We explore statistical properties of the
relationship between extrema of empirical principal component functions, and
their counterparts for the true principal component functions. It is shown that
empirical principal component functions have relatively little trouble capturing
conventional extrema, but can experience difficulty distinguishing a
``shoulder'' in a curve from a small bump. For example, when the true principal
component function has a shoulder, the probability that the empirical principal
component function has instead a bump is approximately equal to 1/2. We suggest
and describe the performance of bootstrap methods for assessing the strength of
extrema. It is shown that the subsample bootstrap is more effective than the
standard bootstrap in this regard. A ``bootstrap likelihood'' is proposed for
measuring extremum strength. Exploratory numerical methods are suggested.

Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/009053606000000371.
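The objects in this abstract can be made concrete with a small functional-PCA simulation: generate curves from a two-component Karhunen-Loeve-style model, compute empirical principal component functions from the sample covariance of the discretized curves, and read off their interior extrema. All model choices below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 100, 101                    # curves, grid points
t = np.linspace(0, 1, m)

# Two known components plus noise; the first true principal component
# function has interior extrema at t = 0.25 (max) and t = 0.75 (min).
phi1 = np.sqrt(2) * np.sin(2 * np.pi * t)
phi2 = np.sqrt(2) * np.cos(2 * np.pi * t)
scores1 = rng.normal(scale=2.0, size=n)
scores2 = rng.normal(scale=0.5, size=n)
X = scores1[:, None] * phi1 + scores2[:, None] * phi2 + rng.normal(scale=0.1, size=(n, m))

# Empirical principal component functions: eigenvectors of the sample
# covariance of the discretized curves.
Xc = X - X.mean(axis=0)
vals, vecs = np.linalg.eigh(Xc.T @ Xc / n)
pc1 = vecs[:, -1] * np.sqrt(m)     # leading component, roughly unit L2 norm
if pc1[m // 4] < 0:                # resolve the sign ambiguity
    pc1 = -pc1

# Locate interior extrema of the empirical component function.
interior = np.arange(1, m - 1)
is_max = (pc1[interior] > pc1[interior - 1]) & (pc1[interior] > pc1[interior + 1])
is_min = (pc1[interior] < pc1[interior - 1]) & (pc1[interior] < pc1[interior + 1])
extrema = t[interior[is_max | is_min]]
print(extrema)
```

With well-separated extrema, as here, the empirical component recovers them reliably; the paper's delicate cases are shoulders, which small perturbations of the empirical curve can turn into bumps or flatten out.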

### Robustness of multiple testing procedures against dependence

An important aspect of multiple hypothesis testing is controlling the
significance level, or the level of Type I error. When the test statistics are
not independent it can be particularly challenging to deal with this problem,
without resorting to very conservative procedures. In this paper we show that,
in the context of contemporary multiple testing problems, where the number of
tests is often very large, the difficulties caused by dependence are less
serious than in classical cases. This is particularly true when the null
distributions of test statistics are relatively light-tailed, for example, when
they can be based on Normal or Student's $t$ approximations. There, if the test
statistics can fairly be viewed as being generated by a linear process, an
analysis founded on the incorrect assumption of independence is asymptotically
correct as the number of hypotheses diverges. In particular, the point process
representing the null distribution of the indices at which statistically
significant test results occur is approximately Poisson, just as in the case of
independence. The Poisson process also has the same mean as in the independence
case, and of course exhibits no clustering of false discoveries. However, this
result can fail if the null distributions are particularly heavy-tailed. There
clusters of statistically significant results can occur, even when the null
hypothesis is correct. We give an intuitive explanation for these disparate
properties in light- and heavy-tailed cases, and provide rigorous theory
underpinning the intuition.

Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/07-AOS557.
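The light-tailed, linear-process regime is easy to probe empirically: generate dependent Gaussian statistics from a short moving-average process and count the indices exceeding a high threshold. The particular MA(2) process and the threshold calibration below are illustrative choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
m = 20000                            # number of hypotheses, all null
threshold = stats.norm.isf(10 / m)   # level at which ~10 exceedances are expected

# Dependent test statistics from a short-memory linear (moving-average)
# process; each z_i has unit variance but neighbors are correlated.
eps = rng.normal(size=m + 2)
z = (eps[:-2] + eps[1:-1] + eps[2:]) / np.sqrt(3)

# Indices of "statistically significant" results under the global null.
exceed = np.flatnonzero(z > threshold)
print(len(exceed))
```

Consistent with the paper's light-tailed result, the exceedance count here behaves like a Poisson variable with the same mean as in the independence case (about 10), with no visible clustering; heavy-tailed statistics would be needed to see clustered false discoveries.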

### Nonparametric regression with homogeneous group testing data

We introduce new nonparametric predictors for homogeneous pooled data in the
context of group testing for rare abnormalities and show that they achieve
optimal rates of convergence. In particular, when the level of pooling is
moderate, then despite the cost savings, the method enjoys the same convergence
rate as in the case of no pooling. In the setting of "over-pooling" the
convergence rate differs from that of an optimal estimator by no more than a
logarithmic factor. Our approach improves on the random-pooling nonparametric
predictor, which is currently the only nonparametric method available, unless
there is no pooling, in which case the two approaches are identical.

Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/11-AOS952.
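The homogeneous-pooling setting can be illustrated with a toy version of the problem: sort individuals by covariate so that pools are homogeneous, observe only whether each pool contains a positive, smooth the pooled indicators, and invert the pooling relation $q = 1 - (1-p)^K$. The kernel smoother and every numeric choice below are illustrative stand-ins, not the paper's predictor:

```python
import numpy as np

rng = np.random.default_rng(6)
N, K = 4000, 5                    # individuals, pool size

x = np.sort(rng.uniform(size=N))  # sorted covariates -> homogeneous pools
p_true = 0.05 * (1 + np.sin(2 * np.pi * x)) / 2 + 0.01   # rare-abnormality curve
y = rng.binomial(1, p_true)

# Homogeneous group testing: consecutive (similar-x) individuals share a
# pool; only the indicator "any positive in the pool" is observed.
pooled = y.reshape(-1, K).max(axis=1)
x_pool = x.reshape(-1, K).mean(axis=1)

def estimate_p(x0, h=0.1):
    """Nadaraya-Watson-style estimate of the pool-positivity probability
    q(x0), inverted through q = 1 - (1 - p)^K to recover p(x0)."""
    w = np.exp(-0.5 * ((x_pool - x0) / h) ** 2)
    q_hat = np.clip(np.sum(w * pooled) / np.sum(w), 0.0, 0.999)
    return 1 - (1 - q_hat) ** (1 / K)

grid = np.linspace(0.05, 0.95, 10)
est = np.array([estimate_p(g) for g in grid])
print(np.round(est, 3))
```

The inversion step is where homogeneity matters: it is exact only when everyone in a pool shares the same $p(x)$, which sorting by covariate approximates, and which breaks down under the paper's "over-pooling" regime.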
