One-step Estimation of Networked Population Size: Respondent-Driven Capture-Recapture with Anonymity
Population size estimates for hidden and hard-to-reach populations are
particularly important when members are known to suffer from disproportionate
health issues or to pose health risks to the larger ambient population in which
they are embedded. Efforts to derive size estimates are often frustrated by a
range of factors that preclude conventional survey strategies, including social
stigma associated with group membership or members' involvement in illegal
activities.
This paper extends prior research on the problem of network population size
estimation, building on established survey/sampling methodologies commonly used
with hard-to-reach groups. Three novel one-step, network-based population size
estimators are presented, to be used in the context of uniform random sampling,
respondent-driven sampling, and when networks exhibit significant clustering
effects. Provably sufficient conditions for the consistency of these estimators
(in large configuration networks) are given. Simulation experiments across a
wide range of synthetic network topologies validate the performance of the
estimators, which are seen to perform well on a real-world location-based
social networking data set with significant clustering. Finally, the proposed
schemes are extended to allow them to be used in settings where participant
anonymity is required. Systematic experiments show favorable tradeoffs between
anonymity guarantees and estimator performance.
Taken together, we demonstrate that reasonable population estimates can be
derived from anonymous respondent-driven samples of 250-750 individuals, within
ambient populations of 5,000-40,000. The method thus represents a novel and
cost-effective means for health planners and those agencies concerned with
health and disease surveillance to estimate the size of hidden populations.
Limitations and future work are discussed in the concluding section.
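For background on the capture-recapture idea named in the title, the following is a minimal sketch of the classical two-sample estimators (Lincoln-Petersen and Chapman's bias-corrected variant). It is not the paper's one-step, network-based method; the function names and the example counts are illustrative assumptions.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Classical two-sample estimate of population size.

    n1: individuals captured (and marked) in the first sample
    n2: individuals captured in the second sample
    m:  individuals in the second sample that were already marked
    """
    if m <= 0:
        raise ValueError("need at least one recapture (m > 0)")
    if m > min(n1, n2):
        raise ValueError("recaptures cannot exceed either sample size")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's bias-corrected variant; defined even when m == 0."""
    if m < 0 or m > min(n1, n2):
        raise ValueError("m must lie between 0 and min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(lincoln_petersen(500, 400, 20))
    print(chapman(500, 400, 20))
```

Both estimators assume a closed population and equal capture probability for every member, assumptions that hidden, networked populations typically violate, which is the kind of setting the abstract addresses.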
Sparse Median Graphs Estimation in a High Dimensional Semiparametric Model
In this manuscript a unified framework for conducting inference on complex
aggregated data in high dimensional settings is proposed. The data are assumed
to be a collection of multiple non-Gaussian realizations with underlying
undirected graphical structures. Utilizing the concept of median graphs in
summarizing the commonality across these graphical structures, a novel
semiparametric approach to modeling such complex aggregated data is provided
along with robust estimation of the median graph, which is assumed to be
sparse. The estimator is proved to be consistent in graph recovery and an upper
bound on the rate of convergence is given. Experiments on both synthetic and
real datasets are conducted to illustrate the empirical usefulness of the
proposed models and methods.
A simple method to identify significant effects in unreplicated two-level factorial designs
This article proposes a generalization and improvement on the method of Lenth (1989). The problem is solved by fixing outliers in highly contaminated samples. To do this, a robust scale estimator is obtained and its performance is analyzed using computer simulations. The method is extremely simple to use and leads to the same results as the more complex one proposed by Box and Meyer (1986).
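As a point of reference, the original Lenth (1989) procedure that the article generalizes is built on a pseudo standard error (PSE) computed from the estimated effects. The sketch below shows that baseline only, not the article's improved estimator; the function names and the sample effects are illustrative assumptions.

```python
from statistics import median


def lenth_pse(effects):
    """Lenth's pseudo standard error for unreplicated factorial effects.

    s0  = 1.5 * median(|c_j|)
    PSE = 1.5 * median(|c_j| such that |c_j| < 2.5 * s0)
    """
    abs_eff = [abs(e) for e in effects]
    if not abs_eff:
        raise ValueError("effects must not be empty")
    s0 = 1.5 * median(abs_eff)
    if s0 == 0:
        # Degenerate case: at least half of the effects are exactly zero.
        return 0.0
    # Trim effects that look active before re-estimating the scale.
    trimmed = [a for a in abs_eff if a < 2.5 * s0]
    return 1.5 * median(trimmed)


def lenth_ratios(effects):
    """Effect / PSE ratios; large absolute values suggest active effects."""
    pse = lenth_pse(effects)
    if pse == 0:
        raise ValueError("PSE is zero; ratios are undefined")
    return [e / pse for e in effects]


if __name__ == "__main__":
    # Hypothetical effect estimates with one clearly large effect.
    effects = [0.5, -1.0, 1.5, 20.0, -0.8, 0.3, 1.2]
    print(lenth_pse(effects))
    print(lenth_ratios(effects))
```

In Lenth's procedure the ratios are compared against a t quantile with roughly m/3 degrees of freedom, where m is the number of effects; that step is omitted here to keep the sketch free of external dependencies.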
Manifold embedding for curve registration
We focus on the problem of finding a good representative of a sample of
random curves warped from a common pattern f. We first prove that such a
problem can be moved onto a manifold framework. Then, we propose an estimation
of the common pattern f based on an approximated geodesic distance on a
suitable manifold. We then compare the proposed method to more classical
methods.
Bias reduction in traceroute sampling: towards a more accurate map of the Internet
Traceroute sampling is an important technique in exploring the internet
router graph and the autonomous system graph. Although it is one of the primary
techniques used in calculating statistics about the internet, it can introduce
bias that corrupts these estimates. This paper reports on a theoretical and
experimental investigation of a new technique to reduce the bias of traceroute
sampling when estimating the degree distribution. We develop a new estimator
for the degree of a node in a traceroute-sampled graph; validate the estimator
theoretically in Erdős-Rényi graphs and, through computer experiments, for a
wider range of graphs; and apply it to produce a new picture of the degree
distribution of the autonomous system graph.
Comment: 12 pages, 3 figures