28,184 research outputs found

One-step Estimation of Networked Population Size: Respondent-Driven Capture-Recapture with Anonymity

Author: Dombrowski Kirk
Fellows Ian
Khan Bilal
Lee Hsuan-Wei
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 11/10/2017

Population size estimates for hidden and hard-to-reach populations are particularly important when members are known to suffer from disproportionate health issues or to pose health risks to the larger ambient population in which they are embedded. Efforts to derive size estimates are often frustrated by a range of factors that preclude conventional survey strategies, including social stigma associated with group membership or members' involvement in illegal activities. This paper extends prior research on the problem of network population size estimation, building on established survey/sampling methodologies commonly used with hard-to-reach groups. Three novel one-step, network-based population size estimators are presented, to be used in the context of uniform random sampling, respondent-driven sampling, and when networks exhibit significant clustering effects. Provably sufficient conditions for the consistency of these estimators (in large configuration networks) are given. Simulation experiments across a wide range of synthetic network topologies validate the performance of the estimators, which are seen to perform well on a real-world location-based social networking data set with significant clustering. Finally, the proposed schemes are extended to allow them to be used in settings where participant anonymity is required. Systematic experiments show favorable tradeoffs between anonymity guarantees and estimator performance. Taken together, we demonstrate that reasonable population estimates can be derived from anonymous respondent-driven samples of 250-750 individuals, within ambient populations of 5,000-40,000. The method thus represents a novel and cost-effective means for health planners and those agencies concerned with health and disease surveillance to estimate the size of hidden populations. Limitations and future work are discussed in the concluding section.

arXiv.org e-Print Archive

DigitalCommons@University of Nebraska

Directory of Open Access Journals

Sparse Median Graphs Estimation in a High Dimensional Semiparametric Model

Author: Caffo Brian
Han Fang
Liu Han
Publication date: 11/10/2013

In this manuscript, a unified framework for conducting inference on complex aggregated data in high dimensional settings is proposed. The data are assumed to be a collection of multiple non-Gaussian realizations with underlying undirected graphical structures. Utilizing the concept of median graphs in summarizing the commonality across these graphical structures, a novel semiparametric approach to modeling such complex aggregated data is provided along with robust estimation of the median graph, which is assumed to be sparse. The estimator is proved to be consistent in graph recovery and an upper bound on the rate of convergence is given. Experiments on both synthetic and real datasets are conducted to illustrate the empirical usefulness of the proposed models and methods.

arXiv.org e-Print Archive

Collection Of Biostatistics Research Archive

A simple method to identify significant effects in unreplicated two-level factorial designs

Author: Juan Jesús
Peña Daniel
Publication date: 01/02/1992

This article proposes a generalization and improvement on the method of Lenth (1989). The problem is solved by fixing outliers in highly contaminated samples. To do this, a robust scale estimator is obtained and its performance is analyzed using computer simulations. The method is extremely simple to use and leads to the same results as the more complex one proposed by Box and Meyer (1986).
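
For orientation, here is a minimal sketch of the baseline that the abstract builds on: the pseudo standard error (PSE) of Lenth (1989) for unreplicated two-level factorial designs. It is not the generalized estimator proposed in the article, whose details are not given in the abstract. The function names `lenth_pse` and `effect_ratios` and the sample effect values are hypothetical, chosen only for illustration.

```python
from statistics import median


def lenth_pse(effects):
    """Lenth's (1989) pseudo standard error for a list of estimated effects."""
    abs_effects = [abs(e) for e in effects]
    # Initial robust scale estimate from all absolute effects.
    s0 = 1.5 * median(abs_effects)
    if s0 == 0:
        return 0.0
    # Drop effects that look active (large relative to s0), then re-estimate.
    trimmed = [a for a in abs_effects if a < 2.5 * s0]
    return 1.5 * median(trimmed)


def effect_ratios(effects):
    """Ratio of each effect to the PSE; large absolute ratios suggest active effects."""
    pse = lenth_pse(effects)
    if pse == 0:
        raise ValueError("PSE is zero; ratios are undefined")
    return [e / pse for e in effects]


if __name__ == "__main__":
    # Hypothetical effect estimates: one clearly large effect among small ones.
    effects = [0.1, -0.2, 0.15, -0.05, 0.1, 8.0, -0.12]
    print("PSE:", lenth_pse(effects))
    print("ratios:", effect_ratios(effects))
```

In Lenth's procedure the ratios are compared against a t-based margin of error. That threshold step is left out here to keep the sketch dependency-free.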

Universidad Carlos III de Madrid e-Archivo

Manifold embedding for curve registration

Author: Dimeglio Chloé
Loubes Jean-Michel
Maza Elie
Publication date: 01/01/2011

We focus on the problem of finding a good representative of a sample of random curves warped from a common pattern f. We first prove that such a problem can be moved onto a manifold framework. Then, we propose an estimation of the common pattern f based on an approximated geodesic distance on a suitable manifold. We then compare the proposed method to more classical methods.

arXiv.org e-Print Archive

Scientific Publications of the University of Toulouse II Le Mirail

HAL-INSA Toulouse

Hal-Diderot

Bias reduction in traceroute sampling: towards a more accurate map of the Internet

Author: A. Clauset
A. Lakhina
A.-L. Barabási
A.S. Klovdahl
D. Achlioptas
D.J. Watts
G. Pólya
J. Leguay
J.-J. Pansiot
J.-L. Guillaume
J.I. Pickands
L. Dall’Asta
M. Faloutsos
M. Penrose
M.J. Salganik
O. Frank
P. Erdős
P. Mahadevan
T. Petermann
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Traceroute sampling is an important technique in exploring the internet router graph and the autonomous system graph. Although it is one of the primary techniques used in calculating statistics about the internet, it can introduce bias that corrupts these estimates. This paper reports on a theoretical and experimental investigation of a new technique to reduce the bias of traceroute sampling when estimating the degree distribution. We develop a new estimator for the degree of a node in a traceroute-sampled graph; validate the estimator theoretically in Erdős-Rényi graphs and, through computer experiments, for a wider range of graphs; and apply it to produce a new picture of the degree distribution of the autonomous system graph. Comment: 12 pages, 3 figures.

arXiv.org e-Print Archive

Crossref

Tilburg University Repository