38 research outputs found

Modeling social networks from sampled data

Author: Gile Krista J.
Handcock Mark S.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2010

Network models are widely used to represent relational information among interacting units and the structural implications of these relations. Recently, social network studies have focused a great deal of attention on random graph models of networks whose nodes represent individual social actors and whose edges represent a specified relationship between the actors. Most inference for social network models assumes that the presence or absence of all possible links is observed, that the information is completely reliable, and that there are no measurement (e.g., recording) errors. This is clearly not true in practice, as much network data is collected through sample surveys. In addition, even if a census of a population is attempted, individuals and links between individuals are missed (i.e., do not appear in the recorded data). In this paper we develop the conceptual and computational theory for inference based on sampled network information. We first review forms of network sampling designs used in practice. We consider inference from the likelihood framework, and develop a typology of network data that reflects their treatment within this frame. We then develop inference for social network models based on information from adaptive network designs. We motivate and illustrate these ideas by analyzing the effect of link-tracing sampling designs on a collaboration network.

Comment: Published at http://dx.doi.org/10.1214/08-AOAS221 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref

PubMed Central

eScholarship - University of California

On the Concept of Snowball Sampling

Author: Gile Krista J.
Handcock Mark S.
Publication date: 01/08/2011

This brief comment reflects on the historical and current uses of the term "snowball sampling."

Comment: 5 pages, 0 figures. To appear in Sociological Methodology

arXiv.org e-Print Archive

eScholarship - University of California

Respondent-Driven Sampling: An Assessment of Current Methodology

Author: Gile Krista J.
Handcock Mark S.
Publication date: 12/04/2009

Respondent-Driven Sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure to expand the sample and reduce its dependence on the initial (convenience) sample. The primary goal of RDS is typically to estimate population averages in the hard-to-reach population. The current estimates make strong assumptions in order to treat the data as a probability sample. In particular, we evaluate three critical sensitivities of the estimators: to bias induced by the initial sample, to uncontrollable features of respondent behavior, and to the without-replacement structure of sampling. This paper sounds a cautionary note for the users of RDS. While current RDS methodology is powerful and clever, the favorable statistical properties claimed for the current estimates are shown to be heavily dependent on often unrealistic assumptions.

Comment: 35 pages, 29 figures, under review

arXiv.org e-Print Archive

CiteSeerX

PubMed Central

eScholarship - University of California

Correcting for differential recruitment in respondent-driven sampling data using ego-network information

Author: Beaudry Isabelle S.
Gile Krista J.
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2020

Respondent-Driven Sampling (RDS) is a sampling method devised to overcome challenges with sampling hard-to-reach human populations. The sampling starts with a limited number of individuals who are asked to recruit a small number of their contacts. Every surveyed individual is subsequently given the same opportunity to recruit additional members of the target population until a pre-established sample size is achieved. The recruitment process consequently implies that the survey respondents are responsible for deciding who enters the study. Most RDS prevalence estimators assume that participants select among their contacts completely at random. The main objective of this work is to correct the inference for departure from this assumption, such as systematic recruitment based on the characteristics of the individuals or based on the nature of relationships. To accomplish this, we introduce three forms of non-random recruitment, provide estimators for these recruitment behaviors and extend three estimators and their associated variance procedures. The proposed methodology is assessed through a simulation study capturing various sampling and network features. Finally, the proposed methods are applied to a public health setting.
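The recruitment process described in this abstract can be sketched as a small simulation. This is only an illustrative toy, not the authors' simulation study: the function name `simulate_rds`, the `coupons` parameter, and the adjacency-dictionary input are all assumptions made for the example, and recruiters here choose uniformly at random among contacts not yet sampled (the idealized behavior that the abstract says most estimators assume).

```python
import random
from collections import deque


def simulate_rds(adjacency, seeds, coupons, target_size, rng):
    """Toy respondent-driven recruitment on a fully known network.

    adjacency:   dict mapping each node to a list of its contacts (assumed input)
    seeds:       initial respondents
    coupons:     maximum number of recruits per respondent (assumed parameter)
    target_size: stop once this many respondents have been enrolled
    rng:         a random.Random instance, passed in for reproducibility

    Returns a list of (respondent, recruiter) pairs in enrolment order;
    seeds have recruiter None.
    """
    sampled = set(seeds)
    chain = [(s, None) for s in seeds]
    queue = deque(seeds)
    while queue and len(chain) < target_size:
        recruiter = queue.popleft()
        # Sampling is without replacement: enrolled contacts are not eligible.
        eligible = [n for n in adjacency[recruiter] if n not in sampled]
        k = min(coupons, len(eligible), target_size - len(chain))
        # Idealized assumption: recruits are chosen uniformly at random.
        for recruit in rng.sample(eligible, k):
            sampled.add(recruit)
            chain.append((recruit, recruiter))
            queue.append(recruit)
    return chain


# Usage: a ring of 10 people, one seed, two coupons each, sample size 6.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
chain = simulate_rds(ring, seeds=[0], coupons=2, target_size=6,
                     rng=random.Random(1))
```

Replacing the uniform `rng.sample` step with a choice that depends on contact characteristics would be one way to mimic the non-random recruitment the abstract is concerned with.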

ScholarWorks@UMass Amherst

Unequal Edge Inclusion Probabilities in Link-Tracing Network Sampling With Implications for Respondent-Driven Sampling

Author: Gile Krista J.
Ott Miles Q.
Publication venue: Smith ScholarWorks
Publication date: 01/01/2016

Respondent-Driven Sampling (RDS) is a widely adopted link-tracing sampling design used to draw valid statistical inference from samples of populations for which there is no available sampling frame. RDS estimators rely upon the assumption that each edge (representing a relationship between two individuals) in the underlying network has an equal probability of being sampled. We show that this assumption is violated in even the simplest cases, and that RDS estimators are sensitive to the violation of this assumption.
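To make the equal-edge-probability assumption concrete, the sketch below computes exact probabilities that each edge is traced in a single recruitment step on a toy network. It is not the authors' argument: the function name, the four-node path network, and the one-step, one-recruit setup are assumptions chosen for illustration. It shows only that the edge probabilities are equal when the seed is drawn in proportion to degree and unequal when the seed is drawn uniformly.

```python
from fractions import Fraction


def one_step_edge_probabilities(adjacency, seed_probs):
    """Exact probability that each edge is traced in one recruitment step.

    adjacency:  dict mapping each node to a list of its contacts (assumed input)
    seed_probs: dict mapping each node to its probability of being the seed
    The seed recruits exactly one contact, chosen uniformly at random.
    """
    probs = {}
    for seed, p_seed in seed_probs.items():
        contacts = adjacency[seed]
        for contact in contacts:
            edge = tuple(sorted((seed, contact)))
            step = p_seed * Fraction(1, len(contacts))
            probs[edge] = probs.get(edge, Fraction(0)) + step
    return probs


# Toy path network 0 - 1 - 2 - 3 (degrees 1, 2, 2, 1).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Seed drawn uniformly: the end edges are traced more often than the middle one.
uniform = {node: Fraction(1, 4) for node in path}
print(one_step_edge_probabilities(path, uniform))
# {(0, 1): Fraction(3, 8), (1, 2): Fraction(1, 4), (2, 3): Fraction(3, 8)}

# Seed drawn proportionally to degree: every edge has probability 1/3.
total_degree = sum(len(c) for c in path.values())
by_degree = {node: Fraction(len(c), total_degree) for node, c in path.items()}
print(one_step_edge_probabilities(path, by_degree))
```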

Smith College: Smith ScholarWorks

Bayesian Peer Calibration with Application to Alcohol Use

Author: Barnett Nancy P.
Gile Krista J.
Hogan Joseph W.
Linkletter Crystal
Ott Miles Q.
Publication venue: Smith ScholarWorks
Publication date: 30/08/2016

Peers are often able to provide important additional information to supplement self-reported behavioral measures. The study motivating this work collected data on alcohol use in a social network formed by college students living in a freshman dormitory. By using two imperfect sources of information (self-reported and peer-reported alcohol consumption), rather than solely self-reports or peer-reports, we are able to gain insight into alcohol consumption on both the population and the individual level, as well as information on the discrepancy of individual peer-reports. We develop a novel Bayesian comparative calibration model for continuous, count and binary outcomes that uses covariate information to characterize the joint distribution of both self and peer-reports on the network for estimating peer-reporting discrepancies in network surveys, and apply this to the data for fully Bayesian inference. We use this model to understand the effects of covariates on both drinking behavior and peer-reporting discrepancies.

Crossref

Smith College: Smith ScholarWorks

Reduced Bias for Respondent Driven Sampling: Accounting for Non-Uniform Edge Sampling Probabilities in People Who Inject Drugs in Mauritius

Author: Gile Krista J.
Harrison Matthew T.
Hogan Joseph W.
Johnston Lisa G.
Ott Miles Q.
Publication venue: Smith ScholarWorks
Publication date: 01/11/2019

People who inject drugs are an important population to study in order to reduce transmission of blood-borne illnesses including HIV and Hepatitis. In this paper we estimate HIV and Hepatitis C prevalence among people who inject drugs in Mauritius, as well as the proportion of this population who are female. Respondent driven sampling (RDS), a widely adopted link-tracing sampling design used to collect samples from hard-to-reach human populations, was used to collect this sample. The random walk approximation underlying many common RDS estimators assumes that each social relation (edge) in the underlying social network has an equal probability of being traced in the collection of the sample. This assumption does not hold in practice. We show that certain RDS estimators are sensitive to the violation of this assumption. In order to address this limitation in current methodology, and the impact it may have on prevalence estimates, we present a new method for improving RDS prevalence estimators using estimated edge inclusion probabilities, and apply this to data from Mauritius.
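For orientation, the kind of estimator the random walk approximation leads to can be sketched as an inverse-probability-weighted mean. Under that approximation a respondent's inclusion probability is taken to be proportional to their reported degree, so each outcome is weighted by one over the degree. This is the standard degree-weighted form, shown here as an assumption-laden illustration; it is not the new method proposed in the abstract, which replaces these weights using estimated edge inclusion probabilities. The function name and the example data are hypothetical.

```python
def weighted_prevalence(outcomes, inclusion_weights):
    """Ratio (Hajek-type) estimator of a population proportion.

    outcomes:          0/1 indicators for sampled respondents
    inclusion_weights: positive values proportional to each respondent's
                       inclusion probability (under the random walk
                       approximation, the respondent's reported degree)
    """
    if len(outcomes) != len(inclusion_weights):
        raise ValueError("outcomes and inclusion_weights must align")
    inverse = [1.0 / w for w in inclusion_weights]
    numerator = sum(y * iw for y, iw in zip(outcomes, inverse))
    return numerator / sum(inverse)


# Hypothetical sample of four respondents with reported degrees 1, 2, 4, 4.
estimate = weighted_prevalence([1, 0, 1, 0], [1, 2, 4, 4])
print(estimate)  # 0.625, versus an unweighted sample proportion of 0.5
```

Because only the ratio of weights matters, any quantities proportional to the inclusion probabilities can be passed in, which is why swapping degrees for other estimated probabilities changes the estimate without changing the formula.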

Smith College: Smith ScholarWorks