Extensions of Respondent-Driven Sampling: Web-Based RDS, Empirical Validation, and the Dual Homophily Model

Wejnert, Cyprian

thesis

Authors: Cyprian Wejnert
Publication date: 19 August 2009

Abstract

This dissertation makes contributions to Respondent-Driven Sampling (RDS) and the study of social networks. RDS is a new network-based method of collecting and analyzing data from hidden populations in a statistically viable way. The first chapter provides an introduction to RDS procedures and estimation. After describing the operating procedures, the chapter introduces the statistical theory behind RDS, including the model's assumptions and how it accounts for sources of bias commonly associated with network samples. It then compares two distinct families of RDS estimators, RDS I and RDS II, by describing the evolution of all seven RDS estimators. Chapter Two introduces WebRDS, an online version of RDS that has been shown to produce samples at record speeds, and describes the two WebRDS samples on which the remaining analyses are based. Chapter Three provides an in-depth empirical test of RDS estimators and confidence intervals. While RDS estimation has been validated analytically and computationally, it has not been empirically tested on a population with known parameters. Chapter Three uses RDS data on university undergraduates to compare the accuracy of RDS point and variance estimates across two estimation techniques (RDS I and RDS II), self-report measures of degree, and multiple cutpoints for excluding early-wave data. The chapter finds RDS I and RDS II estimates to be accurate and convergent, but estimates of variance to be problematic in opposite ways. The RDS I bootstrap method tends to underestimate variance, while RDS II analytical variance estimation provides an overestimate. For both methods, the problem is exacerbated in small groups. Differences in degree measure and cutting early-wave data resulted in only minor differences in the estimates. Chapter Four presents the Dual Homophily Model, which breaks a common measure of homophily into two components, one due to relational preferences and one due to differential degree. Applications of the model, including examples where standard homophily measures miss important differences between groups, are discussed.

Available Versions

eCommons@Cornell

oai:ecommons.cornell.edu:1813/...

Last time updated on 08/03/2017