A Conditional Empirical Likelihood Based Method for Model Parameter Estimation from Complex Survey Datasets
We consider an empirical likelihood framework for inference for a statistical
model based on an informative sampling design. Covariate information is
incorporated both through the weights and the estimating equations. The
estimator is based on conditional weights. We show that under usual conditions,
with population size increasing unbounded, the estimates are strongly
consistent, asymptotically unbiased and normally distributed. Our framework
provides additional justification for inverse probability weighted score
estimators in terms of conditional empirical likelihood. In doing so, it
bridges the gap between design-based and model-based modes of inference in
survey sampling settings. We illustrate these ideas with an application to an
electoral survey.
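The inverse probability weighted score estimators mentioned above can be illustrated with the simplest case, a weighted mean. The sketch below is not the conditional empirical likelihood estimator of the paper; it only shows how weights of the form 1/(inclusion probability) enter an estimating equation. The function name `ipw_mean` and its arguments are hypothetical.

```python
def ipw_mean(y, inclusion_probs):
    """Solve the weighted score equation sum_i w_i * (y_i - mu) = 0 for mu,
    with w_i = 1 / pi_i (inverse inclusion probability).

    Illustrative assumption: the score is that of a simple mean model,
    so the solution is the weighted (Hajek-type) mean.
    """
    if len(y) != len(inclusion_probs):
        raise ValueError("y and inclusion_probs must have the same length")
    weights = [1.0 / p for p in inclusion_probs]
    weighted_total = sum(w * yi for w, yi in zip(weights, y))
    return weighted_total / sum(weights)
```

Units sampled with low probability receive large weights, which is what compensates for an informative design in this simple setting.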
networksis: A Package to Simulate Bipartite Graphs with Fixed Marginals Through Sequential Importance Sampling
The ability to simulate graphs with given properties is important for the analysis of social networks. Sequential importance sampling has been shown to be particularly effective in estimating the number of graphs adhering to fixed marginals and in estimating the null distribution of graph statistics. This paper describes the networksis package for R and how its simulate and simulate_sis functions can be used to address both of these tasks as well as generate initial graphs for Markov chain Monte Carlo simulations.
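To make the idea of counting graphs with fixed marginals by sequential importance sampling concrete, here is a minimal Python sketch. It is not the algorithm implemented in networksis and does not use that package's API: it fills a binary matrix column by column with a naive uniform proposal, and the function name `sis_count_estimate` is hypothetical.

```python
import math
import random

def sis_count_estimate(row_sums, col_sums, n_samples=1000, seed=0):
    """Estimate the number of 0/1 matrices with the given row and column sums.

    Assumption (illustrative only): each column is filled by choosing its
    ones uniformly among rows that still have remaining capacity. The
    importance weight is the inverse proposal probability, and draws that
    fail to meet the row sums get weight zero.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        remaining = list(row_sums)
        weight = 1.0
        for c in col_sums:
            available = [i for i, r in enumerate(remaining) if r > 0]
            if len(available) < c:
                weight = 0.0  # dead end: this column cannot be filled
                break
            chosen = rng.sample(available, c)
            weight *= math.comb(len(available), c)  # 1 / proposal probability
            for i in chosen:
                remaining[i] -= 1
        if any(remaining):
            weight = 0.0  # row sums not met, so the draw is invalid
        total += weight
    return total / n_samples
```

Averaging the weights gives an unbiased estimate of the count under this proposal. Practical implementations use more carefully designed proposals so that fewer draws are wasted.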
Modeling social networks from sampled data
Network models are widely used to represent relational information among
interacting units and the structural implications of these relations. Recently,
social network studies have focused a great deal of attention on random graph
models of networks whose nodes represent individual social actors and whose
edges represent a specified relationship between the actors. Most inference for
social network models assumes that the presence or absence of all possible
links is observed, that the information is completely reliable, and that there
are no measurement (e.g., recording) errors. This is clearly not true in
practice, as much network data is collected through sample surveys. In addition,
even if a census of a population is attempted, individuals and links between
individuals are missed (i.e., do not appear in the recorded data). In this
paper we develop the conceptual and computational theory for inference based on
sampled network information. We first review forms of network sampling designs
used in practice. We consider inference from the likelihood framework, and
develop a typology of network data that reflects their treatment within this
frame. We then develop inference for social network models based on information
from adaptive network designs. We motivate and illustrate these ideas by
analyzing the effect of link-tracing sampling designs on a collaboration
network.
Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-AOAS221
Analysis of Partially Observed Networks via Exponential-family Random Network Models
Exponential-family random network (ERN) models specify a joint representation
of both the dyads of a network and nodal characteristics. This class of models
allows the nodal characteristics to be modelled as stochastic processes,
expanding the range and realism of exponential-family approaches to network
modelling. In this paper we develop a theory of inference for ERN models when
only part of the network is observed, as well as specific methodology for
missing data, including non-ignorable mechanisms for network-based sampling
designs and for latent class models. In particular, we consider data collected
via contact tracing, of considerable importance to infectious disease
epidemiology and public health.
A description of within-family resource exchange networks in a Malawian village
In this paper we explore patterns of economic transfers between adults within household and family networks in a village in Malawi’s Rumphi district, using data from the 2006 round of the Malawi Longitudinal Study of Families and Health. We fit Exponential-family Random Graph Models (ERGMs) to assess individual, relational, and higher-order network effects. The network effects of cyclic giving, reciprocity, and in-degree and out-degree distribution suggest a network with a tendency away from the formation of hierarchies or "hubs." Effects of age, sex, working status, education, health status, and kinship relation are also considered.
Keywords: Malawi, Malawi Longitudinal Study of Families and Health, networks, resource exchange, social network
Estimating within-household contact networks from egocentric data
Acute respiratory diseases are transmitted over networks of social contacts.
Large-scale simulation models are used to predict epidemic dynamics and
evaluate the impact of various interventions, but the contact behavior in these
models is based on simplistic and strong assumptions which are not informed by
survey data. These assumptions are also used for estimating transmission
measures such as the basic reproductive number and secondary attack rates.
Development of methodology to infer contact networks from survey data could
improve these models and estimation methods. We contribute to this area by
developing a model of within-household social contacts and using it to analyze
the Belgian POLYMOD data set, which contains detailed diaries of social
contacts in a 24-hour period. We model dependency in contact behavior through a
latent variable indicating which household members are at home. We estimate
age-specific probabilities of being at home and age-specific probabilities of
contact conditional on two members being at home. Our results differ from the
standard random mixing assumption. In addition, we find that the probability
that all members contact each other on a given day is fairly low: 0.49 for
households with two 0–5 year olds and two 19–35 year olds, and 0.36 for
households with two 12–18 year olds and two 36+ year olds. We find higher
contact rates in households with 2–3 members, helping explain the higher
influenza secondary attack rates found in households of this size.
Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/11-AOAS474
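The two-stage structure described in this abstract (being at home, then contact conditional on both members being at home) can be sketched as a small calculation. The independence assumptions below are simplifications made for illustration and are not taken from the paper, and the function name `prob_all_pairs_contact` and its inputs are hypothetical.

```python
import math
from itertools import combinations

def prob_all_pairs_contact(p_home, p_contact):
    """Probability that every pair of household members makes contact.

    Illustrative assumptions (not the paper's fitted model):
      * member i is at home with probability p_home[i], independently;
      * contact between i and j requires both to be at home;
      * given both are at home, contact occurs with probability
        p_contact[i][j], independently across pairs.
    """
    n = len(p_home)
    all_home = math.prod(p_home)
    all_contact = math.prod(
        p_contact[i][j] for i, j in combinations(range(n), 2)
    )
    return all_home * all_contact
```

Under these assumptions the probability shrinks quickly as household size grows, because the number of pairs that must all make contact grows quadratically.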
Estimating within-school contact networks to understand influenza transmission
Many epidemic models approximate social contact behavior by assuming random
mixing within mixing groups (e.g., homes, schools and workplaces). The effect
of more realistic social network structure on estimates of epidemic parameters
is an open area of exploration. We develop a detailed statistical model to
estimate the social contact network within a high school using friendship
network data and a survey of contact behavior. Our contact network model
includes classroom structure, longer contact durations with friends than with
nonfriends, and more frequent contacts with friends, based on reports in the
contact survey. We performed simulation studies to explore which network
structures are relevant to influenza transmission. These studies yield two key
findings. First, we found that the friendship network structure important to
the transmission process can be adequately represented by a dyad-independent
exponential random graph model (ERGM). This means that individual-level sampled
data is sufficient to characterize the entire friendship network. Second, we
found that contact behavior was adequately represented by a static rather than
dynamic contact network.
Comment: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/11-AOAS505
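A dyad-independent ERGM, as referred to in this abstract, reduces to independent Bernoulli edges whose log-odds are linear in dyad-level covariates. The Python sketch below assumes just two terms, an overall edge term and a same-group (homophily) term. The coefficient names and the grouping variable are hypothetical and are not the model fitted in the paper.

```python
import math
import random

def edge_probability(theta_edges, theta_match, same_group):
    """Logistic edge probability for one dyad under a dyad-independent model."""
    eta = theta_edges + theta_match * (1 if same_group else 0)
    return 1.0 / (1.0 + math.exp(-eta))

def simulate_dyad_independent(groups, theta_edges, theta_match, seed=0):
    """Draw an undirected graph by sampling each dyad independently.

    groups[i] is a hypothetical categorical attribute of node i (for
    example a grade or classroom). Returns a set of edges (i, j), i < j.
    """
    rng = random.Random(seed)
    n = len(groups)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            p = edge_probability(theta_edges, theta_match, groups[i] == groups[j])
            if rng.random() < p:
                edges.add((i, j))
    return edges
```

Because each dyad depends only on the attributes of its two endpoints, the coefficients can in principle be estimated from individual-level sampled data, which is the point the abstract makes.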