227 research outputs found

    A Conditional Empirical Likelihood Based Method for Model Parameter Estimation from Complex Survey Datasets

    We consider an empirical likelihood framework for inference for a statistical model based on an informative sampling design. Covariate information is incorporated both through the weights and the estimating equations. The estimator is based on conditional weights. We show that under the usual conditions, as the population size increases without bound, the estimates are strongly consistent, asymptotically unbiased and normally distributed. Our framework provides additional justification for inverse probability weighted score estimators in terms of conditional empirical likelihood. In doing so, it bridges the gap between design-based and model-based modes of inference in survey sampling settings. We illustrate these ideas with an application to an electoral survey.
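
    The inverse probability weighted score estimator mentioned in this abstract can be illustrated in its simplest case, a population mean. The following is a minimal sketch, not the paper's conditional empirical likelihood method: the function name `ipw_mean` and the toy inclusion probabilities are assumptions made for illustration. Solving the weighted score equation sum_i w_i (y_i - mu) = 0 with w_i = 1 / pi_i gives a weighted sample mean.

```python
def ipw_mean(values, inclusion_probs):
    """Solve sum_i w_i * (y_i - mu) = 0 for mu, with weights w_i = 1 / pi_i.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    if not values:
        raise ValueError("at least one sampled unit is required")
    if any(p <= 0.0 or p > 1.0 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    # Each sampled unit stands in for 1 / pi_i units of the population.
    weights = [1.0 / p for p in inclusion_probs]
    weighted_total = sum(w * y for w, y in zip(weights, values))
    return weighted_total / sum(weights)


# Toy sample (made-up numbers): the first unit was twice as likely to be drawn.
sample_y = [10.0, 20.0, 30.0]
sample_pi = [0.5, 0.25, 0.25]
estimate = ipw_mean(sample_y, sample_pi)
```

    With equal inclusion probabilities the weights cancel and the estimate reduces to the ordinary sample mean, which is one way to see how the design enters only through the weights.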

    Exponential-family Random Network Models

    Random graphs, where the connections between nodes are considered random variables, have wide applicability in the social sciences. Exponential-family Random Graph Models (ERGM) have shown themselves to be a useful class of models for representing complex social phenomena. We generalize ERGM by also modeling nodal attributes as random variates, thus creating a random model of the full network, which we call Exponential-family Random Network Models (ERNM). We demonstrate how this framework allows a new formulation for logistic regression in network data. We develop likelihood-based inference for the model and an MCMC algorithm to implement it. This new model formulation is used to analyze a peer social network from the National Longitudinal Study of Adolescent Health. We model the relationship between substance use and friendship relations, and show how the results differ from the standard use of logistic regression on network data.

    networksis: A Package to Simulate Bipartite Graphs with Fixed Marginals Through Sequential Importance Sampling

    The ability to simulate graphs with given properties is important for the analysis of social networks. Sequential importance sampling has been shown to be particularly effective in estimating the number of graphs adhering to fixed marginals and in estimating the null distribution of graph statistics. This paper describes the networksis package for R and how its simulate and simulate_sis functions can be used to address both of these tasks as well as to generate initial graphs for Markov chain Monte Carlo simulations.
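
    The idea of estimating the number of bipartite graphs with fixed marginals by sequential importance sampling can be sketched as follows. This is a deliberately naive illustration in Python, not the networksis implementation or its proposal distribution: the function name `sis_count_estimate` and the uniform column-by-column proposal are assumptions. Each draw fills the 0/1 biadjacency matrix one column at a time, choosing rows uniformly among those with remaining capacity, and the importance weight is the inverse of the probability of the sequence of choices.

```python
import random
from math import comb


def sis_count_estimate(row_sums, col_sums, n_draws=1000, seed=0):
    """Estimate the number of 0/1 matrices with the given row and column sums.

    Naive sequential importance sampler (hypothetical helper for illustration).
    """
    if sum(row_sums) != sum(col_sums):
        return 0.0  # no matrix can satisfy inconsistent marginals
    rng = random.Random(seed)
    total_weight = 0.0
    for _ in range(n_draws):
        remaining = list(row_sums)  # capacity left in each row
        weight = 1.0
        for c in col_sums:
            open_rows = [i for i, r in enumerate(remaining) if r > 0]
            if c > len(open_rows):
                weight = 0.0  # dead end: this partial fill cannot be completed
                break
            # The column is drawn uniformly from comb(len(open_rows), c)
            # subsets, so the inverse proposal probability is that count.
            weight *= comb(len(open_rows), c)
            for i in rng.sample(open_rows, c):
                remaining[i] -= 1
        # Totals match, so a draw that fills every column uses up every row.
        total_weight += weight
    return total_weight / n_draws


# All row and column sums equal to 1: the valid matrices are the 3 x 3
# permutation matrices, and every draw has weight 3 * 2 * 1.
permutation_count = sis_count_estimate([1, 1, 1], [1, 1, 1], n_draws=50)
```

    The average weight is an unbiased estimate of the count because every valid matrix has positive probability under the proposal. A uniform proposal can hit many dead ends on skewed marginals, which is why practical samplers use more carefully designed proposals.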

    Modeling social networks from sampled data

    Network models are widely used to represent relational information among interacting units and the structural implications of these relations. Recently, social network studies have focused a great deal of attention on random graph models of networks whose nodes represent individual social actors and whose edges represent a specified relationship between the actors. Most inference for social network models assumes that the presence or absence of all possible links is observed, that the information is completely reliable, and that there are no measurement (e.g., recording) errors. This is clearly not true in practice, as much network data is collected through sample surveys. In addition, even if a census of a population is attempted, individuals and links between individuals are missed (i.e., do not appear in the recorded data). In this paper we develop the conceptual and computational theory for inference based on sampled network information. We first review forms of network sampling designs used in practice. We consider inference from the likelihood framework, and develop a typology of network data that reflects their treatment within this frame. We then develop inference for social network models based on information from adaptive network designs. We motivate and illustrate these ideas by analyzing the effect of link-tracing sampling designs on a collaboration network. Comment: Published at http://dx.doi.org/10.1214/08-AOAS221 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Analysis of Partially Observed Networks via Exponential-family Random Network Models

    Exponential-family random network (ERN) models specify a joint representation of both the dyads of a network and nodal characteristics. This class of models allows the nodal characteristics to be modelled as stochastic processes, expanding the range and realism of exponential-family approaches to network modelling. In this paper we develop a theory of inference for ERN models when only part of the network is observed, as well as specific methodology for missing data, including non-ignorable mechanisms for network-based sampling designs and for latent class models. In particular, we consider data collected via contact tracing, of considerable importance to infectious disease epidemiology and public health.

    A description of within-family resource exchange networks in a Malawian village

    In this paper we explore patterns of economic transfers between adults within household and family networks in a village in Malawi’s Rumphi district, using data from the 2006 round of the Malawi Longitudinal Study of Families and Health. We fit Exponential-family Random Graph Models (ERGMs) to assess individual, relational, and higher-order network effects. The network effects of cyclic giving, reciprocity, and in-degree and out-degree distribution suggest a network with a tendency away from the formation of hierarchies or "hubs." Effects of age, sex, working status, education, health status, and kinship relation are also considered. Keywords: Malawi, Malawi Longitudinal Study of Families and Health, networks, resource exchange, social network

    On the Concept of Snowball Sampling

    This brief comment reflects on the historical and current uses of the term "snowball sampling." Comment: 5 pages, 0 figures. To appear in Sociological Methodology

    Respondent-Driven Sampling: An Assessment of Current Methodology

    Respondent-Driven Sampling (RDS) employs a variant of a link-tracing network sampling strategy to collect data from hard-to-reach populations. By tracing the links in the underlying social network, the process exploits the social structure to expand the sample and reduce its dependence on the initial (convenience) sample. The primary goal of RDS is typically to estimate population averages in the hard-to-reach population. The current estimates make strong assumptions in order to treat the data as a probability sample. In particular, we evaluate three critical sensitivities of the estimators: to bias induced by the initial sample, to uncontrollable features of respondent behavior, and to the without-replacement structure of sampling. This paper sounds a cautionary note for the users of RDS. While current RDS methodology is powerful and clever, the favorable statistical properties claimed for the current estimates are shown to be heavily dependent on often unrealistic assumptions. Comment: 35 pages, 29 figures, under review

    Fitting Latent Cluster Models for Networks with latentnet

    latentnet is a package to fit and evaluate statistical latent position and cluster models for networks. Hoff, Raftery, and Handcock (2002) suggested an approach to modeling networks based on positing the existence of a latent space of characteristics of the actors. Relationships form as a function of distances between these characteristics as well as functions of observed dyadic level covariates. In latentnet social distances are represented in a Euclidean space. It also includes a variant of the extension of the latent position model to allow for clustering of the positions developed in Handcock, Raftery, and Tantrum (2007). The package implements Bayesian inference for the models based on a Markov chain Monte Carlo algorithm. It can also compute maximum likelihood estimates for the latent position model and a two-stage maximum likelihood method for the latent position cluster model. For latent position cluster models, the package provides a Bayesian way of assessing how many groups there are, and thus whether or not there is any clustering (since if the preferred number of groups is 1, there is little evidence for clustering). It also estimates which cluster each actor belongs to. These estimates are probabilistic, and provide the probability of each actor belonging to each cluster. It computes four types of point estimates for the coefficients and positions: maximum likelihood estimate, posterior mean, posterior mode and the estimator which minimizes Kullback-Leibler divergence from the posterior. You can assess the goodness-of-fit of the model via posterior predictive checks. It has a function to simulate networks from a latent position or latent position cluster model.
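
    The statement that relationships form as a function of distances in a Euclidean latent space can be made concrete with a small sketch. It is written in Python for illustration and is not the latentnet API (latentnet is an R package): the function name `tie_probability`, the intercept value, and the actor positions are assumptions. It uses the basic distance form of the latent position model, logit P(tie between i and j) = intercept - ||z_i - z_j||, and omits dyadic covariates and clustering.

```python
import math


def tie_probability(intercept, z_i, z_j):
    """Tie probability under logit(p) = intercept - Euclidean distance(z_i, z_j).

    Hypothetical helper for illustration; covariate terms are left out.
    """
    distance = math.dist(z_i, z_j)  # Euclidean distance (Python 3.8+)
    linear_predictor = intercept - distance
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Made-up latent positions for three actors in a two-dimensional space.
positions = {"a": (0.0, 0.0), "b": (0.5, 0.0), "c": (3.0, 4.0)}
p_near = tie_probability(1.0, positions["a"], positions["b"])
p_far = tie_probability(1.0, positions["a"], positions["c"])
```

    Actors that are close in the latent space (here "a" and "b") receive a higher tie probability than distant ones ("a" and "c"), which is the mechanism the model uses to capture transitivity and homophily on unobserved characteristics.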

    On "Sexual contacts and epidemic thresholds," models and inference for sexual partnership distributions

    Recent work has focused attention on statistical inference for the population distribution of the number of sexual partners based on survey data. The characteristics of these distributions are of interest as components of mathematical models for the transmission dynamics of sexually-transmitted diseases (STDs). Such information can be used to calibrate theoretical models, to make predictions for real populations, and as a tool for guiding public health policy. Our previous work on this subject has developed likelihood-based statistical methods for inference that allow for low-dimensional, semi-parametric models. Inference has been based on several proposed stochastic process models for the formation of sexual partnership networks. We have also developed model selection criteria to choose between competing models, and assessed the fit of different models to three populations: Uganda, Sweden, and the USA. Throughout this work, we have emphasized the correct assessment of the uncertainty of the estimates based on the data analyzed. We have also widened the question of interest to the limitations of inferences from such data, and the utility of degree-based epidemiological models more generally. In this paper we address further statistical issues that are important in this area, and a number of confusions that have arisen in interpreting our work. In particular, we consider the use of cumulative lifetime partner distributions, heaping and other issues raised by Liljeros et al. in a recent working paper. Comment: 22 pages, 5 figures in linked working paper