
    Multiple‐systems analysis for the quantification of modern slavery: classical and Bayesian approaches

    Multiple systems estimation is a key approach for quantifying hidden populations such as the number of victims of modern slavery. The UK Government published an estimate of 10,000 to 13,000 victims, constructed by the present author, as part of the strategy leading to the Modern Slavery Act 2015. This estimate was obtained by a stepwise multiple systems method based on six lists. Further investigation shows that a small proportion of the possible models give rather different answers, and that other model fitting approaches may choose one of these. Three data sets collected in the field of modern slavery, together with a data set about the death toll in the Kosovo conflict, are used to investigate the stability and robustness of various multiple systems estimation methods. The crucial aspect is the way that interactions between lists are modelled, because these can substantially affect the results. Model selection and Bayesian approaches are considered in detail, in particular to assess their stability and robustness when applied to real modern slavery data. A new Markov Chain Monte Carlo Bayesian approach is developed; overall, this gives robust and stable results, at least for the examples considered. The software and datasets are freely and publicly available to facilitate wider implementation and further research.
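
    The core idea behind such estimates can be illustrated with a log-linear model fitted to the overlap counts between lists. The sketch below is not the author's released software: the list names and counts are made up, only three lists are used rather than six, and the simple independence model shown is exactly the kind of choice (list interactions included or excluded) that the abstract warns can change the answer.

```python
# Minimal log-linear multiple systems estimation sketch (illustrative data only).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical overlap counts for three lists A, B, C: each row is a
# capture history (1 = appears on that list) with its observed count.
data = pd.DataFrame({
    "A":     [1, 0, 0, 1, 1, 0, 1],
    "B":     [0, 1, 0, 1, 0, 1, 1],
    "C":     [0, 0, 1, 0, 1, 1, 1],
    "count": [120, 95, 60, 18, 12, 9, 3],
})

# Main-effects (independence) Poisson log-linear model; interaction terms
# such as A:B could be added, which is where model selection matters.
fit = smf.glm("count ~ A + B + C", data=data,
              family=sm.families.Poisson()).fit()

# Predict the unobserved cell (0, 0, 0): people appearing on no list.
newcell = pd.DataFrame({"A": [0], "B": [0], "C": [0]})
unseen = np.asarray(fit.predict(newcell))[0]
print(f"estimated unseen: {unseen:.0f}, "
      f"total population: {data['count'].sum() + unseen:.0f}")
```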

    Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models

    Motivated by a real-life problem of sharing social network data that contain sensitive personal information, we propose a novel approach to release and analyze synthetic graphs in order to protect the privacy of individual relationships captured by the social network while maintaining the validity of statistical results. A case study using a version of the Enron e-mail corpus dataset demonstrates the application and usefulness of the proposed techniques in solving the challenging problem of maintaining privacy and supporting open access to network data, both to ensure reproducibility of existing studies and to allow new scientific insights to be discovered by analyzing such data. We use a simple yet effective randomized response mechanism to generate synthetic networks under ε-edge differential privacy, and then use likelihood-based inference for missing data and Markov chain Monte Carlo techniques to fit exponential-family random graph models to the generated synthetic networks. Comment: Updated, 39 pages
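
    The randomized response step can be sketched in a few lines: each potential edge is flipped independently with probability 1/(1 + e^ε), which satisfies ε-edge differential privacy for an undirected simple graph. This is a generic illustration of that mechanism, not the authors' code, and it omits the second stage described in the abstract (missing-data likelihood and MCMC fitting of the ERGM to the noisy graph).

```python
# Generic randomized-response edge perturbation (illustrative, not the paper's code).
import numpy as np

def randomized_response_graph(adj: np.ndarray, epsilon: float,
                              rng: np.random.Generator) -> np.ndarray:
    """Flip each potential edge independently with probability
    1 / (1 + exp(epsilon)); this satisfies epsilon-edge differential
    privacy for an undirected, simple graph."""
    n = adj.shape[0]
    flip_prob = 1.0 / (1.0 + np.exp(epsilon))
    # Decide flips on the upper triangle only, then mirror for symmetry.
    flips = np.triu(rng.random((n, n)) < flip_prob, k=1)
    noisy = adj.copy()
    noisy[flips] = 1 - noisy[flips]
    noisy = np.triu(noisy, k=1)
    return noisy + noisy.T

rng = np.random.default_rng(0)
adj = rng.integers(0, 2, size=(6, 6))   # toy graph
adj = np.triu(adj, 1)
adj = adj + adj.T
synthetic = randomized_response_graph(adj, epsilon=1.0, rng=rng)
```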

    Statistical Inference in a Directed Network Model with Covariates

    Networks are often characterized by node heterogeneity, whereby nodes exhibit different degrees of interaction, and by link homophily, whereby nodes sharing common features tend to associate with each other. In this paper, we propose a new directed network model that captures the former via node-specific parametrization and the latter by incorporating covariates. In particular, this model quantifies the extent of heterogeneity in terms of the outgoingness and incomingness of each node by different parameters, thus allowing the number of heterogeneity parameters to be twice the number of nodes. We study the maximum likelihood estimation of the model and establish the uniform consistency and asymptotic normality of the resulting estimators. Numerical studies demonstrate our theoretical findings and a data analysis confirms the usefulness of our model. Comment: 29 pages, minor revision
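
    One common parametrization consistent with this description (an assumption here, not necessarily the exact form used in the paper) models each directed edge i → j as an independent Bernoulli variable whose log-odds add an out-parameter for i, an in-parameter for j, and a covariate term. The sketch below simply evaluates that log-likelihood on toy data; the paper's contribution is the asymptotic theory for the maximum likelihood estimator, which is not reproduced here.

```python
# Log-likelihood of a covariate-adjusted directed degree-heterogeneity model (sketch).
import numpy as np

def log_likelihood(adj, alpha, beta, gamma, covariates):
    """Edge i -> j occurs independently with probability
    sigmoid(alpha_i + beta_j + gamma . z_ij), where alpha captures
    'outgoingness', beta 'incomingness', and z_ij edge covariates."""
    n = adj.shape[0]
    # Linear predictor for every ordered pair (i, j).
    eta = alpha[:, None] + beta[None, :] + covariates @ gamma
    p = 1.0 / (1.0 + np.exp(-eta))
    mask = ~np.eye(n, dtype=bool)            # exclude self-loops
    ll = adj * np.log(p) + (1 - adj) * np.log(1 - p)
    return ll[mask].sum()

rng = np.random.default_rng(1)
n, k = 5, 2
adj = rng.integers(0, 2, size=(n, n))
np.fill_diagonal(adj, 0)
z = rng.normal(size=(n, n, k))               # toy covariates z_ij
print(log_likelihood(adj, np.zeros(n), np.zeros(n), np.zeros(k), z))
```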

    Gravitational and electromagnetic fields of a charged tachyon

    An axially symmetric exact solution of the Einstein-Maxwell equations is obtained and is interpreted to give the gravitational and electromagnetic fields of a charged tachyon. Switching off the charge parameter yields the solution for the uncharged tachyon which was earlier obtained by Vaidya. The null surfaces for the charged tachyon are discussed. Comment: 8 pages, LaTeX, to appear in Pramana - J. Physics

    Model Selection for Degree-corrected Block Models

    The proliferation of models for networks raises challenging problems of model selection: the data are sparse and globally dependent, and models are typically high-dimensional and have large numbers of latent variables. Together, these issues mean that the usual model-selection criteria do not work properly for networks. We illustrate these challenges, and show one way to resolve them, by considering the key network-analysis problem of dividing a graph into communities or blocks of nodes with homogeneous patterns of links to the rest of the network. The standard tool for doing this is the stochastic block model, under which the probability of a link between two nodes is a function solely of the blocks to which they belong. This imposes a homogeneous degree distribution within each block; this can be unrealistic, so degree-corrected block models add a parameter for each node, modulating its overall degree. The choice between ordinary and degree-corrected block models matters because they make very different inferences about communities. We present the first principled and tractable approach to model selection between standard and degree-corrected block models, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs. We also develop linear-time approximations for log-likelihoods under both the stochastic block model and the degree-corrected model, using belief propagation. Applications to simulated and real networks show excellent agreement with our approximations. Our results thus both solve the practical problem of deciding on degree correction, and point to a general approach to model selection in network analysis.
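
    The raw ingredient of the comparison is the profile log-likelihood of each model for a given block assignment. The sketch below follows the standard Poisson formulation of Karrer and Newman (2011) for ordinary and degree-corrected block models; the paper's actual contributions (the null distribution of the log-likelihood ratio on sparse graphs and the belief-propagation approximations) are not reproduced here.

```python
# Profile log-likelihoods of the plain vs. degree-corrected block model (sketch).
import numpy as np

def profile_loglik(adj, labels, degree_corrected):
    """Poisson profile log-likelihood (up to model-independent constants)
    for a fixed block assignment, in the Karrer-Newman formulation."""
    blocks = np.unique(labels)
    # m[r, s]: sum of adjacency entries between blocks r and s
    # (within-block edges are counted twice, as in the usual convention).
    m = np.zeros((len(blocks), len(blocks)))
    for a, r in enumerate(blocks):
        for b, s in enumerate(blocks):
            m[a, b] = adj[np.ix_(labels == r, labels == s)].sum()
    if degree_corrected:
        factor = np.array([adj[labels == r].sum() for r in blocks])   # degree sums
    else:
        factor = np.array([(labels == r).sum() for r in blocks])      # block sizes
    denom = np.outer(factor, factor)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(m > 0, m * np.log(m / denom), 0.0)
    return terms.sum()

rng = np.random.default_rng(2)
n = 20
adj = rng.integers(0, 2, size=(n, n))
adj = np.triu(adj, 1)
adj = adj + adj.T
labels = np.repeat([0, 1], n // 2)
llr = profile_loglik(adj, labels, True) - profile_loglik(adj, labels, False)
print("log-likelihood ratio (degree-corrected vs. plain):", llr)
```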

    R. A. Fisher, design theory, and the Indian connection

    Design Theory, a branch of mathematics, was born out of the experimental statistics research of the population geneticist R. A. Fisher and of Indian mathematical statisticians in the 1930s. The field combines elements of combinatorics, finite projective geometries, Latin squares, and a variety of further mathematical structures, brought together in surprising ways. This essay presents these structures and ideas, as well as how the field came together, which is in itself an interesting story. Comment: 11 pages, 3 figures

    The harvest plot: A method for synthesising evidence about the differential effects of interventions

    Background: One attraction of meta-analysis is the forest plot, a compact overview of the essential data included in a systematic review and the overall 'result'. However, meta-analysis is not always suitable for synthesising evidence about the effects of interventions which may influence the wider determinants of health. As part of a systematic review of the effects of population-level tobacco control interventions on social inequalities in smoking, we designed a novel approach to synthesis intended to bring aspects of the graphical directness of a forest plot to bear on the problem of synthesising evidence from a complex and diverse group of studies.
    Methods: We coded the included studies (n = 85) on two methodological dimensions (suitability of study design and quality of execution) and extracted data on effects stratified by up to six different dimensions of inequality (income, occupation, education, gender, race or ethnicity, and age), distinguishing between 'hard' (behavioural) and 'intermediate' (process or attitudinal) outcomes. Adopting a hypothesis-testing approach, we then assessed which of three competing hypotheses (positive social gradient, negative social gradient, or no gradient) was best supported by each study for each dimension of inequality.
    Results: We plotted the results on a matrix ('harvest plot') for each category of intervention, weighting studies by the methodological criteria and distributing them between the competing hypotheses. These matrices formed part of the analytical process and helped to encapsulate the output, for example by drawing attention to the finding that increasing the price of tobacco products may be more effective in discouraging smoking among people with lower incomes and in lower occupational groups.
    Conclusion: The harvest plot is a novel and useful method for synthesising evidence about the differential effects of population-level interventions. It contributes to the challenge of making best use of all available evidence by incorporating all relevant data. The visual display assists both the process of synthesis and the assimilation of the findings. The method is suitable for adaptation to a variety of questions in evidence synthesis and may be particularly useful for systematic reviews addressing the broader type of research question which may be most relevant to policymakers.
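
    As a rough illustration of the plotting side of the method (with entirely made-up study weights, not data from the review), one row of a harvest plot can be drawn as a set of panels, one per competing hypothesis, in which each supporting study appears as a bar whose height reflects its methodological weight:

```python
# Illustrative harvest-plot row (fictional studies and weights).
import matplotlib.pyplot as plt

hypotheses = ["Positive gradient", "No gradient", "Negative gradient"]
# Methodological weights (e.g. 1-3) of the hypothetical studies supporting each hypothesis.
studies = {
    "Positive gradient": [3, 2],
    "No gradient": [2, 1, 1],
    "Negative gradient": [3],
}

fig, axes = plt.subplots(1, 3, sharey=True, figsize=(8, 2.5))
for ax, hyp in zip(axes, hypotheses):
    weights = studies[hyp]
    ax.bar(range(len(weights)), weights, width=0.6, color="grey")
    ax.set_title(hyp, fontsize=9)
    ax.set_xticks([])
    ax.set_ylim(0, 3)
axes[0].set_ylabel("Methodological weight")
fig.suptitle("Harvest plot row for one dimension of inequality (illustrative)")
plt.tight_layout()
plt.show()
```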