    Identifying overlapping terrorist cells from the Noordin Top actor-event network

    Actor-event data are common in sociological settings, whereby one registers the pattern of attendance of a group of social actors at a number of events. We focus on 79 members of the Noordin Top terrorist network, who were monitored attending 45 events. The attendance or non-attendance of the terrorists at events defines the social fabric, such as group coherence and social communities. The aim of the analysis of such data is to learn about the affiliation structure. Actor-event data are often transformed to actor-actor data in order to be further analysed by network models, such as stochastic block models. This transformation and such analyses lead to a natural loss of information, particularly when one is interested in identifying, possibly overlapping, subgroups or communities of actors on the basis of their attendance at events. In this paper we propose an actor-event model for overlapping communities of terrorists, which simplifies interpretation of the network. We propose a mixture model with overlapping clusters for the analysis of the binary actor-event network data, called manet, and develop a Bayesian procedure for inference. After a simulation study, we show how this analysis of the terrorist network has clear interpretative advantages over the more traditional approaches of affiliation network analysis. Comment: 24 pages, 5 figures; related R package (manet) available on CRA
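
    The loss of information mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common form of the transformation, in which two actors are tied if they attended at least one event together; the function name and the toy attendance matrices are hypothetical and are not taken from the paper or from the manet package.

```python
def to_actor_actor(attendance):
    """Binary one-mode projection of an actor-event matrix.

    attendance[i][e] is 1 if actor i attended event e, else 0.
    Returns an actor-actor adjacency matrix with a 1 at (i, j) when
    actors i and j share at least one event (diagonal left at 0).
    """
    n = len(attendance)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and any(a and b for a, b in zip(attendance[i], attendance[j])):
                adjacency[i][j] = 1
    return adjacency


# Hypothetical toy data: three actors who all attend one joint event ...
together = [[1], [1], [1]]
# ... versus three actors who only ever meet in pairs, at three separate events.
pairwise = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]

# Both attendance patterns collapse to the same actor-actor network.
same_projection = to_actor_actor(together) == to_actor_actor(pairwise)
```

    Under this assumed projection, one group meeting and three pairwise meetings become indistinguishable, which is the kind of detail an analysis of the original actor-event data can retain.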

    Bayesian Structural Learning with Parametric Marginals for Count Data: An Application to Microbiota Systems

    High dimensional and heterogeneous count data are collected in various applied fields. In this paper, we look closely at high-resolution sequencing data on the microbiome, which have enabled researchers to study the genomes of entire microbial communities. Revealing the underlying interactions between these communities is of vital importance to learn how microbes influence human health. To perform structural learning from multivariate count data such as these, we develop a novel Gaussian copula graphical model with two key elements. Firstly, we employ parametric regression to characterize the marginal distributions. This step is crucial for accommodating the impact of external covariates. Neglecting this adjustment could potentially introduce distortions in the inference of the underlying network of dependences. Secondly, we advance a Bayesian structure learning framework, based on a computationally efficient search algorithm that is suited to high dimensionality. The approach returns simultaneous inference of the marginal effects and of the dependence structure, including graph uncertainty estimates. A simulation study and a real data analysis of microbiome data highlight the applicability of the proposed approach at inferring networks from multivariate count data in general, and its relevance to microbiome analyses in particular. The proposed method is implemented in the R package BDgraph.

    The network structure of cultural distances

    This paper proposes a novel measure of cultural distances between countries. Making use of the information coming from the World Values Survey (Wave 6), and considering the interdependence among cultural traits, the paper proposes a methodology to define the cultural distance between countries that takes into account the network structure of national cultural traits. Exploiting the possibilities offered by copula graphical models for ordinal and categorical data, the paper infers the network structure of 54 countries and proposes a new summary measure of national cultural distances. The DBRV Cultural Distance index shows that, as of 2010-2014, compared to Inglehart and Welzel (2005) the world appears to be more culturally heterogeneous than previously thought. Comment: 64 pages, 67 figures, 4 tables

    Latent event history models for quasi-reaction systems

    Various processes can be modelled as quasi-reaction systems of stochastic differential equations, such as cell differentiation and disease spreading. Since the underlying data of particle interactions, such as reactions between proteins or contacts between people, are typically unobserved, statistical inference of the parameters driving these systems is developed from concentration data measuring each unit in the system over time. While observing the continuous time process at a time scale as fine as possible should in theory help with parameter estimation, the existing Local Linear Approximation (LLA) methods fail in this case, due to numerical instability caused by small changes of the system at successive time points. On the other hand, one may be able to reconstruct the underlying unobserved interactions from the observed count data. Motivated by this, we first formalise the latent event history model underlying the observed count process. We then propose a computationally efficient Expectation-Maximization algorithm for parameter estimation, with an extended Kalman filtering procedure for the prediction of the latent states. A simulation study shows the performance of the proposed method and highlights the settings where it is particularly advantageous compared to the existing LLA approaches. Finally, we present an illustration of the methodology on the spread of the COVID-19 pandemic in Italy.

    Hospital Quality Interdependence in a Competitive Institutional Environment: Evidence from Italy

    In this paper we explore the geographical scope of hospital competition on quality, using Italian data on over 207,000 patients admitted to 174 hospitals located in the Lombardy region in the years 2008–2014. We propose an economic framework that incorporates both local and global forms of quality competition among hospitals, the latter emerging from periodically released hospital performance rankings. Under this framework, we derive the hospital reaction functions and, accordingly, we characterize the structure of interdependence among hospital qualities. We employ recent methods from the graphical modelling literature to estimate the set of local rivals for each hospital, as well as the degree of global interdependence among hospitals. Consistent with our micro-founded framework, our results show a significant positive degree of short- and long-range dependence, suggesting the existence of forms of local and global competition amongst hospitals, with relevant implications for health care policy.

    Mixtures of multivariate generalized linear models with overlapping clusters

    With the advent of ubiquitous monitoring and measurement protocols, studies have started to focus more and more on complex, multivariate and heterogeneous datasets. In such studies, multivariate response variables are drawn from a heterogeneous population, often in the presence of additional covariate information. In order to deal with this intrinsic heterogeneity, regression analyses have to be clustered for different groups of units. Up until now, mixture model approaches have assigned units to distinct and non-overlapping groups. However, these units often exhibit more complex organization and clustering. It is our aim to define a mixture of generalized linear models with overlapping clusters of units. This crucially involves an overlap function that maps the coefficients of the parent clusters into the coefficients of the multiple allocation units. We present a computationally efficient MCMC scheme that samples the posterior distribution of the parameters in the model. An example on a two-mode network study shows details of the implementation in the case of a multivariate probit regression setting. A simulation study shows the overall performance of the method, whereas an illustration of the voting behaviour on the US Supreme Court shows how the 9 justices split into two overlapping sets of justices. Comment: 24 pages, 3 figures
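
    The role of an overlap function, as described in the abstract above, can be sketched as follows. The abstract does not say which function the authors use, so the two choices here (element-wise sum and element-wise mean of the parent coefficients) are assumptions made purely for illustration, as are all function names and numbers; the probit link is used only because the abstract mentions a probit setting.

```python
import math


def overlap_sum(parent_coefs):
    """Assumed overlap function: element-wise sum of the parent clusters' coefficients."""
    return [sum(column) for column in zip(*parent_coefs)]


def overlap_mean(parent_coefs):
    """Assumed overlap function: element-wise mean of the parent clusters' coefficients."""
    return [sum(column) / len(parent_coefs) for column in zip(*parent_coefs)]


def unit_coefficients(allocation, cluster_coefs, overlap):
    """Coefficients for one unit, given its binary allocation vector over clusters."""
    parents = [cluster_coefs[k] for k, member in enumerate(allocation) if member == 1]
    return overlap(parents)


def probit_prob(linear_predictor):
    """Standard normal CDF, i.e. the success probability under a probit link."""
    return 0.5 * (1.0 + math.erf(linear_predictor / math.sqrt(2.0)))


# Hypothetical coefficients for two clusters (intercept, slope).
cluster_coefs = [[0.5, -1.0], [1.5, 1.0]]

single = unit_coefficients([1, 0], cluster_coefs, overlap_mean)      # belongs to cluster 0 only
both_mean = unit_coefficients([1, 1], cluster_coefs, overlap_mean)   # belongs to both clusters
both_sum = unit_coefficients([1, 1], cluster_coefs, overlap_sum)

# Success probability for a unit in both clusters with covariates x = (1, 2).
x = [1.0, 2.0]
eta = sum(b * v for b, v in zip(both_mean, x))
p = probit_prob(eta)
```

    A unit allocated to a single cluster simply inherits that cluster's coefficients, while a multiply allocated unit receives coefficients built from all of its parent clusters; which combination rule is appropriate is a modelling decision that this sketch does not settle.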