Identifying overlapping terrorist cells from the Noordin Top actor-event network
Actor-event data are common in sociological settings, whereby one registers
the pattern of attendance of a group of social actors to a number of events. We
focus on 79 members of the Noordin Top terrorist network, who were monitored
attending 45 events. The attendance or non-attendance of the terrorists at
events defines the social fabric, such as group coherence and social
communities. The aim of the analysis of such data is to learn about the
affiliation structure. Actor-event data are often transformed to actor-actor
data in order to be further analysed by network models, such as stochastic
block models. This transformation and such analyses lead to a natural loss of
information, particularly when one is interested in identifying, possibly
overlapping, subgroups or communities of actors on the basis of their
attendance at events. In this paper we propose an actor-event model for
overlapping communities of terrorists, which simplifies interpretation of the
network. We propose a mixture model with overlapping clusters for the analysis
of the binary actor-event network data, called manet, and develop a
Bayesian procedure for inference. After a simulation study, we show how this
analysis of the terrorist network has clear interpretative advantages over the
more traditional approaches of affiliation network analysis. Comment: 24 pages, 5 figures; related R package (manet) available on CRAN
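The information loss mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper or from the manet package: the function name `project` and the two toy attendance matrices are hypothetical, and the projection is the simplest binary one (two actors are tied if they share at least one event).

```python
from itertools import combinations


def project(attendance):
    """Binary actor-actor projection of an actor-event matrix.

    attendance[i][e] is 1 if actor i attended event e, else 0.
    Returns the set of actor pairs (i, j), i < j, that share at least one event.
    This is an illustrative projection, not the method of any specific package.
    """
    ties = set()
    for i, j in combinations(range(len(attendance)), 2):
        shared = sum(a * b for a, b in zip(attendance[i], attendance[j]))
        if shared > 0:
            ties.add((i, j))
    return ties


# Hypothetical toy data: three actors, three events.
# Case 1: all three actors meet together at a single event.
joint_meeting = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
]
# Case 2: the actors only ever meet in pairs, at three separate events.
pairwise_meetings = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]

same_projection = project(joint_meeting) == project(pairwise_meetings)
print(same_projection)  # True
```

Both attendance patterns collapse to the same fully connected actor-actor network, even though one describes a single three-person gathering and the other three separate pairwise meetings. That distinction is only available to a model that works on the actor-event data directly.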
Bayesian Structural Learning with Parametric Marginals for Count Data: An Application to Microbiota Systems
High dimensional and heterogeneous count data are collected in various
applied fields. In this paper, we look closely at high-resolution sequencing
data on the microbiome, which have enabled researchers to study the genomes of
entire microbial communities. Revealing the underlying interactions between
these communities is of vital importance to learn how microbes influence human
health. To perform structural learning from multivariate count data such as
these, we develop a novel Gaussian copula graphical model with two key
elements. Firstly, we employ parametric regression to characterize the marginal
distributions. This step is crucial for accommodating the impact of external
covariates. Neglecting this adjustment could potentially introduce distortions
in the inference of the underlying network of dependences. Secondly, we advance
a Bayesian structure learning framework, based on a computationally efficient
search algorithm that is suited to high dimensionality. The approach returns
simultaneous inference of the marginal effects and of the dependence structure,
including graph uncertainty estimates. A simulation study and a real data
analysis of microbiome data highlight the applicability of the proposed
approach at inferring networks from multivariate count data in general, and its
relevance to microbiome analyses in particular. The proposed method is
implemented in the R package BDgraph.
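As a rough illustration of how a parametric marginal links a count to the latent Gaussian scale of a copula model, the sketch below assumes a Poisson marginal with a log link and a single covariate. These choices, and the names `poisson_cdf`, `latent_interval`, `beta0` and `beta1`, are assumptions made for the example; they are not taken from the paper or from the BDgraph API. For discrete data, an observed count y corresponds to an interval of latent values bounded by the normal quantiles of F(y - 1) and F(y).

```python
import math
from statistics import NormalDist


def poisson_cdf(y, mu):
    """P(Y <= y) for Y ~ Poisson(mu), computed by direct summation."""
    if y < 0:
        return 0.0
    term = math.exp(-mu)  # P(Y = 0)
    total = term
    for k in range(1, y + 1):
        term *= mu / k  # P(Y = k) from P(Y = k - 1)
        total += term
    return min(total, 1.0)


def latent_interval(y, x, beta0, beta1):
    """Interval on the standard normal scale implied by count y.

    Assumes (for illustration only) a Poisson marginal with
    log-mean beta0 + beta1 * x, where x is an external covariate.
    """
    mu = math.exp(beta0 + beta1 * x)
    lower_p = poisson_cdf(y - 1, mu)
    upper_p = poisson_cdf(y, mu)
    std_normal = NormalDist()
    # inv_cdf is only defined on the open interval (0, 1).
    lower = -math.inf if lower_p <= 0.0 else std_normal.inv_cdf(lower_p)
    upper = math.inf if upper_p >= 1.0 else std_normal.inv_cdf(upper_p)
    return lower, upper


# The same count maps to different latent intervals under different covariates.
print(latent_interval(2, 0.0, 0.0, 1.0))
print(latent_interval(2, 1.0, 0.0, 1.0))
```

The two printed intervals differ because the covariate shifts the marginal mean: a count of 2 is fairly high when the mean is 1 but low when the mean is e. Ignoring the covariate would place both observations on the same latent scale, which is the kind of distortion the abstract warns about.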
The network structure of cultural distances
This paper proposes a novel measure of cultural distances between countries.
Making use of the information coming from the World Values Survey (Wave 6), and
considering the interdependence among cultural traits, the paper proposes a
methodology to define the cultural distance between countries, that takes into
account the network structure of national cultural traits. Exploiting the
possibilities offered by Copula graphical models for ordinal and categorical
data, the paper infers the network structure of 54 countries and proposes a new
summary measure of national cultural distances. The DBRV Cultural Distance
index shows that, as of 2010-2014 and compared with Inglehart and Welzel (2005),
the world appears to be more culturally heterogeneous than was previously
thought. Comment: 64 pages, 67 figures, 4 tables
Latent event history models for quasi-reaction systems
Various processes can be modelled as quasi-reaction systems of stochastic
differential equations, such as cell differentiation and disease spreading.
Since the underlying data of particle interactions, such as reactions between
proteins or contacts between people, are typically unobserved, statistical
inference of the parameters driving these systems is developed from
concentration data measuring each unit in the system over time. While observing
the continuous time process at a time scale as fine as possible should in
theory help with parameter estimation, the existing Local Linear Approximation
(LLA) methods fail in this case, due to numerical instability caused by small
changes of the system at successive time points. On the other hand, one may be
able to reconstruct the underlying unobserved interactions from the observed
count data. Motivated by this, we first formalise the latent event history
model underlying the observed count process. We then propose a computationally
efficient Expectation-Maximization algorithm for parameter estimation, with an
extended Kalman filtering procedure for the prediction of the latent states. A
simulation study shows the performance of the proposed method and highlights
the settings where it is particularly advantageous compared to the existing LLA
approaches. Finally, we present an illustration of the methodology on the
spread of the COVID-19 pandemic in Italy.
Hospital Quality Interdependence in a Competitive Institutional Environment: Evidence from Italy
In this paper we explore the geographical scope of hospital competition on quality, using Italian data on over 207,000 patients admitted to 174 hospitals located in the Lombardy region in the years 2008–2014. We propose an economic framework that incorporates both local and global forms of quality competition among hospitals, the latter emerging from periodically released hospital performance rankings. Under this framework, we derive the hospital reaction functions and, accordingly, we characterize the structure of interdependence among hospital qualities. We employ recent methods from the graphical modelling literature to estimate the set of local rivals for each hospital, as well as the degree of global interdependence among hospitals. Consistent with our micro-founded framework, our results show a significant positive degree of short- and long-range dependence, suggesting the existence of forms of local and global competition amongst hospitals with relevant implications for health care policy.
Mixtures of multivariate generalized linear models with overlapping clusters
With the advent of ubiquitous monitoring and measurement protocols, studies
have started to focus more and more on complex, multivariate and heterogeneous
datasets. In such studies, multivariate response variables are drawn from a
heterogeneous population often in the presence of additional covariate
information. In order to deal with this intrinsic heterogeneity, regression
analyses have to be clustered for different groups of units. Up until now,
mixture model approaches assigned units to distinct and non-overlapping groups.
However, these units often exhibit more complex organization and
clustering. It is our aim to define a mixture of generalized linear models with
overlapping clusters of units. Crucially, this involves an overlap function
that maps the coefficients of the parent clusters into the coefficients of
the multiple allocation units. We present a computationally efficient MCMC
scheme that samples the posterior distribution of the parameters in the model.
An example on a two-mode network study shows details of the implementation in
the case of a multivariate probit regression setting. A simulation study shows
the overall performance of the method, whereas an illustration of the voting
behaviour on the US Supreme Court shows how the 9 justices split into two
overlapping sets of justices. Comment: 24 pages, 3 figures
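The role of an overlap function, as described in the abstract above, can be sketched as follows. The abstract does not say which function the authors use, so the element-wise mean here is purely a stand-in, and the names `overlap_mean` and `unit_coefficients` are hypothetical.

```python
def overlap_mean(parent_coefs):
    """Hypothetical overlap function: element-wise mean of parent coefficients."""
    n_parents = len(parent_coefs)
    return [sum(column) / n_parents for column in zip(*parent_coefs)]


def unit_coefficients(allocation, cluster_coefs, overlap=overlap_mean):
    """Coefficients for one unit given its (possibly multiple) cluster allocation.

    allocation[k] is 1 if the unit belongs to cluster k, else 0.
    cluster_coefs[k] is the coefficient vector of cluster k.
    Units in exactly one cluster inherit that cluster's coefficients;
    units in several clusters get the overlap of their parent clusters.
    Rejecting unallocated units is an assumption made for this sketch.
    """
    parents = [coefs for member, coefs in zip(allocation, cluster_coefs) if member]
    if not parents:
        raise ValueError("unit must belong to at least one cluster")
    if len(parents) == 1:
        return list(parents[0])
    return overlap(parents)


# Hypothetical coefficients for three clusters, two regression coefficients each.
cluster_coefs = [[0.0, 2.0], [1.0, 4.0], [5.0, 5.0]]

print(unit_coefficients([0, 0, 1], cluster_coefs))  # [5.0, 5.0]
print(unit_coefficients([1, 1, 0], cluster_coefs))  # [0.5, 3.0]
```

Passing the overlap function as an argument keeps the sketch neutral about its form: any function that maps a list of parent coefficient vectors to a single vector could be substituted.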