Mixture models and exploratory analysis in networks
Networks are widely used in the biological, physical, and social sciences as
a concise mathematical representation of the topology of systems of interacting
components. Understanding the structure of these networks is one of the
outstanding challenges in the study of complex systems. Here we describe a
general technique for detecting structural features in large-scale network data
which works by dividing the nodes of a network into classes such that the
members of each class have similar patterns of connection to other nodes. Using
the machinery of probabilistic mixture models and the expectation-maximization
algorithm, we show that it is possible to detect, without prior knowledge of
what we are looking for, a very broad range of types of structure in networks.
We give a number of examples demonstrating how the method can be used to shed
light on the properties of real-world networks, including social and
information networks. Comment: 8 pages, 4 figures, two new examples in this version plus minor
correction
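To make the idea of a mixture model fitted by expectation-maximization concrete, here is a minimal sketch in plain Python. It assumes one common formulation, which may differ from the exact equations in the paper: each class r has a prior `pi[r]` and a distribution `theta[r][j]` over which node j a member of that class links to, and `q[i][r]` is the posterior probability that node i belongs to class r. The function name `fit_mixture`, its parameters, and the small smoothing constant `eps` are illustrative assumptions, not taken from the abstract.

```python
import math
import random


def fit_mixture(adj, n_classes, n_iter=50, seed=0):
    """Fit a simple link-pattern mixture model with EM (illustrative sketch).

    adj[i] is the list of nodes that node i links to.
    Returns (q, pi, theta): soft memberships, class priors, and
    per-class link distributions.
    """
    rng = random.Random(seed)
    n = len(adj)
    eps = 1e-12  # smoothing so that log() never sees zero (an assumption)

    # Random initial soft assignment, normalized per node.
    q = []
    for _ in range(n):
        row = [rng.random() + 0.1 for _ in range(n_classes)]
        total = sum(row)
        q.append([v / total for v in row])

    for _ in range(n_iter):
        # M-step: class priors are the average memberships.
        pi = [sum(q[i][r] for i in range(n)) / n for r in range(n_classes)]

        # M-step: theta[r][j] is proportional to the membership-weighted
        # number of links that point at node j.
        theta = []
        for r in range(n_classes):
            weights = [eps] * n
            for i in range(n):
                for j in adj[i]:
                    weights[j] += q[i][r]
            total = sum(weights)
            theta.append([w / total for w in weights])

        # E-step, computed in log space for numerical stability.
        for i in range(n):
            logw = [
                math.log(pi[r] + eps)
                + sum(math.log(theta[r][j]) for j in adj[i])
                for r in range(n_classes)
            ]
            top = max(logw)
            w = [math.exp(v - top) for v in logw]
            total = sum(w)
            q[i] = [v / total for v in w]

    return q, pi, theta


if __name__ == "__main__":
    # Two disjoint triangles, used only as a toy input.
    toy = [[1, 2], [0, 2], [0, 1], [4, 5], [3, 5], [3, 4]]
    memberships, priors, _ = fit_mixture(toy, n_classes=2)
    for node, row in enumerate(memberships):
        print(node, [round(v, 3) for v in row])
```

Because EM only finds a local optimum, the memberships depend on the random initialization; in practice one would likely run it from several seeds and keep the fit with the highest likelihood.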
Semi-Supervised Overlapping Community Finding based on Label Propagation with Pairwise Constraints
Algorithms for detecting communities in complex networks are generally
unsupervised, relying solely on the structure of the network. However, these
methods can often fail to uncover meaningful groupings that reflect the
underlying communities in the data, particularly when those structures are
highly overlapping. One way to improve the usefulness of these algorithms is by
incorporating additional background information, which can be used as a source
of constraints to direct the community detection process. In this work, we
explore the potential of semi-supervised strategies to improve algorithms for
finding overlapping communities in networks. Specifically, we propose a new
method, based on label propagation, for finding communities using a limited
number of pairwise constraints. Evaluations on synthetic and real-world
datasets demonstrate the potential of this approach for uncovering meaningful
community structures in cases where each node can potentially belong to more
than one community. Comment: Fix table
Assessing the association between oral hygiene and preterm birth by quantitative light-induced fluorescence
The aim of this study was to investigate the purported link between oral hygiene and preterm birth by using image analysis tools to quantify dental plaque biofilm. Volunteers (n = 91) attending an antenatal clinic were identified as those considered to be “at high risk” of preterm delivery (i.e., a previous history of idiopathic preterm delivery, case group) or those who were not considered to be at risk (control group). The women had images of their anterior teeth captured using quantitative light-induced fluorescence (QLF). These images were analysed to calculate the amount of red fluorescent plaque (ΔR%) and percentage of plaque coverage. QLF showed little difference in ΔR% between the two groups, 65.00% case versus 68.70% control, whereas there was a 19.29% difference with regard to the mean plaque coverage, 25.50% case versus 20.58% control. A logistic regression model showed a significant association between plaque coverage and case/control status (P = 0.031), controlling for other potential predictor variables, namely, smoking status, maternal age, and body mass index (BMI).
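The abstract does not say how the 19.29% figure is obtained. As a check, and assuming it is a relative difference taken with respect to the case-group mean (an assumption, not stated in the source), the reported plaque-coverage values reproduce it:

```latex
\frac{25.50 - 20.58}{25.50} = \frac{4.92}{25.50} \approx 0.1929 = 19.29\%
```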
Community Structure in Time-Dependent, Multiscale, and Multiplex Networks
Network science is an interdisciplinary endeavor, with methods and
applications drawn from across the natural, social, and information sciences. A
prominent problem in network science is the algorithmic detection of
tightly-connected groups of nodes known as communities. We developed a
generalized framework of network quality functions that allowed us to study the
community structure of arbitrary multislice networks, which are combinations of
individual networks coupled through links that connect each node in one network
slice to itself in other slices. This framework allows one to study community
structure in a very general setting encompassing networks that evolve over
time, have multiple types of links (multiplexity), and have multiple scales. Comment: 31 pages, 3 figures, 1 table. Includes main text and supporting
material. This is the accepted version of the manuscript (the definitive
version appeared in Science), with typographical corrections included here
Distributed Community Detection in Dynamic Graphs
Inspired by the increasing interest in self-organizing social opportunistic
networks, we investigate the problem of distributed detection of unknown
communities in dynamic random graphs. As a formal framework, we consider the
dynamic version of the well-studied Planted Bisection Model
G(n,p,q), where the node set of the network is partitioned into two
unknown communities and, at every time step, each possible edge is
active with probability p if both nodes belong to the same community, while
it is active with probability q (with q < p) otherwise. We also consider a
time-Markovian generalization of this model.
We propose a distributed protocol based on the popular Label
Propagation Algorithm and prove that, when the ratio p/q is larger than
[…] (for an arbitrarily small constant […]), the protocol finds the right
"planted" partition in time […] even when the snapshots of the dynamic
graph are sparse and disconnected (i.e. in the case […]). Comment: Version I
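The model described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the two edge probabilities are the p (same community) and q (different communities) of G(n,p,q). The helper names `sample_snapshot` and `propagate_once` are hypothetical, and the propagation step is a generic synchronous majority rule, not the distributed protocol analysed in the paper.

```python
import random
from collections import Counter


def sample_snapshot(communities, p, q, rng):
    """Sample the active edges for one time step.

    communities[u] is the (hidden) community of node u. Each possible
    edge is active independently: with probability p inside a community,
    with probability q across communities.
    """
    n = len(communities)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if communities[u] == communities[v] else q
            if rng.random() < prob:
                edges.append((u, v))
    return edges


def propagate_once(labels, edges):
    """One synchronous step of a generic label-propagation rule.

    Every node adopts the most frequent label among its currently
    active neighbours (smallest label on ties) and keeps its own
    label if it has no active neighbour.
    """
    seen = {u: [] for u in range(len(labels))}
    for u, v in edges:
        seen[u].append(labels[v])
        seen[v].append(labels[u])
    new_labels = []
    for u in range(len(labels)):
        if not seen[u]:
            new_labels.append(labels[u])
            continue
        counts = Counter(seen[u])
        best = max(counts.values())
        new_labels.append(min(l for l, c in counts.items() if c == best))
    return new_labels


if __name__ == "__main__":
    rng = random.Random(1)
    hidden = [0] * 10 + [1] * 10        # planted partition (toy size)
    labels = list(range(len(hidden)))    # every node starts with its own label
    for _ in range(20):
        # A fresh snapshot is drawn at every time step.
        labels = propagate_once(labels, sample_snapshot(hidden, 0.5, 0.02, rng))
    print(labels)
```

Drawing a new snapshot at each step is what makes the graph dynamic: a node's neighbourhood changes from round to round, so a single sparse or disconnected snapshot does not by itself prevent labels from spreading over time.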
Distance, dissimilarity index, and network community structure
We address the question of finding the community structure of a complex
network. In an earlier effort [H. Zhou, Phys. Rev. E (2003)], the concept
of network random walking is introduced and a distance measure defined. Here we
calculate, based on this distance measure, the dissimilarity index between
nearest-neighboring vertices of a network and design an algorithm to partition
these vertices into communities that are hierarchically organized. Each
community is characterized by an upper and a lower dissimilarity threshold. The
algorithm is applied to several artificial and real-world networks, and
excellent results are obtained. In the case of artificially generated random
modular networks, this method outperforms the algorithm based on the concept of
edge betweenness centrality. For yeast's protein-protein interaction network,
we are able to identify many clusters that have well defined biological
functions. Comment: 10 pages, 7 figures, REVTeX4 format
Exploiting Resolution-based Representations for MaxSAT Solving
Most recent MaxSAT algorithms rely on a succession of calls to a SAT solver
in order to find an optimal solution. In particular, several algorithms take
advantage of the ability of SAT solvers to identify unsatisfiable subformulas.
Usually, these MaxSAT algorithms perform better when small unsatisfiable
subformulas are found early. However, this is not the case in many problem
instances, since the whole formula is given to the SAT solver in each call. In
this paper, we propose to partition the MaxSAT formula using a resolution-based
graph representation. Partitions are then iteratively joined by using a
proximity measure extracted from the graph representation of the formula. The
algorithm ends when only one partition remains and the optimal solution is
found. Experimental results show that this new approach further enhances a
state-of-the-art MaxSAT solver, allowing it to optimally solve a larger set of
industrial problem instances.
Managing clustering effects and learning effects in the design and analysis of multicentre randomised trials: a survey to establish current practice.
BACKGROUND: Patient outcomes can depend on the treating centre, or health professional, delivering the intervention. A health professional's skill in delivery improves with experience, meaning that outcomes may be associated with learning. Considering differences in intervention delivery at trial design will ensure that any appropriate adjustments can be made during analysis. This work aimed to establish practice for the allowance of clustering and learning effects in the design and analysis of randomised multicentre trials. METHODS: A survey that drew upon quotes from existing guidelines, references to relevant publications and example trial scenarios was delivered. Registered UK Clinical Research Collaboration Registered Clinical Trials Units were invited to participate. RESULTS: Forty-four Units participated (N = 50). Clustering was managed through design by stratification, more commonly by centre than by treatment provider. Managing learning by design through defining a minimum expertise level for treatment provider was common (89%). One-third reported experience in expertise-based designs. The majority of Units had adjusted for clustering during analysis, although approaches varied. Analysis of learning was rarely performed for the main analysis (n = 1), although it was explored by other means. The insight behind the approaches used within and reasons for, or against, alternative approaches were provided. CONCLUSIONS: Widespread awareness of challenges in designing and analysing multicentre trials is identified. Approaches used, and opinions on these, vary both across and within Units, indicating that approaches are dependent on the type of trial. Agreeing principles to guide trial design and analysis across a range of realistic clinical scenarios should be considered.