1,255 research outputs found

Mixture models and exploratory analysis in networks

Author: E. A. Leicht
Girvan
Jones
M. E. J. Newman
Milo
Newman
Palla
Pastor-Satorras
Reichardt
Watts
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 15/11/2006

Networks are widely used in the biological, physical, and social sciences as a concise mathematical representation of the topology of systems of interacting components. Understanding the structure of these networks is one of the outstanding challenges in the study of complex systems. Here we describe a general technique for detecting structural features in large-scale network data which works by dividing the nodes of a network into classes such that the members of each class have similar patterns of connection to other nodes. Using the machinery of probabilistic mixture models and the expectation-maximization algorithm, we show that it is possible to detect, without prior knowledge of what we are looking for, a very broad range of types of structure in networks. We give a number of examples demonstrating how the method can be used to shed light on the properties of real-world networks, including social and information networks. Comment: 8 pages, 4 figures, two new examples in this version plus minor correction

arXiv.org e-Print Archive

Crossref

PubMed Central

Oxford University Research Archive

CERN Document Server

Semi-Supervised Overlapping Community Finding based on Label Propagation with Pairwise Constraints

Author: A Amelio
A Clauset
A Lancichinetti
D Liu
M Girvan
ME Newman
S Fortunato
V Blondel
YY Ahn
ZY Zhang
Publication date: 17/10/2018

Algorithms for detecting communities in complex networks are generally unsupervised, relying solely on the structure of the network. However, these methods can often fail to uncover meaningful groupings that reflect the underlying communities in the data, particularly when those structures are highly overlapping. One way to improve the usefulness of these algorithms is by incorporating additional background information, which can be used as a source of constraints to direct the community detection process. In this work, we explore the potential of semi-supervised strategies to improve algorithms for finding overlapping communities in networks. Specifically, we propose a new method, based on label propagation, for finding communities using a limited number of pairwise constraints. Evaluations on synthetic and real-world datasets demonstrate the potential of this approach for uncovering meaningful community structures in cases where each node can potentially belong to more than one community. Comment: Fix table

arXiv.org e-Print Archive

Crossref

Research Repository UCD

Assessing the association between oral hygiene and preterm birth by quantitative light-induced fluorescence

Author: Adeyemi Adejumoke A.
Burnside Girvan
Higham Susan M.
Hope Christopher K.
Quenby Siobhan
Smith Philip W.
Wang Qian
Whitworth Melissa
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

The aim of this study was to investigate the purported link between oral hygiene and preterm birth by using image analysis tools to quantify dental plaque biofilm. Volunteers (n = 91) attending an antenatal clinic were identified as those considered to be “at high risk” of preterm delivery (i.e., a previous history of idiopathic preterm delivery, case group) or those who were not considered to be at risk (control group). The women had images of their anterior teeth captured using quantitative light-induced fluorescence (QLF). These images were analysed to calculate the amount of red fluorescent plaque (ΔR%) and percentage of plaque coverage. QLF showed little difference in ΔR% between the two groups, 65.00% case versus 68.70% control, whereas there was a 19.29% difference with regard to the mean plaque coverage, 25.50% case versus 20.58% control. A logistic regression model showed a significant association between plaque coverage and case/control status (P = 0.031), controlling for other potential predictor variables, namely, smoking status, maternal age, and body mass index (BMI).

University of Liverpool Repository

Crossref

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

The University of Manchester - Institutional Repository

Community Structure in Time-Dependent, Multiscale, and Multiplex Networks

Author: Barber
Fenn
Girvan
J.-P. Onnela
K. Macon
Leicht
M. A. Porter
Newman
P. J. Mucha
Palla
Reichardt
Richardson
T. Richardson
Traag
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 01/01/2010

Network science is an interdisciplinary endeavor, with methods and applications drawn from across the natural, social, and information sciences. A prominent problem in network science is the algorithmic detection of tightly-connected groups of nodes known as communities. We developed a generalized framework of network quality functions that allowed us to study the community structure of arbitrary multislice networks, which are combinations of individual networks coupled through links that connect each node in one network slice to itself in other slices. This framework allows one to study community structure in a very general setting encompassing networks that evolve over time, have multiple types of links (multiplexity), and have multiple scales. Comment: 31 pages, 3 figures, 1 table. Includes main text and supporting material. This is the accepted version of the manuscript (the definitive version appeared in Science), with typographical corrections included here

arXiv.org e-Print Archive

CiteSeerX

Crossref

OpenSIUC

Oxford University Research Archive

Distributed Community Detection in Dynamic Graphs

Author: A. Chaintreau
A. Condon
A.E. Clementi
G. Cordasco
H. Baumann
J. Whitbeck
K. Kothapalli
M. Dyer
M. Girvan
P.W. Holland
S. Boccaletti
T.N. Bui
X. Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Inspired by the increasing interest in self-organizing social opportunistic networks, we investigate the problem of distributed detection of unknown communities in dynamic random graphs. As a formal framework, we consider the dynamic version of the well-studied Planted Bisection Model $G(n,p,q)$, where the node set $[n]$ of the network is partitioned into two unknown communities and, at every time step, each possible edge $(u,v)$ is active with probability $p$ if both nodes belong to the same community, while it is active with probability $q$ (with $q \ll p$) otherwise. We also consider a time-Markovian generalization of this model. We propose a distributed protocol based on the popular Label Propagation Algorithm and prove that, when the ratio $p/q$ is larger than $n^{b}$ (for an arbitrarily small constant $b>0$), the protocol finds the right "planted" partition in $O(\log n)$ time even when the snapshots of the dynamic graph are sparse and disconnected (i.e. in the case $p=\Theta(1/n)$). Comment: Version I
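
The edge model described in this abstract can be illustrated with a short sketch. The following Python code samples a single snapshot of a planted bisection graph: each same-community pair is active with probability p and each cross-community pair with probability q. The function name `sample_snapshot`, the choice of placing the first half of the nodes in one community, and the parameter values are assumptions made for illustration; this is not the authors' protocol and it does not implement label propagation.

```python
import random


def sample_snapshot(n, p, q, rng):
    """Sample one snapshot of a planted bisection graph on nodes 0..n-1.

    Assumption for illustration: nodes 0..n//2-1 form community 0 and the
    remaining nodes form community 1.
    """
    half = n // 2
    community = [0 if u < half else 1 for u in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Same-community pairs are active with probability p,
            # cross-community pairs with probability q.
            prob = p if community[u] == community[v] else q
            if rng.random() < prob:
                edges.append((u, v))
    return community, edges


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same graph
    community, edges = sample_snapshot(n=40, p=0.5, q=0.01, rng=rng)
    inside = sum(1 for u, v in edges if community[u] == community[v])
    print(f"{inside} intra-community edges, {len(edges) - inside} inter-community edges")
```

Calling the function once per time step with fresh randomness would give a sequence of independent snapshots; the time-Markovian generalization mentioned in the abstract is not covered by this sketch.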

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

MPG.PuRe

Hal-Diderot

Distance, dissimilarity index, and network community structure

Author: A. Bairoch
C. von Mering
C.M. Deane
E. Ravasz
H. Zhou
H.W. Mewes
Haijun Zhou
I. Xenarios
L.C. Freeman
M. Girvan
P. Uetz
W.W. Zachary
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2003

We address the question of finding the community structure of a complex network. In an earlier effort [H. Zhou, Phys. Rev. E (2003)], the concept of network random walking is introduced and a distance measure defined. Here we calculate, based on this distance measure, the dissimilarity index between nearest-neighboring vertices of a network and design an algorithm to partition these vertices into communities that are hierarchically organized. Each community is characterized by an upper and a lower dissimilarity threshold. The algorithm is applied to several artificial and real-world networks, and excellent results are obtained. In the case of artificially generated random modular networks, this method outperforms the algorithm based on the concept of edge betweenness centrality. For yeast's protein-protein interaction network, we are able to identify many clusters that have well defined biological functions. Comment: 10 pages, 7 figures, REVTeX4 format

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Exploiting Resolution-based Representations for MaxSAT Solving

Author: A Gelder Van
A Morgado
C Ansótegui
M Girvan
M Janota
M Järvisalo
M Koshimura
O Bailleux
O Kullmann
R Asín
R Martins
RA Yates
V Blondel
Z Fu
Publication date: 10/05/2015

Most recent MaxSAT algorithms rely on a succession of calls to a SAT solver in order to find an optimal solution. In particular, several algorithms take advantage of the ability of SAT solvers to identify unsatisfiable subformulas. Usually, these MaxSAT algorithms perform better when small unsatisfiable subformulas are found early. However, this is not the case in many problem instances, since the whole formula is given to the SAT solver in each call. In this paper, we propose to partition the MaxSAT formula using a resolution-based graph representation. Partitions are then iteratively joined by using a proximity measure extracted from the graph representation of the formula. The algorithm ends when only one partition remains and the optimal solution is found. Experimental results show that this new approach further enhances a state-of-the-art MaxSAT solver to optimally solve a larger set of industrial problem instances.

arXiv.org e-Print Archive

Crossref

Managing clustering effects and learning effects in the design and analysis of multicentre randomised trials: a survey to establish current practice.

Author: Blazeby Jane M
Burnside Girvan
Conroy Elizabeth J
Cook Jonathan A
Gamble Carrol
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 08/09/2019

BACKGROUND: Patient outcomes can depend on the treating centre, or health professional, delivering the intervention. A health professional's skill in delivery improves with experience, meaning that outcomes may be associated with learning. Considering differences in intervention delivery at trial design will ensure that any appropriate adjustments can be made during analysis. This work aimed to establish practice for the allowance of clustering and learning effects in the design and analysis of randomised multicentre trials. METHODS: A survey that drew upon quotes from existing guidelines, references to relevant publications and example trial scenarios was delivered. Registered UK Clinical Research Collaboration Registered Clinical Trials Units were invited to participate. RESULTS: Forty-four Units participated (N = 50). Clustering was managed through design by stratification, more commonly by centre than by treatment provider. Managing learning by design through defining a minimum expertise level for treatment provider was common (89%). One-third reported experience in expertise-based designs. The majority of Units had adjusted for clustering during analysis, although approaches varied. Analysis of learning was rarely performed for the main analysis (n = 1), although it was explored by other means. The insight behind the approaches used within and reasons for, or against, alternative approaches were provided. CONCLUSIONS: Widespread awareness of challenges in designing and analysing multicentre trials is identified. Approaches used, and opinions on these, vary both across and within Units, indicating that approaches are dependent on the type of trial. Agreeing principles to guide trial design and analysis across a range of realistic clinical scenarios should be considered.

University of Liverpool Repository

Oxford University Research Archive

Explore Bristol Research