
156,984 research outputs found

Model-based clustering for populations of networks

Author: Signorelli Mirko
Wit Ernst
Publication venue: 'SAGE Publications'
Publication date: 01/01/2020

Until recently, obtaining data on populations of networks was rare. However, with the advancement of automatic monitoring devices and the growing social and scientific interest in networks, such data has become more widely available. From sociological experiments involving cognitive social structures to fMRI scans revealing large-scale brain networks of groups of patients, there is a growing awareness that we urgently need tools to analyse populations of networks and particularly to model the variation between networks due to covariates. We propose a model-based clustering method based on mixtures of generalized linear (mixed) models that can be employed to describe the joint distribution of a population of networks in a parsimonious manner and to identify subpopulations of networks that share certain topological properties of interest (degree distribution, community structure, effect of covariates on the presence of an edge, etc.). Maximum likelihood estimation for the proposed model can be efficiently carried out with an implementation of the EM algorithm. We assess the performance of this method on simulated data and conclude with an example application on advice networks in a small business. Comment: The final (published) version of the article can be downloaded for free (Open Access) from the editor's website (click on the DOI link below).

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Leiden University Scholarly Publications

Dissertations of the University of Groningen
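
The abstract of "Model-based clustering for populations of networks" describes EM estimation for a mixture of generalized linear models on networks. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes the simplest possible component model: an intercept-only Bernoulli model in which every edge of a network in cluster j is present independently with probability p_j, so each network is summarised by its edge count. The function name `em_network_mixture` and all parameter names are hypothetical.

```python
import math


def em_network_mixture(edge_counts, n_dyads, k=2, n_iter=100):
    """EM for a mixture of intercept-only Bernoulli edge models (a sketch).

    edge_counts[i] is the number of edges in network i and n_dyads is the
    number of possible edges per network. These are simplifying
    assumptions made for illustration only.
    """
    eps = 1e-9
    lo, hi = min(edge_counts), max(edge_counts)
    # Spread the initial edge probabilities over the observed range.
    dens = [
        min(max((lo + (hi - lo) * (j + 0.5) / k) / n_dyads, eps), 1 - eps)
        for j in range(k)
    ]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities for each network.
        # The binomial coefficient is the same for all clusters and cancels.
        resp = []
        for e in edge_counts:
            logs = [
                math.log(w) + e * math.log(p) + (n_dyads - e) * math.log(1 - p)
                for w, p in zip(weights, dens)
            ]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update mixing weights and per-cluster edge probabilities.
        for j in range(k):
            rj = sum(r[j] for r in resp)
            weights[j] = max(rj / len(edge_counts), 1e-12)
            p = sum(r[j] * e for r, e in zip(resp, edge_counts))
            p /= max(rj, 1e-12) * n_dyads
            dens[j] = min(max(p, eps), 1 - eps)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, dens, labels
```

Replacing the intercept-only component with a logistic regression on dyad-level covariates would bring the sketch closer to the covariate-dependent models the abstract mentions, at the cost of an iterative M-step.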

Model selection for semi-supervised clustering

Author: Campello Ricardo José Gabrielli Barreto
Goebel Randy
Moulavi Davoud
Pourrajabi Mojgan
Sander Jörg
Zimek Arthur
Publication venue: Athens
Publication date: 01/01/2014

Although there is a large and growing literature that tackles the semi-supervised clustering problem (i.e., using some labeled objects or cluster-guiding constraints like "must-link" or "cannot-link"), the evaluation of semi-supervised clustering approaches has rarely been discussed. The application of cross-validation techniques, for example, is far from straightforward in the semi-supervised setting, yet the problems associated with evaluation have yet to be addressed. Here we summarize these problems and provide a solution. Furthermore, in order to demonstrate practical applicability of semi-supervised clustering methods, we provide a method for model selection in semi-supervised clustering based on this sound evaluation procedure. Our method allows the user to select, based on the available information (labels or constraints), the most appropriate clustering model (e.g., number of clusters, density-parameters) for a given problem. NSERC (Canada), FAPESP (Brazil), CNPq (Brazil)

ResearchOnline at James Cook University

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Universidade de São Paulo
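
The abstract of "Model selection for semi-supervised clustering" describes choosing a clustering model from held-out labels or constraints. As a rough illustration only, the sketch below scores candidate clusterings by the fraction of held-out must-link and cannot-link constraints they satisfy and picks the best one. This scoring rule, and the names `constraint_accuracy` and `select_model`, are assumptions made here; the abstract does not specify the authors' score, and the sketch does not address how to build cross-validation folds correctly.

```python
def constraint_accuracy(labels, must_link, cannot_link):
    """Fraction of held-out pairwise constraints that a clustering satisfies.

    labels[i] is the cluster of object i; must_link and cannot_link are
    lists of index pairs. This score is an illustrative assumption.
    """
    satisfied = 0
    for a, b in must_link:
        satisfied += labels[a] == labels[b]
    for a, b in cannot_link:
        satisfied += labels[a] != labels[b]
    total = len(must_link) + len(cannot_link)
    return satisfied / total if total else 0.0


def select_model(candidates, must_link, cannot_link):
    """Return the name of the candidate clustering with the highest score.

    candidates maps a model name (e.g. a parameter setting) to the label
    list that model produced.
    """
    return max(
        candidates,
        key=lambda name: constraint_accuracy(
            candidates[name], must_link, cannot_link
        ),
    )
```

In practice the constraints used for scoring would need to be kept separate from those used to guide the clustering, which is the evaluation difficulty the abstract points to.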

Popularity versus Similarity in Growing Networks

Author: A Clauset
A Vázquez
A-L Barabási
AE Motter
AFJ van Raan
AK Menon
B Bollobás
D Crandall
DJ Watts
Dmitri Krioukov
F Bonahon
F Menczer
Fragkiskos Papadopoulos
G Bianconi
G Caldarelli
H Jeong
K Börner
LA Adamic
M McPherson
M. Ángeles Serrano
Maksim Kitsak
Marián Boguñá
MEJ Newman
O Simşek
PL Krapivsky
R Pastor-Satorras
RM D'Souza
S Fortunato
S Redner
SN Dorogovtsev
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/04/2013

Popularity is attractive: this is the formula underlying preferential attachment, a popular explanation for the emergence of scaling in growing networks. If new connections are made preferentially to more popular nodes, then the resulting distribution of the number of connections that nodes have follows power laws observed in many real networks. Preferential attachment has been directly validated for some real networks, including the Internet. Preferential attachment can also be a consequence of different underlying processes based on node fitness, ranking, optimization, random walks, or duplication. Here we show that popularity is just one dimension of attractiveness. Another dimension is similarity. We develop a framework where new connections, instead of preferring popular nodes, optimize certain trade-offs between popularity and similarity. The framework admits a geometric interpretation, in which popularity preference emerges from local optimization. As opposed to preferential attachment, the optimization framework accurately describes large-scale evolution of technological (Internet), social (web of trust), and biological (E. coli metabolic) networks, predicting the probability of new links in them with remarkable precision. The developed framework can thus be used for predicting new links in evolving networks, and provides a different perspective on preferential attachment as an emergent phenomenon.

arXiv.org e-Print Archive

Crossref
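
The abstract of "Popularity versus Similarity in Growing Networks" describes new nodes optimizing a trade-off between popularity and similarity. A minimal sketch of one such growth rule follows. It assumes, beyond what the abstract states, that popularity is represented by birth order (older nodes are more popular), that similarity is angular distance on a circle, and that each new node t links to the m older nodes s with the smallest product of s and the angular distance between s and t. The names `grow_network` and `angular_distance` are hypothetical.

```python
import math
import random


def angular_distance(a, b):
    """Shortest distance between two angles on a circle, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def grow_network(n_nodes, m=2, seed=0):
    """Grow a network by a popularity-similarity trade-off (a sketch).

    Node t (born at time t = 1, 2, ...) gets a random angle and links to
    the m older nodes s that minimise s * angular_distance(s, t). The
    exact rule is an assumption made for illustration.
    """
    rng = random.Random(seed)
    angles = {}
    edges = []
    for t in range(1, n_nodes + 1):
        angles[t] = rng.uniform(0.0, 2 * math.pi)
        older = sorted(
            range(1, t),
            key=lambda s: s * angular_distance(angles[s], angles[t]),
        )
        # The first few nodes have fewer than m older nodes to link to.
        for s in older[:m]:
            edges.append((s, t))
    return angles, edges
```

Because a small birth index lowers the product for every newcomer, older nodes tend to collect more links without any explicit preference for high degree, which is the sense in which popularity preference can emerge from local optimization.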

Mixture Models With Grouping Structure: Retail Analytics Applications

Author: Almohri Haidar
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2018

Growing competitiveness and increasing availability of data are generating tremendous interest in data-driven analytics across industries. In the retail sector, stores need targeted guidance to improve both the efficiency and effectiveness of individual stores based on their specific location, demographics, and environment. We propose an effective data-driven framework for internal benchmarking that can lead to targeted guidance for individual stores. In particular, we propose an objective method for segmenting stores using a model-based clustering technique that accounts for similarity in store performance dynamics. It relies on effective Finite Mixture of Regression (FMR) techniques for carrying out the model-based clustering with grouping structure ('must-link' constraints) and modeling store performance. We propose two alternate methods for FMR with grouping structure: 1) Competitive Learning (CL) and 2) Expectation Maximization (EM). The CL method can support both linear and non-linear regression methods whereas the more effective proposed EM approach only supports linear regression. We also propose an optimization framework to derive tailored recommendations for individual stores within store clusters that jointly improves profitability for the store while also improving sales to satisfy franchiser requirements. We validate the methods using synthetic experiments as well as a real-world automotive dealership network study for a leading global automotive manufacturer.

Digital Commons@Wayne State University
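
The abstract of "Mixture Models With Grouping Structure" describes an EM approach to a finite mixture of linear regressions in which observations from the same store must share a cluster. The sketch below illustrates that grouping idea under several assumptions not taken from the source: one predictor, Gaussian errors, and a user-supplied initial assignment. The E-step sums log-likelihoods over all observations in a group, so responsibilities are computed per group rather than per observation. The names `fit_weighted_line` and `em_grouped_fmr` are hypothetical.

```python
import math


def fit_weighted_line(points, weights):
    """Weighted least squares for y = a + b * x; returns (a, b, sigma)."""
    sw = sum(weights)
    mx = sum(w * x for w, (x, _) in zip(weights, points)) / sw
    my = sum(w * y for w, (_, y) in zip(weights, points)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, (x, _) in zip(weights, points))
    sxy = sum(w * (x - mx) * (y - my) for w, (x, y) in zip(weights, points))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    var = sum(w * (y - a - b * x) ** 2 for w, (x, y) in zip(weights, points))
    # Floor sigma so the log-likelihood stays finite on near-perfect fits.
    return a, b, max(math.sqrt(var / sw), 1e-3)


def em_grouped_fmr(groups, init_labels, k=2, n_iter=50):
    """EM for a mixture of linear regressions with must-link groups.

    groups is a list of lists of (x, y) pairs; every pair in a group is
    forced into the same cluster. init_labels gives a starting cluster
    for each group (an assumption of this sketch).
    """
    resp = [
        [1.0 if init_labels[g] == j else 0.0 for j in range(k)]
        for g in range(len(groups))
    ]
    points = [p for grp in groups for p in grp]
    params = []
    for _ in range(n_iter):
        # M-step: one weighted regression per cluster, group weights shared
        # by all observations in the group.
        params, mix = [], []
        for j in range(k):
            w = [resp[g][j] + 1e-9 for g, grp in enumerate(groups) for _p in grp]
            params.append(fit_weighted_line(points, w))
            mix.append(max(sum(r[j] for r in resp) / len(groups), 1e-9))
        # E-step: group-level responsibilities from summed log-likelihoods.
        for g, grp in enumerate(groups):
            logs = []
            for j, (a, b, s) in enumerate(params):
                ll = sum(-0.5 * ((y - a - b * x) / s) ** 2 - math.log(s)
                         for x, y in grp)
                logs.append(math.log(mix[j]) + ll)
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp[g] = [u / total for u in unnorm]
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return params, labels
```

As with any EM procedure, the result depends on the starting assignment, so several restarts from different `init_labels` would be a sensible precaution.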