Search CORE

1,474 research outputs found

LinkCluE: A MATLAB Package for Link-Based Cluster Ensembles

Author: Natthakan Iam-on
Simon Garrett
Publication venue
Publication date
Field of study

Cluster ensembles have emerged as a powerful meta-learning paradigm that provides improved accuracy and robustness by aggregating several input data clusterings. In particular, link-based similarity methods have recently been introduced with superior performance to the conventional co-association approach. This paper presents a MATLAB package, LinkCluE, that implements the link-based cluster ensemble framework. A variety of functional methods for evaluating clustering results, based on both internal and external criteria, are also provided. Additionally, the underlying algorithms together with the sample uses of the package with interesting real and synthetic datasets are demonstrated herein.

Research Papers in Economics

Stochastic Data Clustering

Author: Meyer Carl D.
Wessell Charles D.
Publication venue
Publication date: 01/01/2012
Field of study

In 1961 Herbert Simon and Albert Ando published the theory behind the long-term behavior of a dynamical system that can be described by a nearly uncoupled matrix. Over the past fifty years this theory has been used in a variety of contexts, including queueing theory, brain organization, and ecology. In all these applications, the structure of the system is known and the point of interest is the various stages the system passes through on its way to some long-term equilibrium. This paper looks at this problem from the other direction. That is, we develop a technique for using the evolution of the system to tell us about its initial structure, and we use this technique to develop a new algorithm for data clustering.Comment: 23 page

arXiv.org e-Print Archive

CiteSeerX

Gettysburg College

Robust Detection of Hierarchical Communities from Escherichia coli Gene Expression Data

Author: A Beyer
AL Barabási
BH Good
BW Kernighan
CO Daub
D Duewer
D Marbach
DFT Veiga
E Bonnet
E Ravasz
E Segal
EH Davidson
F Luo
G Balázsi
G Getz
G Palla
G Palla
H Zare
HW Ma
J Chen
J Duch
J Hubble
J Lemke
J Reichardt
JJ Faith
JJ Faith
JN Weinstein
K Baggerly
Kevin E. Bassler
KY Yeung
M Blatt
M Riley
MB Eisen
MEJ Newman
MEJ Newman
MF Traxler
MM Barker
N Friedman
N Friedman
O Alter
PD Karp
Q Lu
R Guimerà
RA Irizarry
S Fortunato
S Fortunato
S Gama-Castro
S Raychaudhuri
S Tavazoie
Santiago Treviño
Satoru Miyano
SB Seidman
SB Seidman
SP Borgatii
SP Borgatii
TF Cooper
Tim F. Cooper
TS Gardner
U Brandes
UN Raghavan
X Wen
Y Benjamini
Y Sun
Yudong Sun
Z Shi
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 11/01/2012
Field of study

Determining the functional structure of biological networks is a central goal of systems biology. One approach is to analyze gene expression data to infer a network of gene interactions on the basis of their correlated responses to environmental and genetic perturbations. The inferred network can then be analyzed to identify functional communities. However, commonly used algorithms can yield unreliable results due to experimental noise, algorithmic stochasticity, and the influence of arbitrarily chosen parameter values. Furthermore, the results obtained typically provide only a simplistic view of the network partitioned into disjoint communities and provide no information of the relationship between communities. Here, we present methods to robustly detect coregulated and functionally enriched gene communities and demonstrate their application and validity for Escherichia coli gene expression data. Applying a recently developed community detection algorithm to the network of interactions identified with the context likelihood of relatedness (CLR) method, we show that a hierarchy of network communities can be identified. These communities significantly enrich for gene ontology (GO) terms, consistent with them representing biologically meaningful groups. Further, analysis of the most significantly enriched communities identified several candidate new regulatory interactions. The robustness of our methods is demonstrated by showing that a core set of functional communities is reliably found when artificial noise, modeling experimental noise, is added to the data. We find that noise mainly acts conservatively, increasing the relatedness required for a network link to be reliably assigned and decreasing the size of the core communities, rather than causing association of genes into new communities.Comment: Due to appear in PLoS Computational Biology. Supplementary Figure S1 was not uploaded but is available by contacting the author. 27 pages, 5 figures, 15 supplementary file

arXiv.org e-Print Archive

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Link-Prediction Enhanced Consensus Clustering for Complex Networks

Author: Adar Eytan
Burgess Matthew
Cafarella Michael
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 04/06/2015
Field of study

Many real networks that are inferred or collected from data are incomplete due to missing edges. Missing edges can be inherent to the dataset (Facebook friend links will never be complete) or the result of sampling (one may only have access to a portion of the data). The consequence is that downstream analyses that consume the network will often yield less accurate results than if the edges were complete. Community detection algorithms, in particular, often suffer when critical intra-community edges are missing. We propose a novel consensus clustering algorithm to enhance community detection on incomplete networks. Our framework utilizes existing community detection algorithms that process networks imputed by our link prediction based algorithm. The framework then merges their multiple outputs into a final consensus output. On average our method boosts performance of existing algorithms by 7% on artificial data and 17% on ego networks collected from Facebook

arXiv.org e-Print Archive

Directory of Open Access Journals

Ensemble Clustering for Biological Datasets

Author: Harun Pirim
Şadi Evren Şeker
Publication venue: 'IntechOpen'
Publication date: 28/11/2012
Field of study

IntechOpen

Extending data reliability measure to a filter approach for soft subspace clustering

Author: Boongoen Tossapon
Iam-On Natthakan
Shang Changjing
Shen Qiang
Publication venue
Publication date: 12/12/2011
Field of study

Aberystwyth Research Portal