8 research outputs found

Speeding up the Consensus Clustering methodology for microarray data analysis

Author: A Ben-Hur
A Bertoni
A Borodin
A Jain
AK Jain
B Everitt
B Mirkin
E Levine
Filippo Utro
G Frahling
G Milligan
J Handl
J Kraus
JA Hartigan
JA Rice
JP Brunet
K Devarajan
K Yeung
L Kaufman
P Bertrand
P D'haeseleer
P Hansen
R Giancarlo
R Shamir
R Tibshirani
Raffaele Giancarlo
S Dudoit
S Klie
S Monti
S Salvador
S Seal
T Hastie
TP Speed
V Di Gesú
V Roth
W Krzanowski
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Abstract
Background: The inference of the number of clusters in a dataset, a fundamental problem in Statistics, Data Analysis and Classification, is usually addressed via internal validation measures. The stated problem is quite difficult, in particular for microarrays, since the inferred prediction must be sensible enough to capture the inherent biological structure in a dataset, e.g., functionally related genes. Despite the rich literature present in that area, the identification of an internal validation measure that is both fast and precise has proved to be elusive. In order to partially fill this gap, we propose a speed-up of Consensus (Consensus Clustering), a methodology whose purpose is the provision of a prediction of the number of clusters in a dataset, together with a dissimilarity matrix (the consensus matrix) that can be used by clustering algorithms. As detailed in the remainder of the paper, Consensus is a natural candidate for a speed-up.
Results: Since the time-precision performance of Consensus depends on two parameters, our first task is to show that a simple adjustment of the parameters is not enough to obtain a good precision-time trade-off. Our second task is to provide a fast approximation algorithm for Consensus, namely the closely related algorithm FC (Fast Consensus), which would have the same precision as Consensus with a substantially better time performance. The performance of FC has been assessed via extensive experiments on twelve benchmark datasets that summarize key features of microarray applications, such as cancer studies, gene expression with up and down patterns, and a full spectrum of dimensionality up to over a thousand. Based on their outcome, compared with previous benchmarking results available in the literature, FC turns out to be among the fastest internal validation methods, while retaining the same outstanding precision of Consensus. Moreover, it also provides a consensus matrix that can be used as a dissimilarity matrix, guaranteeing the same performance as the corresponding matrix produced by Consensus. We have also experimented with the use of Consensus and FC in conjunction with NMF (Nonnegative Matrix Factorization), in order to identify the correct number of clusters in a dataset. Although NMF is an increasingly popular technique for biological data mining, our results are somewhat disappointing and complement quite well the state of the art about NMF, shedding further light on its merits and limitations.
Conclusions: In summary, FC with a parameter setting that makes it robust with respect to small and medium-sized datasets, i.e., number of items to cluster in the hundreds and number of conditions up to a thousand, seems to be the internal validation measure of choice. Moreover, the technique we have developed here can be used in other contexts, in particular for the speed-up of stability-based validation measures.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Palermo

Computational cluster validation for microarray data analysis: experimental assessment of Clest, Consensus Clustering, Figure of Merit, Gap Statistics and Model Explorer

Author: A Alizadeh
A Ben-Hur
A Jain
A Kapp
AD Gordon
AK Jain
B Everitt
B Mirkin
CV Rijsbergen
Davide Scaturro
E Fowlkes
E Hartuv
Filippo Utro
GJ McLachlan
GW Milligan
I Priness
J Handl
JA Hartigan
JA Rice
JN Breckenridge
KY Yeung
L Hubert
L Kaufman
M Yan
P Hansen
PT Spellman
R Shamir
R Tibshirani
Raffaele Giancarlo
S Datta
S Dudoit
S Monti
T Hastie
V Di Gesú
W Krzanowski
X Wen
Publication venue: BioMed Central
Publication date: 01/01/2008
Field of study

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licens

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Palermo

Local operators to detect regions of interest

Author: Di Gesú V.
Strinati L.
VALENTI Cesare Fabio
Publication venue: Elsevier BV
Publication date: 01/01/1997
Field of study

The performance of a visual system is strongly influenced by the information processing that is done in the early vision phase. The need exists to limit the computation on areas of interest to reduce the total amount of data and their redundancy. This paper describes a new method to drive the attention during the analysis of complex scenes. Two new local operators, based on the computation of local moments and symmetries, are combined to drive the selection. Experimental results on real data are also reported. © 1997 Elsevier Science B.V

Archivio istituzionale della ricerca - Università di Palermo

Representing 2D Digital Objects

Author: A. Chella
D. S. Arnon
E. Barcucci
E. D. Khalimsky
F. Chiavetta
G. E. Collins
J. M. Françon
S. Arnborg
V. A. Kovalevsky
V. Gesú Di
Publication venue: Springer Science and Business Media LLC
Publication date
Field of study

Crossref

Δ-distance: A family of dissimilarity metrics between images represented by multi-level feature vectors

Author: A. Gupta
B. L. Gottesfeld
Geneviève Jomier
H.-K. Kim
I. Ahmad
M. L. Kherfi
Marta Rukoz
Maude Manouvrier
V. Di Gesú
Publication venue: Springer Science and Business Media LLC
Publication date
Field of study

Crossref

Image-based rendering of intersecting surfaces for dynamic comparative visualization

Author: A. Bair
A. Mammen
C. Weigle
C.R. Johnson
Charl P. Botha
D. Rey
D.L. Wilson
E. Pichon
Frits H. Post
G. Guennebaud
G. Subsol
H.G. Pagendarm
I.S. Lim
J. Goldfeather
Julien Milles
L. Bavoil
L. Ferrarini
L. Williams
Luca Ferrarini
M. Deering
M. Nienhaus
M. Tory
P. Rheingans
P.A.V. Miranda
R. Likert
R. Peikert
R.C. Veltkamp
S. Bruckner
S. Busking
Stef Busking
T. Gatzke
T. Masuda
T. Saito
T.F. Cootes
T.F. Wiegand
V. Gesú di
V. Interrante
X. Li
Publication venue: Springer Science and Business Media LLC
Publication date: 09/12/2010
Field of study

Nested or intersecting surfaces are proven techniques for visualizing shape differences between static 3D objects (Weigle and Taylor II, IEEE Visualization, Proceedings, pp. 503–510, 2005). In this paper we present an image-based formulation for these techniques that extends their use to dynamic scenarios, in which surfaces can be manipulated or even deformed interactively. The formulation is based on our new layered rendering pipeline, a generic image-based approach for rendering nested surfaces based on depth peeling and deferred shading. We use layered rendering to enhance the intersecting surfaces visualization. In addition to enabling interactive performance, our enhancements address several limitations of the original technique. Contours remove ambiguity regarding the shape of intersections. Local distances between the surfaces can be visualized at any point using either depth fogging or distance fields: depth fogging is used as a cue for the distance between two surfaces in the viewing direction, whereas closest-point distance measures are visualized interactively by evaluating one surface’s distance field on the other surface. Furthermore, we use these measures to define a three-way surface segmentation, which visualizes regions of growth, shrinkage, and no change of a test surface compared with a reference surface. Finally, we demonstrate an application of our technique in the visualization of statistical shape models. We evaluate our technique based on feedback provided by medical image analysis researchers, who are experts in working with such models.
Intelligent Systems; Electrical Engineering, Mathematics and Computer Science

Crossref

TU Delft Repository

Leiden University Scholarly Publications

Bayesian versus data driven model selection for microarray data

Author: A Alizadeh
A Jain
A Su
B Andreopoulos
B Everitt
C Perou
CS Wallace
CV Rijsbergen
D Ross
E Fowlkes
EJ Yeoh
Filippo Utro
G Schwarz
Giosué Lo Bosco
H Akaike
H Liu
I Priness
J Breckenridge
J Handl
J Hartigan
J Pollack
J Quackenbush
J Rissanen
K Yeung
L Hubert
L Kaufman
MAT Figueiredo
N Bouguila
P D’haeseleer
P Spellman
R Giancarlo
R Shamir
R Tibshirani
Raffaele Giancarlo
S Dudoit
S Klie
S Monti
T Golub
T Hastie
U Alon
V Gesú Di
W Krzanowski
X Wen
Publication venue: Springer Science and Business Media LLC
Publication date
Field of study

Crossref