45 research outputs found

Differential co-expression framework to quantify goodness of biclusters and compare biclustering algorithms

Author: A Ben-Dor
A Prelic
A Tanay
A Wille
AA Alizadeh
AP Gasch
B Mirkin
Burton Kuan Hui Chia
D Kostka
Golub
I Ulitsky
IS Dhillon
J Ihmels
J Yang
P Broët
R Krishna Murthy Karuturi
S Barkow
SC Madeira
W Ayadi
X Chen
Y Cheng
Y Kluger
Y Wang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Biclustering is an important analysis procedure for understanding the biological mechanisms behind microarray gene expression data. Several algorithms have been proposed to identify biclusters, but very little effort has been made to compare the performance of different algorithms on real datasets and to combine the resulting biclusters into one unified ranking. Results: In this paper we propose a differential co-expression framework and a differential co-expression scoring function to objectively quantify the quality or goodness of a bicluster of genes, based on the observation that genes in a bicluster are co-expressed in the conditions belonging to the bicluster and not co-expressed in the other conditions. Furthermore, we propose a scoring function to stratify biclusters into three types of co-expression. We used the proposed scoring functions to understand the performance and behavior of four well-established biclustering algorithms on six real datasets from different domains by combining their output into one unified ranking. Conclusions: The differential co-expression framework provides a quantitative and objective assessment of the goodness of biclusters of co-expressed genes and of the performance of biclustering algorithms in identifying co-expression biclusters. It also helps to combine the biclusters output by different algorithms into one unified ranking, i.e. meta-biclustering.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

QUBIC: a qualitative biclustering algorithm for analyses of gene expression data

Author: Aguilar-Ruiz
Andrew H. Paterson
Armstrong
Ashburner
Barkow
Ben-dor
Bryan
Carmona-Saez
Castillo-Davis
Cheng
Eisen
Faith
Gasch
Getz
Golub
Guojun Li
Haibao Tang
Hartigan
Huttenhower
Ihmels
Kanehisa
Keseler
Kluger
Kung
Li
Liu
Madeira
McLachlan
Morgan
Murali
Prelic
Qin Ma
Reiss
Ruepp
Shamir
Tanay
Xu
Yeung
Ying Xu
Publication venue: Oxford University Press
Publication date: 01/01/2009

Biclustering extends traditional clustering techniques by attempting to find (all) subgroups of genes with similar expression patterns under to-be-identified subsets of experimental conditions when applied to gene expression data. Still, the real power of this clustering strategy is yet to be fully realized due to the lack of effective and efficient algorithms for reliably solving the general biclustering problem. We report a QUalitative BIClustering algorithm (QUBIC) that can solve the biclustering problem in a more general form than existing algorithms by combining qualitative (or semi-quantitative) measures of gene expression data with a combinatorial optimization technique. One key unique feature of the QUBIC algorithm is that it can identify all statistically significant biclusters, including biclusters with the so-called ‘scaling patterns’, a problem considered to be rather challenging. Another key unique feature is that the algorithm solves such general biclustering problems very efficiently: it is capable of solving biclustering problems with tens of thousands of genes under up to thousands of conditions in a few minutes of CPU time on a desktop computer. We have demonstrated considerably improved biclustering performance by our algorithm compared to existing algorithms on various benchmark sets and data sets of our own. QUBIC was written in ANSI C and tested using GCC (version 4.1.2) on Linux. Its source code is available at: http://csbl.bmb.uga.edu/∼maqin/bicluster. A server version of QUBIC is also available upon request.

CiteSeerX

Crossref

PubMed Central

DeBi: Discovering Differentially Expressed Biclusters using a Frequent Itemset Approach

Author: A Ben-Dor
A Prelic
A Rosenwald
A Tanay
AD Basehoar
Akdes Serin
B Andreopoulos
BKH Chia
CT Harbison
D Burdick
DR Ciocca
G Li
GA Grothaus
J Lamb
JA Hartigan
JL Jensen
JN Keller
KD MacIsaac
Martin Vingron
R Shamir
RR Sokal
S Barkow
S Bergmann
S Hochreiter
SC Madeira
TM Murali
TR Hughes
XG Ni
Y Cheng
Y Hoshida
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The analysis of massive high-throughput data via clustering algorithms is very important for elucidating gene functions in biological systems. However, traditional clustering methods have several drawbacks. Biclustering overcomes these limitations by grouping genes and samples simultaneously. It discovers subsets of genes that are co-expressed in certain samples. Recent studies showed that biclustering has great potential in detecting marker genes that are associated with certain tissues or diseases. Several biclustering algorithms have been proposed. However, it is still a challenge to find biclusters that are significant based on biological validation measures. In addition, there is a need for a biclustering algorithm that is capable of analyzing very large datasets in reasonable time. Results: Here we present a fast biclustering algorithm called DeBi (Differentially Expressed BIclusters). The algorithm is based on a well-known data mining approach called frequent itemset. It discovers maximum-size homogeneous biclusters in which each gene is strongly associated with a subset of samples. We evaluate the performance of DeBi on a yeast dataset, on synthetic datasets and on human datasets. Conclusions: We demonstrate that the DeBi algorithm provides functionally more coherent gene sets than standard clustering or biclustering algorithms, using biological validation measures such as Gene Ontology term and Transcription Factor Binding Site enrichment. We show that DeBi is a computationally efficient and powerful tool for analyzing large datasets. The method is also applicable to multiple gene expression datasets coming from different labs or platforms.
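
The frequent itemset approach named in the DeBi abstract can be illustrated with a small, generic sketch. This is not DeBi's actual procedure, which the abstract does not spell out. It assumes each sample is a "transaction" whose items are the genes called differentially expressed in that sample; the function name `frequent_itemsets`, the gene labels and the support threshold are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force frequent itemset search (illustrative, not DeBi itself).

    transactions: list of sets; each set holds the genes "active" in one sample.
    min_support:  minimum number of samples that must contain the whole gene set.
    Returns a dict mapping each frequent gene tuple to its support count.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            # Support = number of samples containing every gene in the candidate.
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[candidate] = support
    return result


# Hypothetical binarized data: four samples and the genes active in each.
transactions = [
    {"g1", "g2", "g3"},
    {"g1", "g2"},
    {"g2", "g3"},
    {"g1", "g2", "g4"},
]

found = frequent_itemsets(transactions, min_support=3)
for genes, support in found.items():
    print(genes, support)
```

Each frequent gene set, together with the samples that support it, can be read as a candidate bicluster. A practical implementation would prune candidates (for example Apriori-style) instead of enumerating every combination.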

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MPG.PuRe

FABIA: factor analysis for bicluster acquisition

Author: Adetayo Kasim
Andreas Mayr
Andreas Mitterecker
Barkow
Ben-Dor
Bithas
Busygin
Caldas
Califano
Cheng
Dan Lin
Dempster
Djork-Arné Clevert
Everitt
Gan
Getoor
Getz
Girolami
Gu
Hardn
Hartigan
Hinrich W. H. Göhlmann
Hochreiter
Hoshida
Hoyer
Hyvärinen
Ihmels
Kaiser
Kluger
Lazzeroni
Li
Luc Bijnens
Madeira
Martin Heusel
Munkres
Murali
Palmer
Prelic
Reiss
Rosenwald
Sepp Hochreiter
Shamir
Sheng
Su
Suzy Van Sanden
Talloen
Tanay
Tang
Tatsiana Khamiakova
Tibshirani
Turner
Ulrich Bodenhofer
Van den Bulcke
van't Veer
Wang
Willem Talloen
Yang
Ziv Shkedy
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Biclustering of transcriptomic data groups genes and samples simultaneously. It is emerging as a standard tool for extracting knowledge from gene expression measurements. We propose a novel generative approach for biclustering called ‘FABIA: Factor Analysis for Bicluster Acquisition’. FABIA is based on a multiplicative model, which accounts for linear dependencies between gene expression and conditions, and also captures heavy-tailed distributions as observed in real-world transcriptomic data. The generative framework makes it possible to use well-founded model selection methods and to apply Bayesian techniques.
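
The multiplicative model mentioned in the FABIA abstract can be pictured with a minimal sketch. It assumes a single, noise-free bicluster written as the outer product of a sparse gene vector and a sparse sample vector; the names `lam` and `z` and all values are hypothetical, and the noise term and heavy-tailed priors of the real model are omitted.

```python
def outer_product(lam, z):
    """Return the matrix lam * z^T as a list of rows (genes x samples)."""
    return [[l * s for s in z] for l in lam]


# Hypothetical sparse vectors: nonzero entries mark bicluster membership.
lam = [2.0, 0.0, -1.0, 0.0]  # gene loadings for 4 genes
z = [0.0, 1.0, 3.0]          # sample factors for 3 samples

X = outer_product(lam, z)

# Genes and samples that belong to the bicluster are the nonzero positions.
genes_in = [i for i, l in enumerate(lam) if l != 0.0]
samples_in = [j for j, s in enumerate(z) if s != 0.0]

for row in X:
    print(row)
print("genes:", genes_in, "samples:", samples_in)
```

Inside the bicluster, every gene's expression profile is a scalar multiple of the same sample vector, which is the linear dependency the abstract refers to.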

CiteSeerX

Crossref

PubMed Central

A visual analytics approach for understanding biclustering results from microarray data

Author: A Ben-Dor
A Inselberg
A Perer
A Prelic
AJ Saldanha
B Fry
C Ware
CA Duncan
ER Gansner
G Kumar
GA Grothaus
GG Tevzadze
Herman
HL Turner
HW Ma
J Heer
J Ihmels
J Seo
JM van der Vaart
KO Cheng
Luis Quintales
M Ashburner
M Rasmussen
MA Hibbs
MB Eisen
MC Schatz
National Visualization and Analytics Center
P Shannon
R Santamaría
R Theron
Roberto Therón
Rodrigo Santamaría
S Barkow
S Madeira
SS Shen-Orr
TMJ Fruchterman
TV den Bulcke
Y Kluger
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Microarray analysis is an important area of bioinformatics. In the last few years, biclustering has become one of the most popular methods for classifying data from microarrays. Although biclustering can be used in any kind of classification problem, nowadays it is mostly used for microarray data classification. A large number of biclustering algorithms have been developed over the years; however, little effort has been devoted to the representation of the results. Results: We present an interactive framework that helps to infer differences or similarities between biclustering results, to unravel trends and to highlight robust groupings of genes and conditions. These linked representations of biclusters can complement biological analysis and reduce the time spent by specialists on interpreting the results. Within the framework, besides other standard representations, a visualization technique is presented which is based on a force-directed graph where biclusters are represented as flexible overlapped groups of genes and conditions. This microarray analysis framework, BicOverlapper, is available at http://vis.usal.es/bicoverlapper. Conclusion: The main visualization technique, tested with different biclustering results on a real dataset, allows researchers to extract interesting features of the biclustering results, especially by highlighting overlapping zones that usually represent robust groups of genes and/or conditions. The visual analytics methodology will permit biology experts to study biclustering results without inspecting an overwhelming number of biclusters individually.

Crossref

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Construction of gene regulatory networks using biclustering and bayesian networks

Author: A Ben-Dor
A Faisal
A Prelic
A Tanay
AC Lozano
AP Gasch
C Wolfe
CT Ronald
D Jesse
D Reiss
F Azuaje
Fadhl M Alakwaa
FM Al-Akwaa
FM Alakwaa
G Bader
G Fung
G Stolovitzky
I Avila-Campillo
J Ihmels
KO Cheng
MD Dyer
N Friedman
Nahed H Solouma
O Troyanskaya
P D'haeseleer
P Shannon
Pe Dana
PTSG Spellman
R Bonneau
R Guthke
S Barkow
S Datta
S Kauffman
S Maere
S Tavazoie
SC Madeira
T Chen
TM Murali
X Liu
Xw Chen
Y Assenov
Y Cheng
Yasser M Kadah
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Understanding gene interactions in complex living systems can be seen as the ultimate goal of the systems biology revolution. Hence, to elucidate disease ontology fully and to reduce the cost of drug development, gene regulatory networks (GRNs) have to be constructed. During the last decade, many GRN inference algorithms based on genome-wide data have been developed to unravel the complexity of gene regulation. Time series transcriptomic data measured by genome-wide DNA microarrays are traditionally used for GRN modelling. One of the major problems with microarrays is that a dataset consists of relatively few time points with respect to the large number of genes. Dimensionality is one of the interesting problems in GRN modelling. Results: In this paper, we develop a biclustering function enrichment analysis toolbox (BicAT-plus) to study the effect of biclustering in reducing data dimensions. The network generated from our system was validated via available interaction databases and was compared with previous methods. The results revealed the performance of our proposed method. Conclusions: Because of the sparse nature of GRNs, the results of biclustering techniques differ significantly from those of previous methods.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A biclustering algorithm based on a Bicluster Enumeration Tree: application to DNA microarray data

Author: A Ben-Dor
A Dharan
A Prelic
A Schliep
A Tanay
A Yip
B Pontes
C Cano
C Gallo
DD Lewis
EL Lehmann
F Angiulli
F Divina
GF Berriz
H Turner
H Wang
IS Dhillon
J Liu
J Yang
JA Hartigan
Jin-Kao Hao
JS Aguilar-Ruiz
K Bryan
K Cheng
L Lazzeroni
L Teng
Mourad Elloumi
R Agrawal
R Balasubramaniyan
S Barkow
S Bergmann
S Bleuler
S Mitra
S Tavazoie
SC Madeira
SD Peddada
T Hofmann
U Maulik
W Gaul
Wassim Ayadi
X Liu
Y Cheng
Y Christinat
Y Luan
Y Okada
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: In a number of domains, such as DNA microarray data analysis, we need to cluster rows (genes) and columns (conditions) of a data matrix simultaneously to identify groups of rows coherent with groups of columns. This kind of clustering is called biclustering. Biclustering algorithms are extensively used in DNA microarray data analysis. More effective biclustering algorithms are highly desirable and needed. Methods: We introduce BiMine, a new enumeration algorithm for biclustering of DNA microarray data. The proposed algorithm is based on three original features. First, BiMine relies on a new evaluation function called Average Spearman's rho (ASR). Second, BiMine uses a new tree structure, called Bicluster Enumeration Tree (BET), to represent the different biclusters discovered during the enumeration process. Third, to avoid the combinatorial explosion of the search tree, BiMine introduces a parametric rule that allows the enumeration process to cut tree branches that cannot lead to good biclusters. Results: The performance of the proposed algorithm is assessed using both synthetic and real DNA microarray data. The experimental results show that BiMine competes well with several other biclustering methods. Moreover, we test the biological significance using a gene annotation web-tool to show that our proposed method is able to produce biologically relevant biclusters. The software is available upon request from the authors to academic users.
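
The BiMine abstract names an evaluation function, Average Spearman's rho (ASR), without defining it. The sketch below uses a simplified stand-in, assumed here for illustration only: the mean pairwise Spearman correlation across the rows (genes) of a candidate bicluster. BiMine's actual ASR may differ, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pairwise_spearman(rows):
    """Average Spearman's rho over all pairs of rows (assumed stand-in for ASR)."""
    pairs = list(combinations(rows, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)


# Hypothetical bicluster: three genes that rise together across four conditions.
coherent = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [2.0, 4.0, 8.0, 16.0],
]
score = mean_pairwise_spearman(coherent)
print(round(score, 6))
```

Because Spearman's rho depends only on ranks, the three rows score as perfectly coherent even though their scales and shapes differ, which is why a rank-based measure suits biclusters with shifting or scaling patterns.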

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Okina

Hal-Diderot

BicPAMS: software for biological data analysis with pattern-based biclustering

Author: A Ben-Dor
A Rosenwald
A Serin
A Tanay
AP Gasch
AP Lee
D Szklarczyk
Francisco L. Ferreira
G Getz
J Han
JLY Koh
K Eren
K Sim
M Charrad
MC Teixeira
MV Kuleshov
NR Mabroukeh
R Henriques
R Martinez
R Santamaría
Rui Henriques
S Barkow
S Hochreiter
Sara C. Madeira
SC Madeira
W Lee
Y Okada
Publication venue: Springer Science and Business Media LLC

Crossref

Relating gene expression data on two-component systems to functional annotations in Escherichia coli

Background: Obtaining physiological insights from microarray experiments requires computational techniques that relate gene expression data to functional information. Traditionally, this has been done in two consecutive steps. The first step identifies important genes through clustering or statistical techniques, while the second step assigns biological functions to the identified groups. Recently, techniques have been developed that identify such relationships in a single step. Results: We have developed an algorithm that relates patterns of gene expression in a set of microarray experiments to functional groups in one step. Our only assumption is that patterns co-occur frequently. The effectiveness of the algorithm is demonstrated as part of a study of regulation by two-component systems in Escherichia coli. The significance of the relationships between expression data and functional annotations is evaluated based on density histograms that are constructed using product similarity among expression vectors. We present a biological analysis of three of the resulting functional groups of proteins, develop hypotheses for further biological studies, and test one of these hypotheses experimentally. A comparison with other algorithms and a different data set is presented. Conclusion: Our new algorithm is able to find interesting and biologically meaningful relationships, not found by other algorithms, in previously analyzed data sets. Scaling of the algorithm to large data sets can be achieved based on a theoretical model.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

BicSPAM: flexible biclustering using sequential patterns

Author: A Ben-Dor
A Califano
A Patrikainen
A Prelić
A Serin
A Tanay
AA Alizadeh
AR Donders
C Creighton
C Ding
C Tang
D Bozdağ
D Martin
DS Hochbaum
F Zhu
G Atluri
G Bebek
G Getz
G Pandey
GF Berriz
H Choi
H Toivonen
H Wang
J Bellay
J Han
J Ihmels
J Liu
J Pei
J Wang
J Yang
JA Hartigan
K Sim
K Yip
L Lazzeroni
M Charrad
M de Souto
M Steinbach
MA Mahfouz
MJ Zaki
NR Mabroukeh
O Troyanskaya
P Carmona-Saez
P Fournier-Viger
Q Fang
Q Sheng
R Henriques
R Martinez
Rui Henriques
S Barkow
S Hochreiter
S Madeira
S Tavazoie
Sara C Madeira
SC Madeira
SS Young
T Calders
T Hellem
TR Golub
U Alon
X Yan
Y Huang
Y Okada
Publication venue: Springer Science and Business Media LLC

Crossref