BiGGEsTS: integrated environment for biclustering analysis of time series gene expression data

Author: Gonçalves Joana P
Madeira Sara C
Oliveira Arlindo L
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The ability to monitor changes in expression patterns over time, and to observe the emergence of coherent temporal responses using expression time series, is critical to advance our understanding of complex biological processes. Biclustering has been recognized as an effective method for discovering local temporal expression patterns and unraveling potential regulatory mechanisms. The general biclustering problem is NP-hard. In the case of time series this problem is tractable, and efficient algorithms can be used. However, there is still a need for specialized applications able to take advantage of the temporal properties inherent to expression time series, both from a computational and a biological perspective. Findings: BiGGEsTS makes available state-of-the-art biclustering algorithms for analyzing expression time series. Gene Ontology (GO) annotations are used to assess the biological relevance of the biclusters. Methods for preprocessing expression time series and post-processing results are also included. The analysis is additionally supported by a visualization module capable of displaying informative representations of the data, including heatmaps, dendrograms, expression charts and graphs of enriched GO terms. Conclusion: BiGGEsTS is a free open source graphical software tool for revealing local coexpression of genes in specific intervals of time, while integrating meaningful information on gene annotations. It is freely available at: http://kdbio.inesc-id.pt/software/biggests. We present a case study on the discovery of transcriptional regulatory modules in the response of Saccharomyces cerevisiae to heat stress.
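As an illustration of how GO annotations can be used to score a bicluster, the sketch below computes a one-sided hypergeometric enrichment p-value for a single GO term. The abstract does not state which statistic BiGGEsTS uses, so the hypergeometric test, the function name `enrichment_p_value` and all counts are assumptions made for this example only.

```python
from math import comb

def enrichment_p_value(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes in the bicluster.

    population: genes in the whole dataset (hypothetical count)
    annotated:  genes in the dataset carrying the GO term
    sample:     genes in the bicluster
    hits:       bicluster genes carrying the GO term
    """
    upper = min(annotated, sample)
    # Sum the hypergeometric tail as exact integers, then divide once.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)

# Hypothetical example: 10 genes, 5 annotated, a bicluster of 5 genes,
# all 5 of which carry the term.
p = enrichment_p_value(population=10, annotated=5, sample=5, hits=5)
print(f"p-value: {p:.6f}")
```

A small p-value suggests the term is over-represented in the bicluster. In practice many terms are tested at once, so a multiple-testing correction would normally follow.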

International Journal for Computational Biology (IJCB)

DNA Microarray Data Analysis: A New Survey on Biclustering

Author: Ben Saber Haifa
ELLOUMI Mourad
Publication venue: International Journal for Computational Biology (IJCB)
Publication date: 03/04/2015

Some subsets of genes behave similarly under certain subsets of conditions, in which case we say that they are coexpressed, but behave independently under other subsets of conditions. Discovering such coexpression can help to uncover genomic knowledge such as gene networks or gene interactions. It is therefore important to cluster genes and conditions simultaneously, in order to identify clusters of genes that are coexpressed under clusters of conditions. This type of clustering is called biclustering. Biclustering is an NP-hard problem. Consequently, heuristic algorithms are typically used to approximate this problem by finding suboptimal solutions. In this paper, we present a new survey on biclustering of gene expression data, also called microarray data.

LateBiclustering: Efficient Heuristic Algorithm for Time-Lagged Bicluster Identification

Author: Madeira S.C.
Pinho Goncalves J.S. (Joana) de
Publication date: 01/01/2014

Identifying patterns in temporal data is key to uncovering meaningful relationships in diverse domains, from stock trading to social interactions. Also of great interest are clinical and biological applications, namely monitoring patient response to treatment or characterizing activity at the molecular level. In biology, researchers seek to gain insight into gene functions and dynamics of biological processes, as well as potential perturbations of these leading to disease, through the study of patterns emerging from gene expression time series. Clustering can group genes exhibiting similar expression profiles, but focuses on global patterns denoting rather broad, unspecific responses. Biclustering reveals local patterns, which more naturally capture the intricate collaboration between biological players, particularly under a temporal setting. Despite the general biclustering formulation being NP-hard, considering specific properties of time series has led to efficient solutions for the discovery of temporally aligned patterns. Notably, the identification of biclusters with time-lagged patterns, suggestive of transcriptional cascades, remains a challenge due to the combinatorial explosion of delayed occurrences. Herein, we propose LateBiclustering, a sensible heuristic algorithm enabling a polynomial rather than exponential time solution for the problem. We show that it identifies meaningful time-lagged biclusters relevant to the response of Saccharomyces cerevisiae to heat stress.
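To make the idea of a time-lagged pattern concrete, the sketch below discretizes two expression series into up/down/no-change symbols and locates the same pattern at different start times. This is not the LateBiclustering algorithm; the three-symbol alphabet, the 0.5 threshold, the function names and the data are assumptions chosen for illustration.

```python
def discretize(series, threshold=0.5):
    """Map consecutive differences to 'U' (up), 'D' (down) or 'N' (no change)."""
    symbols = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > threshold:
            symbols.append("U")
        elif delta < -threshold:
            symbols.append("D")
        else:
            symbols.append("N")
    return "".join(symbols)

def occurrences(pattern, symbols):
    """Return every start offset at which `pattern` occurs in `symbols`."""
    width = len(pattern)
    return [i for i in range(len(symbols) - width + 1)
            if symbols[i:i + width] == pattern]

# Hypothetical expression series: gene_b repeats gene_a's rise and fall
# two time points later.
gene_a = [0.0, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0]
gene_b = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 1.0]

pattern = "UUD"
start_a = occurrences(pattern, discretize(gene_a))
start_b = occurrences(pattern, discretize(gene_b))
print(start_a, start_b)
```

Because every gene may show the pattern at any delay, the number of candidate alignments grows quickly with the number of genes and time points, which is the combinatorial explosion the abstract refers to.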

CWI's Institutional Repository

Biclustering of Gene Expression Data by Correlation-Based Scatter Search

Author: Aguilar Ruiz Jesús Salvador
Nepomuceno Chamorro Juan Antonio
Troncoso Lora Alicia
Publication date: 01/01/2011

BACKGROUND: The analysis of data generated by microarray technology is very useful for understanding how genetic information becomes functional gene products. Biclustering algorithms can determine a group of genes which are co-expressed under a set of experimental conditions. Recently, new biclustering methods based on metaheuristics have been proposed. Most of them use the Mean Squared Residue as merit function, but patterns that are interesting and relevant from a biological point of view, such as shifting and scaling patterns, may not be detected using this measure. However, it is important to discover this type of pattern, since genes commonly present similar behavior although their expression levels vary in different ranges or magnitudes. METHODS: Scatter Search is an evolutionary technique based on the evolution of a small set of solutions which are chosen according to quality and diversity criteria. This paper presents a Scatter Search with the aim of finding biclusters from gene expression data. In this algorithm the proposed fitness function is based on the linear correlation among genes, in order to detect shifting and scaling patterns, and an improvement method is included in order to select just positively correlated genes. RESULTS: The proposed algorithm has been tested with three real data sets (the Yeast Cell Cycle dataset, the human B-cells lymphoma dataset and the Yeast Stress dataset), finding a remarkable number of biclusters with shifting and scaling patterns. In addition, the performance of the proposed method and fitness function is compared to that of CC, OPSM, ISA, BiMax, xMotifs and Samba using the Gene Ontology Database.
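The contrast between the Mean Squared Residue and a correlation-based measure can be checked numerically. The sketch below is a minimal illustration, not the authors' fitness function: it uses the standard mean squared residue definition and the average pairwise Pearson correlation between gene rows, and the function names and data are hypothetical.

```python
import numpy as np

def mean_squared_residue(block):
    """Mean squared residue of a genes x conditions block."""
    block = np.asarray(block, dtype=float)
    row_means = block.mean(axis=1, keepdims=True)
    col_means = block.mean(axis=0, keepdims=True)
    residue = block - row_means - col_means + block.mean()
    return float((residue ** 2).mean())

def mean_pairwise_correlation(block):
    """Average Pearson correlation over all distinct pairs of gene rows."""
    corr = np.corrcoef(np.asarray(block, dtype=float))
    upper = np.triu_indices_from(corr, k=1)
    return float(corr[upper].mean())

# Hypothetical expression profile over four conditions.
base = np.array([1.0, 3.0, 2.0, 5.0])
shifting = np.vstack([base, base + 2.0, base + 7.0])  # rows differ by an offset
scaling = np.vstack([base, base * 2.0, base * 3.0])   # rows differ by a factor

for name, block in (("shifting", shifting), ("scaling", scaling)):
    print(name,
          round(mean_squared_residue(block), 4),
          round(mean_pairwise_correlation(block), 4))
```

For the shifting block the residue is zero, but for the scaling block it is clearly positive even though the rows are perfectly correlated in both cases, which is why a residue-based merit function can miss scaling patterns.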

idUS. Depósito de Investigación Universidad de Sevilla

QServer: A Biclustering Server for Prediction and Assessment of Co-Expressed Gene Clusters

Author: A Prelic
C Huttenhower
CJ Wu
Dongxiao Zhu
E Mejia-Roa
F Zhou
Fengfeng Zhou
G Li
Guojun Li
J Seo
JJ Faith
JN Morgan
JP Goncalves
M Pruess
P Kersey
P Waltman
Qin Ma
RA Irizarry
RL Poole
S Barkow
S Hochreiter
SC Madeira
T Barrett
T Werner
TL Bailey
WM Liu
X Liu
Y Cheng
Ying Xu
Publication venue: Public Library of Science
Publication date: 05/03/2012

BACKGROUND: Biclustering is a powerful technique for identification of co-expressed gene groups under any (unspecified) substantial subset of given experimental conditions, which can be used for elucidation of transcriptionally co-regulated genes. RESULTS: We have previously developed a biclustering algorithm, QUBIC, which can solve more general biclustering problems than previous biclustering algorithms. To fully utilize the analysis power the algorithm provides, we have developed a web server, QServer, for prediction, computational validation and analysis of co-expressed gene clusters. Specifically, QServer has the following capabilities in addition to biclustering by QUBIC: (i) prediction and assessment of conserved cis-regulatory motifs in promoter sequences of the predicted co-expressed genes; (ii) functional enrichment analyses of the predicted co-expressed gene clusters using Gene Ontology (GO) terms; and (iii) visualization capabilities in support of interactive biclustering analyses. QServer supports biclustering and functional analysis for a wide range of organisms, including human, mouse, Arabidopsis, bacteria and archaea, whose underlying genome database will be continuously updated. CONCLUSION: We believe that QServer provides an easy-to-use and highly effective platform useful for hypothesis formulation and testing related to transcription co-regulation.

Public Library of Science (PLOS)

Crossref

FigShare

A bi-ordering approach to linking gene expression with clinical annotations in gastric cancer

Author: Boussioutas Alex
Haviv Izhak
Kowalczyk Adam
Leckie Christopher
MacIntyre Geoff
Shi Fan
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: In the study of cancer genomics, gene expression microarrays, which measure thousands of genes in a single assay, provide abundant information for the investigation of interesting genes or biological pathways. However, in order to analyze the large number of noisy measurements in microarrays, effective and efficient bioinformatics techniques are needed to identify the associations between genes and relevant phenotypes. Moreover, systematic tests are needed to validate the statistical and biological significance of those discoveries. Results: In this paper, we develop a robust and efficient method for exploratory analysis of microarray data, which produces a number of different orderings (rankings) of both genes and samples (reflecting correlation among those genes and samples). The core algorithm is closely related to biclustering, and so we first compare its performance with several existing biclustering algorithms on two real datasets, a gastric cancer dataset and a lymphoma dataset. We then show on the gastric cancer data that the sample orderings generated by our method are highly statistically significant with respect to the histological classification of samples, using the Jonckheere trend test, while the gene modules are biologically significant with respect to biological processes (from the Gene Ontology). In particular, some of the gene modules associated with biclusters are closely linked to gastric cancer tumorigenesis reported in previous literature, while others are potentially novel discoveries. Conclusion: We have developed an effective and efficient method, Bi-Ordering Analysis, to detect informative patterns in gene expression microarrays by ranking genes and samples. In addition, a number of evaluation metrics were applied to assess both the statistical and biological significance of the resulting bi-orderings. The methodology was validated on gastric cancer and lymphoma datasets.
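The Jonckheere trend test named in the abstract checks whether values tend to increase across groups that have a known order. The sketch below computes only the Jonckheere-Terpstra statistic; it does not reproduce the paper's analysis, and the function name, the grouping and the rank values are hypothetical.

```python
from itertools import combinations

def jonckheere_statistic(ordered_groups):
    """Count pairs (x, y) with x from an earlier group, y from a later group
    and x < y; tied pairs contribute one half."""
    total = 0.0
    # combinations() keeps the input order, so `earlier` always precedes `later`.
    for earlier, later in combinations(ordered_groups, 2):
        for x in earlier:
            for y in later:
                if x < y:
                    total += 1.0
                elif x == y:
                    total += 0.5
    return total

# Hypothetical sample ranks produced by an ordering, grouped by three
# ordered histological classes.
groups = [[1, 2], [3, 4], [5, 6]]
j_increasing = jonckheere_statistic(groups)
j_decreasing = jonckheere_statistic(list(reversed(groups)))
print(j_increasing, j_decreasing)
```

A statistic near its maximum, as in the first call, indicates a consistent upward trend across the ordered classes. Judging significance would additionally require the null distribution of the statistic, for example through a permutation procedure.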