Search CORE

3 research outputs found

UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets

Author: A Huber
A Prelić
AA Shabalin
AP Gasch
Asoke K. Nandi
B Abu-Jamous
B Abu-Jamous
Basel Abu-Jamous
C Koch
CH Wade
CT Harbison
D Dikicioglu
D Liu
DA Orlando
David J. Roberts
IS Dhillon
J Bahler
J Yang
JK Choi
JK Limb
JM Pena
JM Stuart
KC Li
KC Li
KY Yeung
KY Yeung
L Lazzeroni
LP Zhao
MB Eisen
P Cahan
P Grandi
PC Roberts
PT Spellman
R Fa
R Lletı́a
R Nilsson
RJ Cho
RM Piro
Rui Fa
S Chu
S Fujii
S Sharma
S Vega-Pons
T Hayata
T Murali
T Pramila
TC Fleischer
VA Gennarino
X Liu
Y Cheng
Y Kluger
Z Tao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/06/2015
Field of study

Background: Collective analysis of the increasingly emerging gene expression datasets are required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth-knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods, and of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies.The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004)

Jyväskylä University Digital Archive

Crossref

Springer - Publisher Connector

PubMed Central

Brunel University Research Archive

Relating gene expression data on two-component systems to functional annotations in Escherichia coli

Abstract Background Obtaining physiological insights from microarray experiments requires computational techniques that relate gene expression data to functional information. Traditionally, this has been done in two consecutive steps. The first step identifies important genes through clustering or statistical techniques, while the second step assigns biological functions to the identified groups. Recently, techniques have been developed that identify such relationships in a single step. Results We have developed an algorithm that relates patterns of gene expression in a set of microarray experiments to functional groups in one step. Our only assumption is that patterns co-occur frequently. The effectiveness of the algorithm is demonstrated as part of a study of regulation by two-component systems in <it>Escherichia coli</it>. The significance of the relationships between expression data and functional annotations is evaluated based on density histograms that are constructed using product similarity among expression vectors. We present a biological analysis of three of the resulting functional groups of proteins, develop hypotheses for further biological studies, and test one of these hypotheses experimentally. A comparison with other algorithms and a different data set is presented. Conclusion Our new algorithm is able to find interesting and biologically meaningful relationships, not found by other algorithms, in previously analyzed data sets. Scaling of the algorithm to large data sets can be achieved based on a theoretical model.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

BicPAM: Pattern-based biclustering for biomedical data analysis

Author: A Ben-Dor
A Donders
A Patrikainen
A Prelić
A Rosenwald
A Serin
A Tanay
AP Bracken
AP Gasch
AP Lee
B Pontes
C Ding
C Tang
D Bozdağ
D Burdick
D Martin
D Wlodkowic
D Xin
F Pan
F Pan
G Bebek
G Getz
G Pandey
G Ramesh
GF Berriz
H Wang
J Bellay
J Han
J Han
J Ihmels
J Liu
J Munkres
JA Hartigan
K Sequeira
L Lazzeroni
L Zhang
M de Souto
M Mahfouz
M Safran
M Steinbach
MJ Zaki
MJ Zaki
MT Doolin
N Pasquier
O Odibat
O Troyanskaya
P Carmona-Saez
R Agrawal
R Agrawal
R Alves
R Gupta
R Henriques
R Henriques
R Henriques
R Martinez
Rui Henriques
S Barkow
S Busygin
S Hochreiter
S Madeira
Sara C Madeira
SC Madeira
T Hellem
T Uno
W Lee
X Yan
Y Huang
Y Nakagawa
Y Okada
Y Okada
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Crossref