
37 research outputs found

Techniques for clustering gene expression data

Authors: Crane Martin; Doolan Padraig; Kerr Gráinne; Ruskin Heather J.
Publication venue: Elsevier BV
Publication date: 01/01/2007

Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of a suitable method (or methods) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.

CiteSeerX

Irish Universities

DCU Online Research Access Service

Edge-weighting of gene expression graphs

Authors: Bourqui R.; Bryan K.; Carlson M. R.; Chebyshev P. L.; Di Gesu V.; Shamir R.; Tanay A.; Zhang B.
Publication venue: World Scientific Pub Co Pte Lt
Publication date: 01/01/2009

In recent years, considerable research effort has been directed to microarray technologies and their role in providing simultaneous information on expression profiles for thousands of genes. These data, when subjected to clustering and classification procedures, can assist in identifying patterns and providing insight into biological processes. To understand the properties of complex gene expression datasets, graphical representations can be used. Intuitively, the data can be represented as a bipartite graph, with weighted edges corresponding to gene-sample node pairs in the dataset. Biologically meaningful subgraphs can be sought, but performance can be influenced by both the search algorithm and the graph-weighting scheme, and both merit rigorous investigation. In this paper, we focus on edge-weighting schemes for bipartite graphical representation of gene expression. Two novel methods are presented: the first is based on empirical evidence; the second on a geometric distribution. The schemes are compared for several real datasets, assessing efficiency of performance based on four essential properties: robustness to noise and missing values, discrimination, parameter influence on scheme efficiency, and reusability. Recommendations and limitations are briefly discussed.

Crossref

Irish Universities

Queensland University of Technology ePrints Archive

DCU Online Research Access Service

Identifying functional relationships within sets of co-expressed genes by combining upstream regulatory motif analysis and gene expression information

Authors: Gross Robert H; Martyanov Viktor
Publication venue: BioMed Central
Publication date: 01/01/2010

Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.

Crossref

Springer - Publisher Connector

PubMed Central

Dartmouth Digital Commons (Dartmouth College)

RNA-seq vs dual- and single-channel microarray data: sensitivity analysis for differential expression and clustering

Authors: Crane Martin; Kerr Gráinne; Ruskin Heather J.; Sîrbu Alina
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

With the fast development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. This is based on mRNA sequencing (RNA-seq), which complements the already mature technology of microarrays, and is expected to overcome some of the latter’s disadvantages. These RNA-seq data pose new challenges, however, as strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measures can be integrated for more comprehensive gene expression investigation to facilitate analysis of whole regulatory networks. At present, however, the nature of these data is not very well understood. In this paper we study three alternative gene expression time series datasets for Drosophila melanogaster embryo development, in order to compare three measurement techniques: RNA-seq, single-channel and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view to assessing overlapping features, data compatibility and integration potential, in the context of time series measurements. This involves using established tools for each of the three different technologies, and technical and biological replicates (for RNA-seq and microarrays, respectively), due to the limited availability of biological RNA-seq replicates for time series data. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed the highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to the gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found for the RNA-seq and single-channel microarray datasets.

Crossref

Directory of Open Access Journals

Irish Universities

Archivio della Ricerca - Università di Pisa

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

DCU Online Research Access Service

FigShare

Making Informed Choices about Microarray Data Analysis

Authors: A Kauffmann; A Ploner; A Reiner; AC Culhane; AC Eklund; BE Stranger; BM Bolstad; BP Durbin; C Li; CM Perou; DK Slonim; F Bretz; Fran Lewitter; G Kerr; GA Churchill; GK Smyth; GP Page; HM Kang; I Lonnstedt; J Hou; J Leek; JC Marioni; JD Storey; JF Ayroles; JH Do; KV Mardia; LM Cope; M Dai; M Reimers; M Suarez-Farinas; Mark Reimers; MC Ryan; ME Figueroa; ME Ritchie; NR Garge; R Gentleman; RA Irizarry; RA Johnson; S Dudoit; T Hastie; TL Fare; W Huber; WE Johnson; WK Lim; WS Branham; X Cui; Y Benjamini; YH Yang
Publication venue: Public Library of Science
Publication date: 01/01/2010

This article describes the typical stages in the analysis of microarray data for non-specialist researchers in systems biology and medicine. Particular attention is paid to significant data analysis issues that are commonly encountered among practitioners, some of which need wider airing. The issues addressed include experimental design, quality assessment, normalization, and summarization of multiple-probe data. This article is based on the ISMB 2008 tutorial on microarray data analysis. An expanded version of the material in this article and the slides from the tutorial can be found at http://www.people.vcu.edu/~mreimers/OGMDA/index.html

Crossref

Directory of Open Access Journals

PubMed Central

VCU Scholars Compass