Search CORE

53,433 research outputs found

A temporal precedence based clustering method for gene expression microarray data

Author: Buchanan-Wollaston Vicky
Krishna Ritesh V.
Li Chang-Tsun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Background: Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing if one time-series can be used for forecasting the other time-series or not. Results: A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system. Conclusions: Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits
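The pipeline described in this abstract (pairwise Granger tests, an association matrix, then graph components) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the lag of 1, and the fixed F-statistic cutoff `f_threshold` are all assumptions, and plain connected components stand in for the "highly connected components" mentioned in the abstract.

```python
import numpy as np

def rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f(x, y, lag=1):
    """F statistic for 'past x helps forecast y' beyond y's own past."""
    n = len(y) - lag
    target = y[lag:]
    # Column k holds the series shifted back by k + 1 time steps.
    own = np.column_stack([y[lag - k - 1 : len(y) - k - 1] for k in range(lag)])
    other = np.column_stack([x[lag - k - 1 : len(x) - k - 1] for k in range(lag)])
    ones = np.ones((n, 1))
    restricted = np.hstack([ones, own])        # y's own past only
    full = np.hstack([ones, own, other])       # y's past plus x's past
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = n - full.shape[1]
    return ((rss_r - rss_f) / lag) / (max(rss_f, 1e-12) / dof)

def association_matrix(expr, f_threshold=10.0, lag=1):
    """expr: (genes, timepoints). adj[i, j] is True when gene i 'Granger-causes' gene j."""
    g = expr.shape[0]
    adj = np.zeros((g, g), dtype=bool)
    for i in range(g):
        for j in range(g):
            if i != j and granger_f(expr[i], expr[j], lag) > f_threshold:
                adj[i, j] = True
    return adj

def connected_modules(adj):
    """Connected components of the undirected version of the association graph."""
    undirected = adj | adj.T
    seen, groups = set(), []
    for start in range(len(undirected)):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in np.flatnonzero(undirected[node]):
                nxt = int(nxt)
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups
```

In practice the F statistic would be converted to a p-value and corrected for multiple testing before thresholding; the fixed cutoff here only keeps the sketch dependency-free beyond NumPy.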

Deakin Research Online

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

A Platform for Processing Expression of Short Time Series (PESTS)

Author: A Conesa
A Schliep
A Sinha
AB Tchagang
Anshu Sinha
BDi Camillo
CE Bonferroni
CM Ribeiro
Dixon J Wilfrid
F Hong
IG Costa
J Ernst
J Leek
J Wang
JD Storey
JSSBueno Filho
M Ramoni
Marianthi Markatou
MB Eisen
MF Ramoni
NAC Cressie
P Tamayo
PJ Rousseeuw
S Peddada
SA Ghandhi
T Park
T Schweder
V Tusher
Y Benjamini
Z Bar-Joseph
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Time course microarray profiles examine the expression of genes over a time domain. They are necessary in order to determine the complete set of genes that are dynamically expressed under given conditions, and to determine the interaction between these genes. Because of cost and resource issues, most time series datasets contain fewer than 9 points, and there are few tools available geared towards the analysis of this type of data. Results: To this end, we introduce a platform for Processing Expression of Short Time Series (PESTS). It was designed with a focus on usability and interpretability of analyses for the researcher. As such, it implements several standard techniques for comparability as well as visualization functions. However, it is designed specifically for the unique methods we have developed for significance analysis, multiple test correction and clustering of short time series data. The central tenet of these methods is the use of biologically relevant features for analysis. Features summarize short gene expression profiles, inherently incorporate dependence across time, and allow for both full description of the examined curve and missing data points. Conclusions: PESTS is fully generalizable to other types of time series analyses. PESTS implements novel methods as well as several standard techniques for comparability and visualization functions. These features and functionality make PESTS a valuable resource for a researcher's toolkit. PESTS is available to download for free to academic and non-profit users at http://www.mailman.columbia.edu/academic-departments/biostatistics/research-service/software-development.

Crossref

Directory of Open Access Journals

PubMed Central

Animated interval scatter-plot views for the exploratory analysis of large scale microarray time-course data.

Author: Craig Paul
Cumming Andrew
Kennedy Jessie
Publication venue: SAGE Publications
Publication date: 01/01/2005

Microarray technologies are a relatively new development that allow biologists to monitor the activity of thousands of genes (normally around 8,000) in parallel across multiple stages of a biological process. While this new perspective on biological functioning is recognised as having the potential to have a significant impact on the diagnosis, treatment, and prevention of diseases, it is only through effective analysis of the data produced that biologists can begin to unlock this potential. A significant obstacle to achieving effective analysis of microarray time-course is the combined scale and complexity of the data. This inevitably makes it difficult to reveal certain significant patterns in the data. In particular, it is less dominant patterns and, specifically, patterns that occur over smaller intervals of an experiment's overall time-frame that are more difficult to find. While existing techniques are capable of finding either unexpected patterns of activity over the majority of an experiment's time-frame or expected patterns of activity over smaller intervals of the time-frame, there are no techniques, or combination of techniques, that are suitable for finding unsuspected patterns of activity over smaller intervals. In order to overcome this limitation we have developed the Time-series Explorer, which specifically supports biologists in their attempts to reveal these types of pattern by allowing them to control an animated interval scatter-plot view of their data. This paper discusses aspects of the technique that make such an animated overview viable and describes the results of a user evaluation assessing the practical utility of the technique within the wider context of microarray time-series analysis as a whole

Repository@Napier

Dynamic Analysis of High Dimensional Microarray Time Series Data Using Various Dimensional Reduction Methods

Author: Aolhasani Samareh Banafsheh
Publication venue: 'Oklahoma State University Library'
Publication date: 01/12/2013

This dissertation focuses on dynamic analysis of reduced dimension models of two microarray time series datasets. The underlying research pursues two main objectives: (1) comparing various dimension reduction techniques on time series microarray data, and (2) estimating autoregressive coefficients using several penalized regression methods such as ridge, SCAD, and lasso. The research methodology includes two tasks. First, several dimension reduction methods are applied to two microarray datasets, and the resulting models are compared on accuracy and computation cost. Second, the sparse vector autoregressive (SVAR) model is applied to estimate gene regulatory networks from the gene expression profiles of time series microarray experiments on the two datasets; the autoregressive coefficients are estimated using several penalized regression methods, and the regression methods are then compared for each dimension reduction model. Study results show that the dimension reduction methods producing orthogonal independent variables perform better, because orthogonality leads to reasonable coefficient estimation with low standard errors. Regarding dynamic analysis, factor analysis (FA) outperformed the other dimension reduction methods with regard to goodness of fit after several penalized regression methods were applied to each model. This is attributed to the varimax rotation used in FA, which sets most of the coordinates closer to zero and in turn makes the data sparser, hence inducing additional sparsity subject to maintaining a certain goodness of fit. Industrial Engineering & Managemen
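As a rough illustration of penalized autoregressive coefficient estimation, the sketch below fits a first-order vector autoregression with a ridge penalty, which has a closed-form solution. It is an assumption-laden example rather than the dissertation's code: the function name `ridge_var1`, the lag order of 1, and the absence of an intercept (the series is assumed to be centered) are all choices made here. Ridge shrinks coefficients but does not zero them out; the lasso and SCAD penalties named in the abstract require iterative solvers and are not shown.

```python
import numpy as np

def ridge_var1(series, lam=1.0):
    """Ridge estimate of A in x_t ≈ A @ x_{t-1}.

    series: array of shape (T, p), one row per time point (assumed centered).
    lam:    ridge penalty strength (hypothetical default).
    Returns a (p, p) coefficient matrix.
    """
    past, present = series[:-1], series[1:]
    p = series.shape[1]
    # Normal equations with the ridge term added to the Gram matrix.
    gram = past.T @ past + lam * np.eye(p)
    # Solving gives B with present ≈ past @ B, so A is the transpose of B.
    return np.linalg.solve(gram, past.T @ present).T
```

Entry `A[i, j]` can be read as the estimated influence of variable `j` at time `t-1` on variable `i` at time `t`, which is how such coefficients are typically interpreted as network edges.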

SHAREOK repository

Microarray image processing: A novel neural network framework

Author: Zineddin Bachar
Publication venue: Brunel University, School of Information Systems, Computing and Mathematics
Publication date: 01/01/2011

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Due to the vast success of bioengineering techniques, a series of large-scale analysis tools has been developed to discover the functional organization of cells. Among them, cDNA microarray has emerged as a powerful technology that enables biologists to study thousands of genes simultaneously within an entire organism, and thus obtain a better understanding of the gene interaction and regulation mechanisms involved. Although microarray technology has been developed so as to offer high tolerances, there exists high signal irregularity through the surface of the microarray image. The imperfection in the microarray image generation process causes noise of many types, which contaminates the resulting image. These errors and noise propagate through, and can significantly affect, all subsequent processing and analysis. Therefore, to realize the potential of such technology it is crucial to obtain high quality image data that would indeed reflect the underlying biology in the samples. One of the key steps in extracting information from a microarray image is segmentation: identifying which pixels within an image represent which gene. This area of spotted microarray image analysis has received relatively little attention compared with the advances in subsequent analysis stages. However, the lack of advanced image analysis, including segmentation, results in sub-optimal data being used in all downstream analysis methods. Although there has recently been much research on microarray image analysis and many methods have been proposed, some methods produce better results than others. In general, the most effective approaches require considerable run time (processing) power to process an entire image. Furthermore, there has been little progress on developing sufficiently fast yet efficient and effective algorithms for the segmentation of the microarray image by using a highly sophisticated framework such as Cellular Neural Networks (CNNs). It is, therefore, the aim of this thesis to investigate and develop novel methods for processing microarray images. The goal is to produce results that outperform the currently available approaches in terms of PSNR, k-means and ICC measurements. Aleppo University, Syri

Brunel University Research Archive

RNA-seq vs dual- and single-channel microarray data: sensitivity analysis for differential expression and clustering

Author: Crane Martin
Kerr Gráinne
Ruskin Heather J.
Sîrbu Alina
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

With the fast development of high-throughput sequencing technologies, a new generation of genome-wide gene expression measurements is under way. This is based on mRNA sequencing (RNA-seq), which complements the already mature technology of microarrays, and is expected to overcome some of the latter’s disadvantages. These RNA-seq data pose new challenges, however, as strengths and weaknesses have yet to be fully identified. Ideally, Next (or Second) Generation Sequencing measures can be integrated for more comprehensive gene expression investigation to facilitate analysis of whole regulatory networks. At present, however, the nature of these data is not very well understood. In this paper we study three alternative gene expression time series datasets for the Drosophila melanogaster embryo development, in order to compare three measurement techniques: RNA-seq, single-channel and dual-channel microarrays. The aim is to study the state of the art for the three technologies, with a view of assessing overlapping features, data compatibility and integration potential, in the context of time series measurements. This involves using established tools for each of the three different technologies, and technical and biological replicates (for RNA-seq and microarrays, respectively), due to the limited availability of biological RNA-seq replicates for time series data. The approach consists of a sensitivity analysis for differential expression and clustering. In general, the RNA-seq dataset displayed highest sensitivity to differential expression. The single-channel data performed similarly for the differentially expressed genes common to gene sets considered. Cluster analysis was used to identify different features of the gene space for the three datasets, with higher similarities found for the RNA-seq and single-channel microarray dataset

Crossref

Directory of Open Access Journals

Irish Universities

Archivio della Ricerca - Università di Pisa

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

DCU Online Research Access Service

FigShare

Clustering Algorithms for Time Series Gene Expression in Microarray Data

Author: Zhang Guilin
Publication venue: 'University of North Texas Libraries'
Publication date: 01/08/2012

Clustering techniques are important for gene expression data analysis. However, efficient computational algorithms for clustering time-series data are still lacking. This work documents two improvements on an existing profile-based greedy algorithm for short time-series data: the first is a scaling method applied during pre-processing of the raw data to handle some extreme cases; the second is a modified strategy for generating better clusters. Simulation data and real microarray data were used to evaluate these improvements; this approach could efficiently generate more accurate clusters. A new feature-based algorithm was also developed in which steady state value, overshoot, rise time, settling time and peak time are generated by the 2nd order control system for the clustering purpose. This feature-based approach is much faster and more accurate than the existing profile-based algorithm for long time-series data
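The five features named in this abstract are standard step-response descriptors from control theory. The sketch below reads them directly off a sampled expression profile, which is a simplification: the abstract says the features come from a second-order control system, and no model fitting is attempted here. The function name `step_features`, the 10%-90% rise-time convention, the 5% settling band, and the use of the last sample as the steady state are all assumptions, and the profile is assumed to settle at a positive value.

```python
def step_features(times, values, band=0.05):
    """Step-response style features of one sampled profile.

    times, values: equal-length sequences; values[-1] is taken as the
    steady state and is assumed to be positive.
    band: relative tolerance used for the settling time (hypothetical default).
    """
    steady = values[-1]
    peak_idx = max(range(len(values)), key=lambda i: values[i])
    # Relative overshoot of the peak above the steady state.
    overshoot = (values[peak_idx] - steady) / steady
    # Rise time: first crossing of 10% to first crossing of 90% of steady state.
    t10 = next(t for t, v in zip(times, values) if v >= 0.1 * steady)
    t90 = next(t for t, v in zip(times, values) if v >= 0.9 * steady)
    # Settling time: first sample after the last excursion outside the band.
    settle_idx = 0
    for i, v in enumerate(values):
        if abs(v - steady) > band * abs(steady):
            settle_idx = i + 1
    settle_idx = min(settle_idx, len(times) - 1)
    return {
        "steady_state": steady,
        "overshoot": overshoot,
        "rise_time": t90 - t10,
        "settling_time": times[settle_idx],
        "peak_time": times[peak_idx],
    }
```

Each gene's profile then reduces to a five-element feature vector, and any standard clustering routine can be run on those vectors instead of on the raw time series.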

UNT Digital Library

Bayesian meta-analysis for identifying periodically expressed genes in fission yeast cell cycle

Author: Fan Xiaodan
Liu Jun S.
Pyne Saumyadipta
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 09/11/2010

The effort to identify genes with periodic expression during the cell cycle from genome-wide microarray time series data has been ongoing for a decade. However, the lack of rigorous modeling of periodic expression as well as the lack of a comprehensive model for integrating information across genes and experiments has impaired the effort for the accurate identification of periodically expressed genes. To address the problem, we introduce a Bayesian model to integrate multiple independent microarray data sets from three recent genome-wide cell cycle studies on fission yeast. A hierarchical model was used for data integration. In order to facilitate an efficient Monte Carlo sampling from the joint posterior distribution, we develop a novel Metropolis-Hastings group move. A surprising finding from our integrated analysis is that more than 40% of the genes in fission yeast are significantly periodically expressed, greatly exceeding the 10-15% of the genes reported in the current literature. It calls for a reconsideration of the periodically expressed gene detection problem. Comment: Published at http://dx.doi.org/10.1214/09-AOAS300 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref

Time-series Explorer: An Animated Information Visualisation for Microarray Time-course Data

Author: Craig Paul
Cumming Andrew
Kennedy Jessie
Publication venue: BMC
Publication date: 01/07/2005

Microarray technologies are a relatively new development that allow biologists to monitor the activity of thousands of genes (normally around 8,000) in parallel across multiple stages of a biological process. While this new perspective on biological functioning is recognised as having the potential to have a significant impact on the diagnosis, treatment, and prevention of diseases, it is only through effective analysis of the data produced that biologists can begin to unlock this potential. A significant obstacle to achieving effective analysis of microarray time-course data is the combined scale and complexity of the data. This inevitably makes it difficult to reveal certain significant patterns in the data. In particular, it is less dominant patterns and, specifically, patterns that occur over smaller intervals of an experiment's overall time-frame that are more difficult to find. While existing techniques are capable of finding either unexpected patterns of activity over the majority of an experiment's time-frame or expected patterns of activity over smaller intervals of the time-frame, there are no techniques, or combination of techniques, that are suitable for finding unsuspected patterns of activity over smaller intervals. In order to overcome this limitation we have developed the Time-series Explorer, which specifically supports biologists in their attempts to reveal these types of pattern by allowing them to visualise their data by controlling an animated interval scatter-plot linked to two complementary graph views. An evaluation, involving biologists working with real data, tested the extent of the tool's desired functionality and assessed the technique's practical utility within the wider context of microarray time-course analysis. This proved the technique not only capable of revealing previously unsuspected temporal patterns but also, in certain cases, more appropriate for finding previously suspected patterns and patterns that occurred over the majority of the time-frame

Springer - Publisher Connector

Repository@Napier

Towards knowledge-based gene expression data mining

Author: Bellazzi Riccardo
Zupan Blaz
Publication venue
Publication date: 01/01/2007

The field of gene expression data analysis has grown in the past few years from being purely data-centric to integrative, aiming at complementing microarray analysis with data and knowledge from diverse available sources. In this review, we report on the plethora of gene expression data mining techniques and focus on their evolution toward knowledge-based data analysis approaches. In particular, we discuss recent developments in gene expression-based analysis methods used in association and classification studies, phenotyping and reverse engineering of gene networks

Elsevier - Publisher Connector

ePrints.FRI