12,665 research outputs found
Extracting biologically significant patterns from short time series gene expression data
Background: Time series gene expression data analysis is used widely to study the dynamics of various cell processes. Most of the time series data available today consist of only a few time points, which makes the application of standard clustering techniques difficult. Results: We developed two new algorithms that are capable of extracting biological patterns from short time series gene expression data. The two algorithms, ASTRO and MiMeSR, are inspired by the rank order preserving framework and the minimum mean squared residue approach, respectively. However, ASTRO and MiMeSR differ from previous approaches in that they take advantage of the relatively small number of time points in order to reduce the problem from NP-hard to linear. Tested on well-defined short time expression data, we found that our approaches are robust to noise, as well as to random patterns, and that they can correctly detect the temporal expression profile of relevant functional categories. Evaluation of our methods was performed using Gene Ontology (GO) annotations and chromatin immunoprecipitation (ChIP-chip) data. Conclusion: Our approaches generally outperform both standard clustering algorithms and algorithms designed specifically for clustering of short time series gene expression data. Both algorithms are available at http://www.benoslab.pitt.edu/astro/.
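As a rough illustration of why a small number of time points can make order-based grouping cheap, the sketch below groups genes by the rank order of their expression values. It is not the ASTRO algorithm itself; the function name `group_by_rank_order` and the toy profiles are assumptions made for this example.

```python
from collections import defaultdict


def group_by_rank_order(profiles):
    """Group genes whose expression values share the same rank order over time.

    profiles: dict mapping gene name -> list of expression values,
              one value per time point.
    Returns a dict mapping rank-order tuple -> list of gene names.
    """
    groups = defaultdict(list)
    for gene, values in profiles.items():
        # Time-point indices sorted by expression value form the order pattern.
        order = tuple(sorted(range(len(values)), key=lambda t: values[t]))
        groups[order].append(gene)
    return dict(groups)


# Hypothetical toy data with four time points per gene.
toy = {
    "geneA": [0.1, 0.5, 0.9, 0.3],
    "geneB": [1.0, 2.0, 4.0, 1.5],
    "geneC": [0.9, 0.4, 0.2, 0.1],
}
groups = group_by_rank_order(toy)
```

With a fixed, small number of time points, the per-gene sort costs a constant amount of work, so this single pass scales linearly with the number of genes. Here `geneA` and `geneB` share an order pattern even though their absolute expression levels differ.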
Short time-series microarray analysis: Methods and challenges
The detection and analysis of steady-state gene expression has become routine. Time-series microarrays are of growing interest to systems biologists for deciphering the dynamic nature and complex regulation of biosystems. Most temporal microarray data contain only a limited number of time points, giving rise to short-time-series data, which pose challenges for traditional methods of extracting meaningful information. Obtaining useful information from the wealth of short-time-series data requires addressing the problems that arise from limited sampling. Current efforts have shown promise in improving the analysis of short-time-series microarray data, although challenges remain. This commentary addresses recent advances in methods for short-time-series analysis, including simplification-based approaches and the integration of multi-source information. Nevertheless, further studies and development of computational methods are needed to provide practical solutions that fully exploit the potential of these data
Transcription factor target prediction using multiple short expression time series from Arabidopsis thaliana
BACKGROUND: The central role of transcription factors (TFs) in higher eukaryotes has led to much interest in deciphering transcriptional regulatory interactions. Even in the best case, experimental identification of TF target genes is error prone, and has been shown to be improved by considering additional forms of evidence such as expression data. Previous expression based methods have not explicitly tried to associate TFs with their targets and therefore largely ignored the treatment specific and time dependent nature of transcription regulation. RESULTS: In this study we introduce CERMT, Covariance based Extraction of Regulatory targets using Multiple Time series. Using simulated and real data we show that using multiple expression time series, selecting treatments in which the TF responds, allowing time shifts between TFs and their targets and using covariance to identify highly responding genes appear to be a good strategy. We applied our method to published TF - target gene relationships determined using expression profiling on TF mutants and show that in most cases we obtain significant target gene enrichment and in half of the cases this is sufficient to deliver a usable list of high-confidence target genes. CONCLUSION: CERMT could be immediately useful in refining possible target genes of candidate TFs using publicly available data, particularly for organisms lacking comprehensive TF binding data. In the future, we believe its incorporation with other forms of evidence may improve integrative genome-wide predictions of transcriptional networks
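The idea of allowing time shifts between a TF and its targets while scoring them by covariance can be sketched as follows. This is a minimal illustration under stated assumptions, not the CERMT implementation: the function names, the `max_shift` parameter, and the toy series are all hypothetical.

```python
def covariance(x, y):
    """Sample covariance of two equally long sequences (n - 1 denominator)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def best_shifted_covariance(tf, target, max_shift=1):
    """Return (shift, covariance) for the lag with the largest covariance.

    A shift of k compares the TF at time t with the target at time t + k,
    i.e. the target is allowed to respond k time points after the TF.
    """
    best = None
    for shift in range(max_shift + 1):
        x = tf[: len(tf) - shift]
        y = target[shift:]
        if len(x) < 2:  # too few overlapping points to compute a covariance
            break
        c = covariance(x, y)
        if best is None or c > best[1]:
            best = (shift, c)
    return best


# Hypothetical toy series: the target follows the TF one time point later.
tf_series = [0, 1, 2, 1, 0]
target_series = [0, 0, 1, 2, 1]
shift, cov = best_shifted_covariance(tf_series, target_series, max_shift=1)
```

Each shift shortens the overlap between the two series, which matters for short time series: in this sketch a shift of one leaves only four paired points out of five.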
Coupled Two-Way Clustering Analysis of Gene Microarray Data
We present a novel coupled two-way clustering approach to gene microarray
data analysis. The main idea is to identify subsets of the genes and samples,
such that when one of these is used to cluster the other, stable and
significant partitions emerge. The search for such subsets is a computationally
complex task: we present an algorithm, based on iterative clustering, which
performs such a search. This analysis is especially suitable for gene
microarray data, where the contributions of a variety of biological mechanisms
to the gene expression levels are entangled in a large body of experimental
data. The method was applied to two gene microarray data sets, on colon cancer
and leukemia. By identifying relevant subsets of the data and focusing on them
we were able to discover partitions and correlations that were masked and
hidden when the full dataset was used in the analysis. Some of these partitions
have clear biological interpretation; others can serve to identify possible
directions for future research
Positional information, positional error, and read-out precision in morphogenesis: a mathematical framework
The concept of positional information is central to our understanding of how
cells in a multicellular structure determine their developmental fates.
Nevertheless, positional information has neither been defined mathematically
nor quantified in a principled way. Here we provide an information-theoretic
definition in the context of developmental gene expression patterns and examine
which features of expression patterns increase or decrease positional
information. We connect positional information with the concept of positional
error and develop tools to directly measure information and error from
experimental data. We illustrate our framework for the case of gap gene
expression patterns in the early Drosophila embryo and show how information
that is distributed among only four genes is sufficient to determine
developmental fates with single cell resolution. Our approach can be
generalized to a variety of different model systems; procedures and examples
are discussed in detail
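One common information-theoretic quantity for this kind of question is the mutual information between position and expression level. The sketch below computes it from a discretized joint distribution. The abstract does not spell out its definition, so treating positional information as mutual information is an assumption here, as are the function name and the toy tables.

```python
import math


def mutual_information_bits(joint):
    """Mutual information (in bits) of a discrete joint distribution.

    joint[i][j] is the probability of position bin i together with
    expression-level bin j; all entries are assumed to sum to 1.
    """
    p_pos = [sum(row) for row in joint]          # marginal over expression bins
    p_expr = [sum(col) for col in zip(*joint)]   # marginal over position bins
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (p_pos[i] * p_expr[j]))
    return mi


# Hypothetical toy cases with two position bins and two expression bins.
perfect = [[0.5, 0.0], [0.0, 0.5]]        # expression fully determines position
independent = [[0.25, 0.25], [0.25, 0.25]]  # expression says nothing about position
mi_perfect = mutual_information_bits(perfect)
mi_independent = mutual_information_bits(independent)
```

In the first toy table the expression level distinguishes the two positions exactly, giving one bit; in the second it carries no information about position.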
Altered gene expression and DNA damage in peripheral blood cells from Friedreich's ataxia patients: Cellular model of pathology
The neurodegenerative disease Friedreich's ataxia (FRDA) is the most common autosomal-recessively inherited ataxia and is caused by a GAA triplet repeat expansion in the first intron of the frataxin gene. In this disease, transcription of frataxin, a mitochondrial protein involved in iron homeostasis, is impaired, resulting in a significant reduction in mRNA and protein levels. Global gene expression analysis was performed in peripheral blood samples from FRDA patients as compared to controls, which suggested altered expression patterns pertaining to genotoxic stress. We then confirmed the presence of genotoxic DNA damage by using a gene-specific quantitative PCR assay and discovered an increase in both mitochondrial and nuclear DNA damage in the blood of these patients (p<0.0001, respectively). Additionally, frataxin mRNA levels correlated with age of onset of disease and displayed unique sets of gene alterations involved in immune response, oxidative phosphorylation, and protein synthesis. Many of the key pathways observed by transcription profiling were downregulated, and we believe these data suggest that patients with prolonged frataxin deficiency undergo a systemic survival response to chronic genotoxic stress and consequent DNA damage detectable in blood. In conclusion, our results yield insight into the nature and progression of FRDA, as well as possible therapeutic approaches. Furthermore, the identification of potential biomarkers, including the DNA damage found in peripheral blood, may have predictive value in future clinical trials
Experimental and computational applications of microarray technology for malaria eradication in Africa
Mutation-assisted drug resistance that has evolved in Plasmodium falciparum strains and insecticide
resistance in female Anopheles mosquitoes are major biomedical obstacles standing against
all efforts to eradicate malaria in Sub-Saharan Africa. Malaria is endemic in more than 100 countries and is
by far the most costly disease in terms of human health, causing major losses among many African
nations, including Nigeria. The fight against malaria is failing, and DNA microarray analysis needs to keep
up the pace in order to unravel the evolving parasite’s gene expression profile, which is a pointer for
monitoring the genes involved in malaria’s infective metabolic pathway. Huge amounts of data are generated, and
biologists face the challenge of extracting useful information from volumes of microarray data.
Expression levels for tens of thousands of genes can be measured simultaneously in a single
hybridization experiment and are collectively called a “gene expression profile”. Gene expression
profiles can also be used to study the various states of malaria development: expression profiles
of different disease states at different time points are collected and compared to each other to establish
a classification scheme for purposes such as diagnosis and treatment with adequate drugs. This paper
examines microarray technology and its application, as supported by appropriate software tools, from
experimental set-up to the level of data analysis. It also assesses the level of microarray technology
in Africa, its availability, and the techniques required for malaria eradication and effective healthcare in
Nigeria and Africa in general
- …