
    UNCLES: Method for the identification of genes differentially consistently co-expressed in a specific subset of datasets

    Background: Collective analysis of the rapidly growing number of gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets. Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method can identify the subsets of genes consistently co-expressed in one subset of datasets while being poorly co-expressed in another subset of datasets, and can also identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth knowledge of synthetic data with the realistic expression values of real data, and therefore overcomes the problem of faithfulness in synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods and, of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets, where it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn. Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies. Funding: The National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0310-1004).

    Oscillatory Dynamics of Cell Cycle Proteins in Single Yeast Cells Analyzed by Imaging Cytometry

    Progression through the cell division cycle is orchestrated by a complex network of interacting genes and proteins. Some of these proteins are known to fluctuate periodically during the cell cycle, but a systematic study of the fluctuations of a broad sample of cell-cycle proteins has not been made until now. Using time-lapse fluorescence microscopy, we profiled 16 strains of budding yeast, each containing GFP fused to a single gene involved in cell cycle regulation. The dynamics of protein abundance and localization were characterized by extracting the amplitude, period, and other indicators from a series of images. Oscillations of protein abundance could clearly be identified for Cdc15, Clb2, Cln1, Cln2, Mcm1, Net1, Sic1, and Whi5. The period of oscillation of the fluorescently tagged proteins is generally in good agreement with the inter-bud time. The very strong oscillations of Net1 and Mcm1 expression are remarkable since little is known about the temporal expression of these genes. By collecting data from large samples of single cells, we quantified some aspects of cell-to-cell variability, presumably due to intrinsic and extrinsic noise affecting the cell cycle.

    Paradigm of tunable clustering using binarization of consensus partition matrices (Bi-CoPaM) for gene discovery

    Copyright © 2013 Abu-Jamous et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and being not assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies. National Institute for Health Researc
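
    The aggregate-then-binarize idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the single value threshold are assumptions made for illustration, and the sketch assumes the clusters have already been matched across runs (row k means the same cluster in every partition), which a real pipeline would have to ensure.

```python
def consensus_partition_matrix(partitions):
    """Average several (clusters x genes) partition matrices into one fuzzy matrix.

    Assumption: row k refers to the same cluster in every partition.
    """
    n_runs = len(partitions)
    n_clusters = len(partitions[0])
    n_genes = len(partitions[0][0])
    return [
        [sum(p[k][g] for p in partitions) / n_runs for g in range(n_genes)]
        for k in range(n_clusters)
    ]


def binarize(copam, threshold):
    """Hypothetical value-threshold rule: gene g joins cluster k when its
    consensus membership is at least `threshold`."""
    return [[membership >= threshold for membership in row] for row in copam]


# Three made-up clustering runs over 4 genes and 2 clusters (1 = assigned).
runs = [
    [[1, 1, 0, 0], [0, 0, 1, 1]],
    [[1, 0, 0, 0], [0, 1, 1, 1]],
    [[1, 1, 0, 1], [0, 0, 1, 0]],
]
copam = consensus_partition_matrix(runs)
tight = binarize(copam, 1.0)  # keeps only unanimous assignments
wide = binarize(copam, 0.3)   # keeps any membership of at least 0.3
```

    With the strict threshold, the two genes on which the runs disagree end up in no cluster; with the loose threshold, the same two genes belong to both clusters. These correspond to the tight and wide cases the abstract describes.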

    HeatMapper: powerful combined visualization of gene expression profile correlations, genotypes, phenotypes and sample characteristics

    BACKGROUND: Data obtained by unsupervised analysis of large-scale expression profiling studies are currently often interpreted by visually combining sample-gene heatmaps and sample characteristics. This method is not optimal for comparing individual samples or groups of samples. Here, we describe an approach to visually integrate the results of unsupervised and supervised cluster analysis using a correlation plot and additional sample metadata. RESULTS: We have developed a tool called the HeatMapper that provides such visualizations in a dynamic and flexible manner and is available from . CONCLUSION: The HeatMapper allows an accessible and comprehensive visualization of the results of gene expression profiling and cluster analysis.

    Coordinated Expression Domains in Mammalian Genomes

    Gene order in eukaryotic genomes is not random. Genes showing similar expression (coexpression) patterns are often clustered along the genome. The goal of this study is to characterize coexpression clustering in mammalian genomes and to investigate the underlying mechanisms. We detect clustering of coexpressed genes across multiple scales, from neighboring genes to chromosomal domains that span tens of megabases and, in some cases, entire chromosomes. Coexpression domains may be positively or negatively correlated with other domains, within and between chromosomes. We find that long-range expression domains are associated with gene density, which in turn is related to physical organization of the chromosomes within the nucleus. We show that gene expression changes between healthy and diseased tissue samples occur in a gene density-dependent manner. We demonstrate that coexpression domains exist across multiple scales. We identify potential mechanisms for short-range as well as long-range coexpression domains. We provide evidence that the three-dimensional architecture of the chromosomes may underlie long-range coexpression domains. Chromosome territory reorganization may play a role in common human diseases such as Alzheimer's disease and psoriasis.

    Expression of Human nPTB Is Limited by Extreme Suboptimal Codon Content

    Background: The frequency of synonymous codon usage varies widely between organisms. Suboptimal codon content limits expression of viral, experimental or therapeutic heterologous proteins due to limiting cognate tRNAs. Codon content is therefore often adjusted to match codon bias of the host organism. Codon content also varies between genes within individual mammalian species. However, little attention has been paid to the consequences of codon content upon translation of host proteins. Methodology/Principal Findings: In comparing the splicing repressor activities of transfected human PTB and its two tissue-restricted paralogs, nPTB and ROD1, we found that the three proteins were expressed at widely varying levels. nPTB was expressed at 1–3% of the level of PTB despite similar levels of mRNA expression and 74% amino acid identity. The low nPTB expression was due to the high proportion of codons with A or U at the third codon position, which are suboptimal in human mRNAs. Optimization of the nPTB codon content, akin to the "humanization" of foreign ORFs, allowed efficient translation in vivo and in vitro to levels comparable with PTB. We were then able to demonstrate that all three proteins act as splicing repressors. Conclusions/Significance: Our results provide a striking illustration of the importance of mRNA codon content in determining levels of protein expression, even within cells of the natural host species.
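
    The quantity this abstract points to, the proportion of codons with A or U at the third position, is straightforward to compute. The sketch below is only an illustration: the function name is hypothetical, the input is assumed to be an in-frame coding sequence, and the example sequences are made-up toy strings, not the real PTB or nPTB ORFs.

```python
def third_position_au_fraction(orf):
    """Return the fraction of codons in `orf` whose third base is A or U.

    Accepts RNA or DNA letters (T is treated as U). Assumes the sequence
    is in frame, so its length must be a non-zero multiple of 3.
    """
    seq = orf.upper().replace("T", "U")
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    au_ending = sum(1 for codon in codons if codon[2] in "AU")
    return au_ending / len(codons)


# Toy sequences for illustration only.
au_rich = third_position_au_fraction("GCAGAUAAAUUA")  # GCA GAU AAA UUA
gc_rich = third_position_au_fraction("GCCGACAAGCUG")  # GCC GAC AAG CUG
```

    Comparing this fraction between two coding sequences gives a rough first indication of how far their codon content differs at the third position; it does not by itself predict expression levels.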

    Time warping of evolutionary distant temporal gene expression data based on noise suppression

    Background: Comparative analysis of genome-wide temporal gene expression data has a broad potential area of application, including evolutionary biology, developmental biology, and medicine. However, at large evolutionary distances, the construction of global alignments and the consequent comparison of the time-series data are difficult. The main reason is the accumulation of variability in expression profiles of orthologous genes in the course of evolution. Results: We applied Pearson distance matrices, in combination with other noise-suppression techniques and data filtering, to improve alignments. This novel framework enhanced the capacity to capture the similarities between temporal gene expression datasets separated by large evolutionary distances. We aligned and compared the temporal gene expression data in budding (Saccharomyces cerevisiae) and fission (Schizosaccharomyces pombe) yeast, which are separated by more than ~400 million years of evolution. We found that the global alignment (time warping) properly matched the duration of cell cycle phases in these distant organisms, which was measured in prior studies. At the same time, when applied to individual ortholog pairs, this alignment procedure revealed groups of genes with distinct alignments, different from the global alignment. Conclusion: Our alignment-based predictions of differences in the cell cycle phases between the two yeast species were in good agreement with the existing data, thus supporting the computational strategy adopted in this study. We propose that the existence of the alternative alignments, specific to distinct groups of genes, suggests the presence of different synchronization modes between the two organisms and possible functional decoupling of particular physiological gene networks in the course of evolution.
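
    As a rough illustration of what a Pearson distance matrix between two time series might look like, the sketch below scores every pair of time points, one from each dataset, by 1 - r, where r is the Pearson correlation over a shared, identically ordered list of ortholog pairs. The 1 - r definition, the function names, and the toy values are assumptions; the abstract does not state the exact formulation, and the alignment (time warping) step itself is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def pearson_distance_matrix(series_a, series_b):
    """Entry [i][j] is 1 - r between time point i of A and time point j of B.

    Each time point is a list of expression values over the same ordered
    ortholog pairs (an assumption of this sketch).
    """
    return [[1.0 - pearson(a, b) for b in series_b] for a in series_a]


# Made-up values: two time points per organism, three ortholog pairs each.
organism_a = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
organism_b = [[2.0, 4.0, 6.0], [6.0, 4.0, 2.0]]
distances = pearson_distance_matrix(organism_a, organism_b)
```

    Small entries mark pairs of time points with similar expression patterns; an alignment procedure would then look for a low-cost path through such a matrix.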

    Warming Can Boost Denitrification Disproportionately Due to Altered Oxygen Dynamics

    Background: Global warming and the alteration of the global nitrogen cycle are major anthropogenic threats to the environment. Denitrification, the biological conversion of nitrate to gaseous nitrogen, removes a substantial fraction of the nitrogen from aquatic ecosystems, and can therefore help to reduce eutrophication effects. However, potential responses of denitrification to warming are poorly understood. Although several studies have reported increased denitrification rates with rising temperature, the impact of temperature on denitrification seems to vary widely between systems. Methodology/Principal Findings: We explored the effects of warming on denitrification rates using microcosm experiments, field measurements and a simple model approach. Our results suggest that a three degree temperature rise will double denitrification rates. By performing experiments at fixed oxygen concentrations as well as with oxygen concentrations varying freely with temperature, we demonstrate that this strong temperature dependence of denitrification can be explained by a systematic decrease of oxygen concentrations with rising temperature. Warming decreases oxygen concentrations due to reduced solubility, and more importantly, because respiration rates rise more steeply with temperature than photosynthesis. Conclusions/Significance: Our results show that denitrification rates in aquatic ecosystems are strongly temperature dependent, and that this is amplified by the temperature dependencies of photosynthesis and respiration. Our result