147 research outputs found

    Transcription factor target prediction using multiple short expression time series from Arabidopsis thaliana

    BACKGROUND: The central role of transcription factors (TFs) in higher eukaryotes has led to much interest in deciphering transcriptional regulatory interactions. Even in the best case, experimental identification of TF target genes is error-prone, and has been shown to be improved by considering additional forms of evidence such as expression data. Previous expression-based methods have not explicitly tried to associate TFs with their targets and have therefore largely ignored the treatment-specific and time-dependent nature of transcription regulation. RESULTS: In this study we introduce CERMT, Covariance based Extraction of Regulatory targets using Multiple Time series. Using simulated and real data we show that using multiple expression time series, selecting treatments in which the TF responds, allowing time shifts between TFs and their targets, and using covariance to identify highly responding genes appear to be a good strategy. We applied our method to published TF-target gene relationships determined using expression profiling on TF mutants and show that in most cases we obtain significant target gene enrichment, and in half of the cases this is sufficient to deliver a usable list of high-confidence target genes. CONCLUSION: CERMT could be immediately useful in refining possible target genes of candidate TFs using publicly available data, particularly for organisms lacking comprehensive TF binding data. In the future, we believe its incorporation with other forms of evidence may improve integrative genome-wide predictions of transcriptional networks.

    Homoeologs: What Are They and How Do We Infer Them?

    The evolutionary history of nearly all flowering plants includes a polyploidization event. Homologous genes resulting from allopolyploidy are commonly referred to as 'homoeologs', although this term has not always been used precisely or consistently in the literature. With several allopolyploid genome sequencing projects under way, there is a pressing need for computational methods for homoeology inference. Here we review the definition of homoeology in historical and modern contexts and propose a precise and testable definition highlighting the connection between homoeologs and orthologs. In the second part, we survey experimental and computational methods of homoeolog inference, considering the strengths and limitations of each approach. Establishing a precise and evolutionarily meaningful definition of homoeology is essential for understanding the evolutionary consequences of polyploidization.

    pcaMethods - a bioconductor package providing PCA methods for incomplete data

    pcaMethods is a Bioconductor-compliant library for computing principal component analysis (PCA) on incomplete data sets. The results can be analyzed directly or used to estimate missing values to enable the use of missing-value-sensitive statistical methods. The package was mainly developed with microarray and metabolite data sets in mind, but can be applied to any other incomplete data set as well.

    Metabolomic correlation-network modules in Arabidopsis based on a graph-clustering approach

    Background: Deciphering the metabolome is essential for a better understanding of the cellular metabolism as a system. Typical metabolomics data show a few but significant correlations among metabolite levels when data sampling is repeated across individuals grown under strictly controlled conditions. Although several studies have assessed topologies in metabolomic correlation networks, it remains unclear whether highly connected metabolites in these networks have specific functions in known tissue- and/or genotype-dependent biochemical pathways. Results: In our study of metabolite profiles we subjected root tissues to gas chromatography-time-of-flight/mass spectrometry (GC-TOF/MS) and used published information on the aerial parts of 3 Arabidopsis genotypes, Col-0 wild-type, methionine over-accumulation 1 (mto1), and transparent testa4 (tt4) to compare systematically the metabolomic correlations in samples of roots and aerial parts. We then applied graph clustering to the constructed correlation networks to extract densely connected metabolites and evaluated the clusters by biochemical-pathway enrichment analysis. We found that the number of significant correlations varied by tissue and genotype and that the obtained clusters were significantly enriched for metabolites included in biochemical pathways. Conclusions: We demonstrate that the graph-clustering approach identifies tissue- and/or genotype-dependent metabolomic clusters related to the biochemical pathway. Metabolomic correlations complement information about changes in mean metabolite levels and may help to elucidate the organization of metabolically functional modules.

    Prioritising candidate genes causing QTL using hierarchical orthologous groups.

    A key goal in plant biotechnology applications is the identification of genes associated with particular phenotypic traits (for example: yield, fruit size, root length). Quantitative Trait Loci (QTL) studies identify genomic regions associated with a trait of interest. However, to infer potential causal genes in these regions, each of which can contain hundreds of genes, these data are usually intersected with prior functional knowledge of the genes. This process is, however, laborious, particularly if the experiment is performed in a non-model species, and the statistical significance of the inferred candidates is typically unknown. This paper introduces QTLSearch, a method and software tool to search for candidate causal genes in QTL studies by combining Gene Ontology annotations across many species, leveraging hierarchical orthologous groups. The usefulness of this approach is demonstrated by re-analysing two metabolic QTL studies: one in Arabidopsis thaliana, the other in Oryza sativa subsp. indica. Even after controlling for statistical significance, QTLSearch inferred potential causal genes for more QTL than BLAST-based functional propagation against UniProtKB/Swiss-Prot, and for more QTL than in the original studies. QTLSearch is distributed under the LGPLv3 license. It is available to install from the Python Package Index (as qtlsearch), with the source available from https://bitbucket.org/alex-warwickvesztrocy/qtlsearch. Supplementary data are available at Bioinformatics online.

    An in-silico & in-vitro tournament for protein engineering


    The Chemical Translation Service - a web-based tool to improve standardization of metabolomic reports

    Summary: Metabolomic publications and databases use different database identifiers or even trivial names, which prevents queries across databases or between studies. The best way to annotate metabolites is by chemical structures, encoded by the International Chemical Identifier code (InChI) or InChIKey. We have implemented a web-based Chemical Translation Service that performs batch conversions of the most common compound identifiers, including CAS, CHEBI, compound formulas, Human Metabolome Database HMDB, InChI, InChIKey, IUPAC name, KEGG, LipidMaps, PubChem CID+SID, SMILES and chemical synonym names. Batch conversion downloads of 1410 CIDs are performed in 2.5 min. Structures are automatically displayed.

    Prioritising candidate genes causing QTL using hierarchical orthologous groups

    Motivation: A key goal in plant biotechnology applications is the identification of genes associated with particular phenotypic traits (for example: yield, fruit size, root length). Quantitative Trait Loci (QTL) studies identify genomic regions associated with a trait of interest. However, to infer potential causal genes in these regions, each of which can contain hundreds of genes, these data are usually intersected with prior functional knowledge of the genes. This process is, however, laborious, particularly if the experiment is performed in a non-model species, and the statistical significance of the inferred candidates is typically unknown. Results: This paper introduces QTLSearch, a method and software tool to search for candidate causal genes in QTL studies by combining Gene Ontology annotations across many species, leveraging hierarchical orthologous groups. The usefulness of this approach is demonstrated by re-analysing two metabolic QTL studies: one in Arabidopsis thaliana, the other in Oryza sativa subsp. indica. Even after controlling for statistical significance, QTLSearch inferred potential causal genes for more QTL than BLAST-based functional propagation against UniProtKB/Swiss-Prot, and for more QTL than in the original studies. Availability and implementation: QTLSearch is distributed under the LGPLv3 license. It is available to install from the Python Package Index (as qtlsearch), with the source available from https://bitbucket.org/alex-warwickvesztrocy/qtlsearch.

    Determining novel functions of Arabidopsis 14-3-3 proteins in central metabolic processes

    Background: 14-3-3 proteins are considered master regulators of many signal transduction cascades in eukaryotes. In plants, 14-3-3 proteins have major roles as regulators of nitrogen and carbon metabolism, conclusions based on the studies of a few specific 14-3-3 targets. Results: In this study, extensive novel roles of 14-3-3 proteins in plant metabolism were determined by combining parallel analyses of metabolites and enzyme activities in 14-3-3 overexpression and knockout plants with studies of protein-protein interactions. Decreases in the levels of sugars and nitrogen-containing compounds and in the activities of known 14-3-3-interacting enzymes were observed in 14-3-3 overexpression plants. Plants overexpressing 14-3-3 proteins also contained decreased levels of malate and citrate, which are intermediate compounds of the tricarboxylic acid (TCA) cycle. These modifications were related to the reduced activities of isocitrate dehydrogenase and malate dehydrogenase, which are key enzymes of the TCA cycle. In addition, we demonstrated that 14-3-3 proteins interacted with one isocitrate dehydrogenase and two malate dehydrogenases. There were also changes in the levels of aromatic compounds and the activities of shikimate dehydrogenase, which participates in the biosynthesis of aromatic compounds. Conclusion: Taken together, our findings indicate that 14-3-3 proteins play roles as crucial tuners of multiple primary metabolic processes, including the TCA cycle and the shikimate pathway.