Systematic identification of functional plant modules through the integration of complementary data sources
A major challenge is to unravel how genes interact and are regulated to exert specific biological functions. The integration of genome-wide functional genomics data, followed by the construction of gene networks, provides a powerful approach to identify functional gene modules. Large-scale expression data, functional gene annotations, experimental protein-protein interactions, and transcription factor-target interactions were integrated to delineate modules in Arabidopsis (Arabidopsis thaliana). The different experimental input data sets showed little overlap, demonstrating the advantage of combining multiple data types to study gene function and regulation. In the set of 1,563 modules covering 13,142 genes, most modules displayed strong coexpression, but functional and cis-regulatory coherence was less prevalent. Highly connected hub genes showed a significant enrichment toward embryo lethality and evidence for cross talk between different biological processes. Comparative analysis revealed that 58% of the modules showed conserved coexpression across multiple plants. Using module-based functional predictions, 5,562 genes were annotated, and an evaluation experiment disclosed that, based on 197 recently experimentally characterized genes, 38.1% of these functions could be inferred through the module context. Examples of confirmed genes of unknown function related to cell wall biogenesis, xylem and phloem pattern formation, cell cycle, hormone stimulus, and circadian rhythm highlight the potential to identify new gene functions. The module-based predictions offer new biological hypotheses for functionally unknown genes in Arabidopsis (1,701 genes) and six other plant species (43,621 genes). Furthermore, the inferred modules provide new insights into the conservation of coexpression and coregulation as well as a starting point for comparative functional annotation
MorphDB : prioritizing genes for specialized metabolism pathways and gene ontology categories in plants
Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest
Evidence for DNA-mediated nuclear compartmentalization distinct from phase separation.
RNA Polymerase II (Pol II) and transcription factors form concentrated hubs in cells via multivalent protein-protein interactions, often mediated by proteins with intrinsically disordered regions. During Herpes Simplex Virus infection, viral replication compartments (RCs) efficiently enrich host Pol II into membraneless domains, reminiscent of liquid-liquid phase separation. Despite sharing several properties with phase-separated condensates, we show that RCs operate via a distinct mechanism wherein unrestricted nonspecific protein-DNA interactions efficiently outcompete host chromatin, profoundly influencing the way DNA-binding proteins explore RCs. We find that the viral genome remains largely nucleosome-free, and this increase in accessibility allows Pol II and other DNA-binding proteins to repeatedly visit nearby DNA binding sites. This anisotropic behavior creates local accumulations of protein factors despite their unrestricted diffusion across RC boundaries. Our results reveal underappreciated consequences of nonspecific DNA binding in shaping gene activity, and suggest additional roles for chromatin in modulating nuclear function and organization
Identification of microRNAs and their targets in Finger millet by high throughput sequencing
MicroRNAs are short non-coding RNAs that play an important role in regulating gene expression by mRNA cleavage or translational repression. Most identified miRNAs are evolutionarily conserved, whereas others are expressed in a species-specific manner. Finger millet is an important cereal crop; nonetheless, no practical information is available on its microRNAs to date. In this study, we identified 95 conserved microRNAs belonging to 39 families and 3 novel microRNAs by high-throughput sequencing. A total of 507 targets were predicted for the identified conserved and novel miRNAs. Eleven miRNAs were validated, and their tissue specificity was determined, by stem-loop RT-qPCR and Northern blot. GO analyses revealed that the miRNA targets are involved in a wide range of regulatory functions. This study indicates that a large number of known and novel miRNAs are present in finger millet and may play important roles in growth and development. © 2015 Elsevier B.V.
Employing machine learning for reliable miRNA target identification in plants
Background: miRNAs are ~21-nucleotide-long small noncoding RNA molecules, formed endogenously in most eukaryotes, which mainly control their target genes post-transcriptionally by interacting with and silencing them. While many tools have been developed for animal miRNA target prediction, plant miRNA target identification has seen limited development. Most plant tools are centered on exact complementarity matching, and very few consider other factors such as multiple target sites and the role of flanking regions. Results: In the present work, a Support Vector Regression (SVR) approach has been implemented for plant miRNA target identification, utilizing position-specific dinucleotide density variation information around the target sites to yield highly reliable results. It has been named p-TAREF (plant-Target Refiner). p-TAREF was compared rigorously with other prediction tools for plants and was found to perform better in several aspects. Further, p-TAREF was run on experimentally validated miRNA targets from species such as Arabidopsis, Medicago, rice, and tomato and detected them accurately, suggesting broad usability of p-TAREF across plant species. Using p-TAREF, target identification was done for the complete rice transcriptome, supported by expression and degradome-based data. miR156 was found to be an important component of the rice regulatory system, where control of genes associated with growth and transcription appeared predominant. The entire methodology has been implemented in a multi-threaded parallel architecture in Java to enable fast processing in both the web-server and standalone versions. This also allows it to run in concurrent mode even on a simple desktop computer. In its web-server version, it also provides a facility to gather experimental support for predictions through on-the-spot expression data analysis. Conclusion: A machine learning tool using multivariate features has been implemented in parallel and locally installable form for plant miRNA target identification. Its performance was assessed and compared through comprehensive testing and benchmarking, suggesting reliable performance and broad usability for transcriptome-wide plant miRNA target identification.
The how and why of lncRNA function: An innate immune perspective.
Next-generation sequencing has provided a more complete picture of the composition of the human transcriptome indicating that much of the "blueprint" is a vastness of poorly understood non-protein-coding transcripts. This includes a newly identified class of genes called long noncoding RNAs (lncRNAs). The lack of sequence conservation for lncRNAs across species meant that their biological importance was initially met with some skepticism. LncRNAs mediate their functions through interactions with proteins, RNA, DNA, or a combination of these. Their functions can often be dictated by their localization, sequence, and/or secondary structure. Here we provide a review of the approaches typically adopted to study the complexity of these genes with an emphasis on recent discoveries within the innate immune field. Finally, we discuss the challenges, as well as the emergence of new technologies that will continue to move this field forward and provide greater insight into the biological importance of this class of genes. This article is part of a Special Issue entitled: ncRNA in control of gene expression edited by Kotb Abdelmohsen
FootprintDB: Analysis of plant cis-regulatory elements, transcription factors, and binding interfaces
28 pages, 8 figures. The definitive version is available at: http://link.springer.com/bookseries/7651 and http://link.springer.com/book/10.1007/978-1-4939-6396-6. FootprintDB is a database and search engine that compiles regulatory sequences from open access libraries of curated DNA cis-elements and motifs, and their associated transcription factors (TFs). It systematically annotates the binding interfaces of the TFs by exploiting protein-DNA complexes deposited in the Protein Data Bank. Each entry in footprintDB is thus a DNA motif linked to the protein sequence of the TF(s) known to recognize it, and in most cases, the set of predicted interface residues involved in specific recognition. This chapter explains step-by-step how to search for DNA motifs and protein sequences in footprintDB and how to focus the search on a particular organism. Two real-world examples are shown where this software was used to analyze transcriptional regulation in plants. Results are described with the aim of guiding users on their interpretation, and special attention is given to the choices users might face when performing similar analyses. This work was funded by grant Euroinvestigación EUI2008-03612 under the framework of the Transnational (Germany, France, Spain) Cooperation within the PLANT-KBBE Initiative. Peer reviewed
Conserved noncoding sequences highlight shared components of regulatory networks in dicotyledonous plants
Conserved noncoding sequences (CNSs) in DNA are reliable pointers to regulatory elements controlling gene expression. Using a comparative genomics approach with four dicotyledonous plant species (Arabidopsis thaliana, papaya [Carica papaya], poplar [Populus trichocarpa], and grape [Vitis vinifera]), we detected hundreds of CNSs upstream of Arabidopsis genes. Distinct positioning, length, and enrichment for transcription factor binding sites suggest these CNSs play a functional role in transcriptional regulation. The enrichment of transcription factors within the set of genes associated with CNS is consistent with the hypothesis that together they form part of a conserved transcriptional network whose function is to regulate other transcription factors and control development. We identified a set of promoters where regulatory mechanisms are likely to be shared between the model organism Arabidopsis and other dicots, providing areas of focus for further research
Bioinformatics in China: A Personal Perspective
Biochemical Research Methods; Mathematical & Computational Biology; SCI(E); PubMed; 3; EDITORIAL MATERIAL; 4; e1000020