41 research outputs found
miRGen: a database for the study of animal microRNA genomic organization and function
miRGen is an integrated database of (i) positional relationships between animal miRNAs and genomic annotation sets and (ii) animal miRNA targets according to combinations of widely used target prediction programs. A major goal of the database is the study of the relationship between miRNA genomic organization and miRNA function. This is made possible by three integrated and user friendly interfaces. The Genomics interface allows the user to explore where whole-genome collections of miRNAs are located with respect to UCSC genome browser annotation sets such as Known Genes, Refseq Genes, Genscan predicted genes, CpG islands and pseudogenes. These miRNAs are connected through the Targets interface to their experimentally supported target genes from TarBase, as well as computationally predicted target genes from optimized intersections and unions of several widely used mammalian target prediction programs. Finally, the Clusters interface provides predicted miRNA clusters at any given inter-miRNA distance and provides specific functional information on the targets of miRNAs within each cluster. All of these unique features of miRGen are designed to facilitate investigations into miRNA genomic organization, co-transcription and targeting. miRGen can be freely accessed at
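The idea of predicting miRNA clusters "at any given inter-miRNA distance" can be illustrated with a minimal sketch. This is not miRGen's implementation: the function name `cluster_mirnas`, the tuple layout, and the choice to cluster only miRNAs on the same chromosome and strand are assumptions made for illustration.

```python
from collections import defaultdict


def cluster_mirnas(mirnas, max_distance):
    """Group miRNAs into clusters by genomic proximity.

    mirnas: iterable of (name, chrom, strand, start, end) tuples (hypothetical layout).
    Consecutive miRNAs on the same chromosome and strand join the same cluster
    when the gap between them is at most max_distance bases.
    """
    by_locus = defaultdict(list)
    for name, chrom, strand, start, end in mirnas:
        by_locus[(chrom, strand)].append((start, end, name))

    clusters = []
    for key in sorted(by_locus):
        current, current_end = [], None
        for start, end, name in sorted(by_locus[key]):
            # Start a new cluster when the gap to the previous miRNA is too large.
            if current and start - current_end > max_distance:
                clusters.append(current)
                current, current_end = [], None
            current.append(name)
            current_end = end if current_end is None else max(current_end, end)
        if current:
            clusters.append(current)
    return clusters


# Made-up coordinates, for illustration only.
example = [
    ("mir-a", "chr1", "+", 100, 180),
    ("mir-b", "chr1", "+", 400, 480),
    ("mir-c", "chr1", "+", 10000, 10080),
    ("mir-d", "chr1", "-", 450, 530),
    ("mir-e", "chr2", "+", 120, 200),
]
print(cluster_mirnas(example, max_distance=1000))
```

Changing `max_distance` changes the grouping, which mirrors the user-selectable distance described in the abstract.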
Sustained-input switches for transcription factors and microRNAs are central building blocks of eukaryotic gene circuits
WaRSwap is a randomization algorithm that for the first time provides a practical network motif discovery method for large multi-layer networks, for example those that include transcription factors, microRNAs, and non-regulatory protein-coding genes. The algorithm is applicable to systems with tens of thousands of genes, while accounting for critical aspects of biological networks, including self-loops, large hubs, and target rearrangements. We validate WaRSwap on a newly inferred regulatory network from Arabidopsis thaliana, and compare outcomes on published Drosophila and human networks. Specifically, sustained-input switches are among the few over-represented circuits across this diverse set of eukaryotes. Keywords: gene regulation, transcription factor, microRNA, network moti
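Motif discovery of this kind compares motif counts in the real network against counts in randomized networks that keep each node's in-degree and out-degree. The sketch below shows the classic edge-switching null model, not WaRSwap itself, whose sampling scheme and treatment of self-loops and hubs are not described here. The function name `switch_randomize` and the example edges are hypothetical.

```python
import random


def switch_randomize(edges, n_swaps, seed=0):
    """Randomize a directed network by repeated edge switching.

    edges: list of distinct (source, target) pairs.
    Each accepted swap turns (a, b), (c, d) into (a, d), (c, b), which keeps
    every node's out-degree and in-degree unchanged.
    """
    rng = random.Random(seed)
    edges = list(edges)
    edge_set = set(edges)
    if len(edge_set) != len(edges):
        raise ValueError("duplicate edges are not supported in this sketch")
    if len(edges) < 2:
        return edges
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Reject swaps that would duplicate an edge or create a self-loop.
        if new1 in edge_set or new2 in edge_set or a == d or c == b:
            continue
        edge_set.difference_update([edges[i], edges[j]])
        edge_set.update([new1, new2])
        edges[i], edges[j] = new1, new2
    return edges


# Made-up regulatory edges (regulator, target), for illustration only.
network = [
    ("TF1", "g1"), ("TF1", "g2"), ("TF2", "g2"),
    ("TF2", "g3"), ("miR1", "g1"), ("miR1", "TF2"),
]
randomized = switch_randomize(network, n_swaps=100, seed=1)
print(randomized)
```

Counting a circuit of interest in many such randomized networks gives a background distribution against which the count in the real network can be compared.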
TIPR: Transcription Initiation Pattern Recognition on a Genome Scale
MOTIVATION: The computational identification of gene transcription start sites (TSSs) can provide insights into the regulation and function of genes without performing expensive experiments, particularly in organisms with incomplete annotations. High-resolution general-purpose TSS prediction remains a challenging problem, with little recent progress on the identification and differentiation of TSSs which are arranged in different spatial patterns along the chromosome.
RESULTS: In this work, we present the Transcription Initiation Pattern Recognizer (TIPR), a sequence-based machine learning model that identifies TSSs with high accuracy and resolution for multiple spatial distribution patterns along the genome, including broadly distributed TSS patterns that have previously been difficult to characterize. TIPR predicts not only the locations of TSSs but also the expected spatial initiation pattern each TSS will form along the chromosome, a novel capability for TSS prediction algorithms. As spatial initiation patterns are associated with spatiotemporal expression patterns and gene function, this capability has the potential to improve gene annotations and our understanding of the regulation of transcription initiation. The high nucleotide resolution of this model locates TSSs within 10 nucleotides or less on average.
AVAILABILITY AND IMPLEMENTATION: Model source code is made available online at http://megraw.cgrb.oregonstate.edu/software/TIPR/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
This is an author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by the author(s) and published by Oxford University Press. It can be found at: http://bioinformatics.oxfordjournals.org/
Isoform Specific Gene Auto-Regulation via miRNAs: A Case Study on miR-128b and ARPP-21
In this study, we investigate whether miRNAs located within "host" protein-coding genes may regulate the expression of their host genes. We find that 43 of 174 miRNAs encoded within RefSeq genes are predicted to target their host genes. Statistical analysis of this phenomenon suggests that gene auto-regulation via miRNAs may be under positive selective pressure. Our analysis also indicates that several of the 43 miRNAs have a much lower expectation of targeting their host genes by chance than others. Among these examples, we identify miR-128b:ARPP-21 (cyclic AMP-regulated phosphoprotein, 21 kD) as a case in which both the miRNA and the target site are also evolutionarily conserved. We provide experimental support for this miRNA:target interaction via reporter silencing assays, and present evidence that this isoform-specific gene auto-regulation has been preserved in vertebrate species in order to prevent detrimental consequences of ARPP-21 over-expression in brain.
miRGen 2.0: a database of microRNA genomic information and regulation
MicroRNAs are small, non-protein-coding RNA molecules known to regulate the expression of genes by binding to the 3′ UTR of mRNAs. MicroRNAs are produced from longer transcripts, which can code for more than one mature miRNA. miRGen 2.0 is a database that aims to provide comprehensive information about the position of human and mouse microRNA coding transcripts and their regulation by transcription factors, including a unique compilation of both predicted and experimentally supported data. Expression profiles of microRNAs in several tissues and cell lines, single nucleotide polymorphism locations, microRNA target prediction on protein-coding genes and mapping of miRNA targets of co-regulated miRNAs on biological pathways are also integrated into the database and user interface. The miRGen database will be continuously maintained and freely available at http://www.microrna.gr/mirgen/
Paired-End Analysis of Transcription Start Sites in Arabidopsis Reveals Plant-Specific Promoter Signatures
Understanding plant gene promoter architecture has long been a challenge due to the lack of relevant large-scale data sets and analysis methods. Here we present a publicly available, large-scale transcription start site (TSS) dataset in plants using a high-resolution method for analysis of 5′ ends of mRNA transcripts. Our dataset is produced using the Paired-End Analysis of Transcription Start Sites (PEAT) protocol, providing millions of TSS locations from wild-type Col-0 Arabidopsis whole root samples. Using this dataset, we grouped TSS reads into "TSS tag clusters" and categorized clusters into three spatial initiation patterns: narrow peak, broad with peak, and weak peak. We then designed a machine learning model that predicts the presence of TSS tag clusters with outstanding sensitivity and specificity for all three initiation patterns. We used this model to analyze the transcription factor binding site content of promoters exhibiting these initiation patterns. In contrast to the canonical notions of TATA-containing and more broad "TATA-less" promoters, the model shows that, in plants, the vast majority of transcription start sites are TATA-free, and are defined by a large compendium of known DNA sequence binding elements. We present results on the usage of these elements, and provide our Plant PEAT Peaks (3PEAT) model that predicts the presence of TSSs directly from sequence. Keywords: Transcription factor, Start site, Arabidopsis, MicroRNA, Gene regulatio
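Grouping TSS reads into tag clusters can be sketched as a simple gap-based grouping of read 5′ positions. This is an illustration only and not the PEAT or 3PEAT procedure: the function name `tss_tag_clusters`, the `max_gap` and `min_reads` parameters, and their default values are assumptions, and the input is taken to be positions from a single chromosome and strand. The sketch reports a width and peak fraction per cluster but does not assign the three initiation patterns, since the criteria for those are not given here.

```python
from collections import Counter


def tss_tag_clusters(positions, max_gap=10, min_reads=2):
    """Group read 5' positions into clusters and summarize each one.

    positions: read 5' end coordinates on one chromosome and strand.
    Neighbouring occupied positions more than max_gap apart start a new cluster;
    clusters with fewer than min_reads reads are dropped.
    """
    counts = Counter(positions)
    clusters, current = [], []
    for pos in sorted(counts):
        if current and pos - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)

    summaries = []
    for sites in clusters:
        total = sum(counts[p] for p in sites)
        if total < min_reads:
            continue
        # Peak is the position with the most reads; ties go to the leftmost position.
        peak = max(sites, key=lambda p: (counts[p], -p))
        summaries.append({
            "start": sites[0],
            "end": sites[-1],
            "width": sites[-1] - sites[0] + 1,
            "reads": total,
            "peak": peak,
            "peak_fraction": counts[peak] / total,
        })
    return summaries


# Made-up read positions, for illustration only.
reads = [100, 100, 100, 101, 103, 250, 400, 405, 410, 410]
for summary in tss_tag_clusters(reads):
    print(summary)
```

Features such as `width` and `peak_fraction` are the kind of summary that could separate a narrow, sharply peaked cluster from a broad one, under whatever thresholds a given analysis chooses.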
The cyclophilin A DIAGEOTROPICA gene affects auxin transport in both root and shoot to control lateral root formation
Cyclophilin A is a conserved peptidyl-prolyl cis-trans isomerase (PPIase) best known as the cellular receptor of the immunosuppressant cyclosporine A. Despite significant effort, evidence of developmental functions of cyclophilin A in non-plant systems has remained obscure. Mutations in a tomato (Solanum lycopersicum) cyclophilin A ortholog, DIAGEOTROPICA (DGT), have been shown to abolish the organogenesis of lateral roots; however, a mechanistic explanation of the phenotype is lacking. Here, we show that the dgt mutant lacks auxin maxima relevant to priming and specification of lateral root founder cells. DGT is expressed in shoot and root, and localizes to both the nucleus and cytoplasm during lateral root organogenesis. Mutation of ENTIRE/IAA9, a member of the auxin-responsive Aux/IAA protein family of transcriptional repressors, partially restores the inability of dgt to initiate lateral root primordia but not the primordia outgrowth. By comparison, grafting of a wild-type scion restores the process of lateral root formation, consistent with participation of a mobile signal. Antibodies do not detect movement of the DGT protein into the dgt rootstock; however, experiments with radiolabeled auxin and an auxin-specific microelectrode demonstrate abnormal auxin fluxes. Functional studies of DGT in heterologous yeast and tobacco-leaf auxin-transport systems demonstrate that DGT negatively regulates PIN-FORMED (PIN) auxin efflux transporters by affecting their plasma membrane localization. Studies in tomato support complex effects of the dgt mutation on PIN expression level, expression domain and plasma membrane localization. Our data demonstrate that DGT regulates auxin transport in lateral root formation
The next generation of training for Arabidopsis researchers: Bioinformatics and Quantitative Biology
It has been more than 50 years since Arabidopsis (Arabidopsis thaliana) was first introduced as a model organism to understand basic processes in plant biology. A well-organized scientific community has used this small reference plant species to make numerous fundamental plant biology discoveries (Provart et al., 2016). Due to an extremely well-annotated genome and advances in high-throughput sequencing, our understanding of this organism and other plant species has become even more intricate and complex. Computational resources, including CyVerse, Araport, The Arabidopsis Information Resource (TAIR), and BAR, have further facilitated novel findings with just the click of a mouse. As we move toward understanding biological systems, Arabidopsis researchers will need to use more quantitative and computational approaches to extract novel biological findings from these data. Here, we discuss guidelines, skill sets, and core competencies that should be considered when developing curricula or training undergraduate or graduate students, postdoctoral researchers, and faculty. A selected case study provides more specificity as to the concrete issues plant biologists face and how best to address such challenges.
Arabidopsis bioinformatics resources: the current state, challenges, and priorities for the future
Effective research, education, and outreach efforts by the Arabidopsis thaliana community, as well as other scientific communities that depend on Arabidopsis resources, depend vitally on easily available and publicly shared resources. These resources include reference genome sequence data and an ever-increasing number of diverse data sets and data types. TAIR (The Arabidopsis Information Resource) and Araport (originally named the Arabidopsis Information Portal) are community informatics resources that provide tools, data, and applications to the more than 30,000 researchers worldwide who use either Arabidopsis as a primary system of study or data derived from Arabidopsis in their work. Four years after Araport's establishment, the IAIC held another workshop to evaluate the current status of Arabidopsis Informatics and chart a course for future research and development. The workshop focused on several challenges, including the need for reliable and current annotation, community-defined common standards for data and metadata, and accessible and user-friendly repositories / tools / methods for data integration and visualization. Solutions envisioned included (1) a centralized annotation authority to coalesce annotation from new groups, establish a consistent naming scheme, distribute this format regularly and frequently, and encourage and enforce its adoption. (2) Standards for data and metadata formats, which are essential, but challenging when comparing across diverse genotypes and in areas with less-established standards (e.g. phenomics, metabolomics). Community-established guidelines need to be developed. (3) A searchable, central repository for analysis and visualization tools. Improved versioning and user access would make tools more accessible. Workshop participants proposed a "one-stop shop" website, an Arabidopsis "Super-Portal" to link tools, data resources, programmatic standards, and best practice descriptions for each data type.
This must have community buy-in and participation in its establishment and development to encourage adoption
Regulation of microRNA transcription
Gene regulation is now widely accepted to happen both at the transcriptional and the post-transcriptional level, and microRNAs (miRNAs) have emerged as critical modulators of gene expression in plants and animals. How do these transcriptional and post-transcriptional mechanisms interact to define robust regulatory networks? A major barrier to answering this question has historically been a lack of knowledge about the regulation of miRNAs themselves. In this dissertation we seek to characterize the promoters of miRNAs which are transcribed by RNA polymerase II, and to understand the similarities and differences between miRNA and protein-coding gene regulation. We begin by building a method to identify miRNA promoter elements using known transcription factor binding motifs, and apply this technique to perform a comparative analysis of miRNA and protein-coding gene promoters in Arabidopsis. We report motifs associated with five transcription factors which show evidence of over-representation in miRNA promoter regions relative to the promoter regions of protein-coding genes. In addition to the core promoter element TATA-box, these factors relate to regulation by plant hormones and plant responsiveness to drought. A similar comparison in animal organisms is impeded by the small number of currently known animal miRNA transcription start sites, and we therefore develop methods to address this problem. We create a toolset for the construction of animal miRNA promoter region tiling arrays, and design a transcription start site detection algorithm for non-coding primary transcripts of mammalian miRNAs. Finally, we investigate a special case of gene regulation both by and of intronic miRNAs. Since computational and experimental evidence increasingly supports the idea that intronic miRNAs are transcribed along with their host genes, we investigate whether such miRNAs may regulate the expression of the host genes in which they are located.
Through this analysis we identify an intronic miRNA which may be involved in neurodegenerative disease, and provide experimental evidence that this miRNA may suppress protein production of the very gene along with which it is transcribed