Mining Functional Elements in Messenger RNAs: Overview, Challenges, and Perspectives

Author: Ahmed Firoz
Benedito Vagner A.
Zhao Patrick Xuechun
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2011

Eukaryotic messenger RNA (mRNA) contains not only protein-coding regions but also a plethora of functional cis-elements that influence or coordinate a number of regulatory aspects of gene expression, such as mRNA stability, splicing forms, and translation rates. Understanding the rules that apply to each of these element types (e.g., whether the element is defined by primary or higher-order structure) allows for the discovery of novel mechanisms of gene expression as well as the design of transcripts with controlled expression. Bioinformatics plays a major role in creating databases and finding non-evident patterns governing each type of eukaryotic functional element. Much of what we currently know about mRNA regulatory elements in eukaryotes is derived from microorganism and animal systems, with the particularities of plant systems lagging behind. In this review, we provide a general introduction to the best-known eukaryotic mRNA regulatory motifs (splicing regulatory elements, internal ribosome entry sites, iron-responsive elements, AU-rich elements, zipcodes, and polyadenylation signals) and describe available bioinformatics resources (databases and analysis tools) to analyze eukaryotic transcripts in search of functional elements, focusing on recent trends in bioinformatics methods and tool development. We also discuss future directions in the development of better computational tools based upon current knowledge of these functional elements. Improved computational tools would advance our understanding of the processes underlying gene regulation. We encourage plant bioinformaticians to turn their attention to this subject to help identify novel mechanisms of gene expression regulation using RNA motifs that have potentially evolved or diverged in plant species.

Crossref

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector

The Research Repository @ WVU (West Virginia University)

Learning the Regulatory Code of Gene Expression

Author: Buric Filip
Garcia Victor
Kokina Mariia
Zelezniak Aleksej
Zrimec Jan
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2021

Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states, as well as mRNA and protein levels. Deep neural networks automatically learn informative sequence representations, and interpreting them enables us to improve our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology.

PubMed Central

Chalmers Research

ZHAW digitalcollection

Online Research Database In Technology

Perturbation analysis analyzed—mathematical modeling of intact and perturbed gene regulatory circuits for animal development

Author: Ben-Tabou de-Leon Smadar
Publication venue: 'Elsevier BV'
Publication date: 15/08/2010

Gene regulatory networks for animal development are the underlying mechanisms controlling cell fate specification and differentiation. The architecture of gene regulatory circuits determines their information processing properties and their developmental function. It is a major task to derive realistic network models from exceedingly advanced high throughput experimental data. Here we use mathematical modeling to study the dynamics of gene regulatory circuits to advance the ability to infer regulatory connections and logic function from experimental data. This study is guided by experimental methodologies that are commonly used to study gene regulatory networks that control cell fate specification. We study the effect of a perturbation of an input on the level of its downstream genes and compare the cis-regulatory execution of OR and AND logics. Circuits that initiate gene activation and circuits that lock on the expression of genes are analyzed. The model improves our ability to analyze experimental data and to reconstruct the network topology from it. The model also illuminates information processing properties of gene regulatory circuits for animal development.

Elsevier - Publisher Connector

Caltech Authors

Differences in transcription between free-living and CO_2-activated third-stage larvae of Haemonchus contortus

Author: Aleman-Meza Boanerges
Campbell Bronwyn E.
Cantacessi Cinzia
Gasser Robin B.
Hall Ross S.
Jex Aaron R.
Loukas Alex
Presidente Paul J. A.
Sternberg Paul W.
Young Neil D.
Zawadzki Jodi L.
Zhong Weiwei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Background: The disease caused by Haemonchus contortus, a blood-feeding nematode of small ruminants, is of major economic importance worldwide. The infective third-stage larva (L3) of this gastric nematode is enclosed in a cuticle (sheath) and, once ingested with herbage by the host, undergoes an exsheathment process that marks the transition from the free-living (L3) to the parasitic (xL3) stage. This study explored changes in gene transcription associated with this transition and predicted, based on comparative analysis, functional roles for key transcripts in the metabolic pathways linked to larval development. Results: Totals of 101,305 (L3) and 105,553 (xL3) expressed sequence tags (ESTs) were determined using 454 sequencing technology, and then assembled and annotated; the most abundant transcripts encoded transthyretin-like, calcium-binding EF-hand, NAD(P)-binding and nucleotide-binding proteins as well as homologues of Ancylostoma-secreted proteins (ASPs). Using an in silico-subtractive analysis, 560 and 685 sequences were shown to be uniquely represented in the L3 and xL3 stages, respectively; the transcripts encoded ribosomal proteins, collagens and elongation factors (in L3), and mainly peptidases and other enzymes of amino acid catabolism (in xL3). Caenorhabditis elegans orthologues of transcripts that were uniquely transcribed in each L3 and xL3 were predicted to interact with a total of 535 other genes, all of which were involved in embryonic development. Conclusion: The present study indicated that some key transcriptional alterations taking place during the transition from the L3 to the xL3 stage of H. contortus involve genes predicted to be linked to the development of neuronal tissue (L3 and xL3), formation of the cuticle (L3) and digestion of host haemoglobin (xL3). 
Future efforts using next-generation sequencing and bioinformatic technologies should provide the efficiency and depth of coverage required for the determination of the complete transcriptomes of different developmental stages and/or tissues of H. contortus as well as the genome of this important parasitic nematode. Such advances should lead to a significantly improved understanding of the molecular biology of H. contortus and, from an applied perspective, to novel methods of intervention.

ResearchOnline@JCU

Crossref

Springer - Publisher Connector

ResearchOnline at James Cook University

Directory of Open Access Journals

PubMed Central

Caltech Authors

DSpace at Rice University

University of Melbourne Institutional Repository

Bioinformatic analyses of mammalian 5'-UTR sequence properties of mRNAs predicts alternative translation initiation sites

Author: Drudge Thomas M
Hook Vivian
Valafar Faramarz
Wegrzyn Jill L
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Utilization of alternative initiation sites for protein translation directed by non-AUG codons in mammalian mRNAs is observed with increasing frequency. Alternative initiation sites are utilized for the synthesis of important regulatory proteins that control distinct biological functions. It is, therefore, of high significance to define the parameters that allow accurate bioinformatic prediction of alternative translation initiation sites (aTIS). This study has investigated 5'-UTR regions of mRNAs to define consensus sequence properties and structural features that allow identification of alternative initiation sites for protein translation. Results: Bioinformatic evaluation of 5'-UTR sequences of mammalian mRNAs was conducted for classification and identification of alternative translation initiation sites for a group of mRNA sequences that have been experimentally demonstrated to utilize alternative non-AUG initiation sites for protein translation. These are represented by the codons CUG, GUG, UUG, AUA, and ACG for aTIS. The first phase of this bioinformatic analysis implements a classification tree that evaluates 5'-UTRs for unique consensus sequence features near the initiation codon, characteristics of 5'-UTR nucleotide sequences, and secondary structural features, and categorizes mRNAs into those with potential aTIS and those without. The second phase addresses identification of the aTIS codon and its location. Critical parameters of 5'-UTRs were assessed by an Artificial Neural Network (ANN) for identification of the aTIS codon and its location. ANNs have previously been used for the purpose of AUG start site prediction and are applicable in complex. 
ANN analyses demonstrated that multiple properties were required for predicting aTIS codons; these properties included unique consensus nucleotide sequences at positions -7 and -6 combined with positions -3 and +4, 5'-UTR length, ORF length, predicted secondary structures, free energy features, upstream AUGs, and G/C ratio. Importantly, combined results of the classification tree and the ANN analyses provided highly accurate bioinformatic predictions of alternative translation initiation sites. Conclusion: This study has defined the unique properties of 5'-UTR sequences of mRNAs for successful bioinformatic prediction of alternative initiation sites utilized in protein translation. The ability to define aTIS through the described bioinformatic analyses can be of high importance for genomic analyses to provide full predictions of translated mammalian and human gene products required for cellular functions in health and disease.
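
As a rough illustration of the raw input such a predictor starts from, the sketch below lists every occurrence of the five non-AUG codons named in the abstract within a 5'-UTR string, together with the bases at positions -3 and +4 relative to each candidate. It is not the authors' classification tree or ANN. The function name `candidate_atis`, the tuple layout, and the 0-based indexing are assumptions made for this example only.

```python
# Non-AUG initiation codons named in the abstract (RNA alphabet).
ATIS_CODONS = ("CUG", "GUG", "UUG", "AUA", "ACG")


def candidate_atis(utr5: str):
    """List candidate non-AUG start codons in a 5'-UTR sequence.

    Returns tuples (offset, codon, base_at_minus3, base_at_plus4), where
    offset is the 0-based index of the codon's first base (position +1).
    Context bases are None when they fall outside the given string.
    Hypothetical helper for illustration; it only enumerates candidates
    and does not score or classify them.
    """
    seq = utr5.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        if codon in ATIS_CODONS:
            # Position -3 is three bases upstream of the codon's first base;
            # position +4 is the base immediately after the codon.
            minus3 = seq[i - 3] if i >= 3 else None
            plus4 = seq[i + 3] if i + 3 < len(seq) else None
            hits.append((i, codon, minus3, plus4))
    return hits


if __name__ == "__main__":
    for hit in candidate_atis("GCCACCCTGG"):
        print(hit)
```

Every candidate is reported, in any reading frame; features such as 5'-UTR length, secondary structure, or free energy, which the abstract lists as necessary for accurate prediction, are deliberately left out.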

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Gene Regulatory Compatibility in Bacteria: Consequences for Synthetic Biology and Evolution

Author: Johns Nathan Isaac
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2019

Mechanistic understanding of gene regulation is crucial for rational engineering of new genetic systems through synthetic biology. Genetic engineering efforts in new organisms are often hampered by a lack of knowledge about how regulatory components function in new host contexts. This dissertation focuses on efforts to overcome these challenges through the development of generalizable experimental methods for studying the behavior of DNA regulatory sequences in diverse species at large scale. Chapter 2 describes experimental approaches for quantitatively assessing the functions of thousands of diverse natural regulatory sequences through a combination of metagenomic mining, high-throughput DNA synthesis and deep sequencing. By employing these methods in three distinct bacterial species, we revealed striking functional differences in gene regulatory capacity. We identified regulatory sequences with activity levels spanning several orders of magnitude, which will aid in efforts to engineer diverse bacterial species. We also demonstrate functional species-selective gene circuits with programmable host behaviors that may be useful for microbial community engineering. In Chapter 3 we provide evidence for the evolution of altered stringency in σ70-mediated transcriptional activation based on patterns of initiation and activity from promoters of diverse compositions. We show that the contrast in GC content between a regulatory element and the host genome dictates both the likelihood and the magnitude of expression. We also discuss the potential implications of this proposed mechanism on horizontal gene transfer. The next two chapters focus on efforts aimed at extending the high-throughput methods described in earlier chapters to new organisms. Chapter 4 presents an in vitro approach for multiplexed gene expression profiling. 
Through the development and use of cell-free expression systems made from diverse bacteria, it was possible to rapidly acquire thousands of transcriptional measurements in small-volume reactions, enabling comparisons of regulatory sequence function across multiple species. In Chapter 5 we characterize the restriction-modification system repertoires of several commensal bacterial species. We also describe ongoing efforts to develop methods for bypassing these systems in order to increase transformation efficiencies in species that are difficult or impossible to transform using current approaches.

Columbia University Academic Commons

Improvement in the prediction of the translation initiation site through balancing methods, inclusion of acquired knowledge and addition of features to sequences of mRNA

Author: de Souza Teixeira Felipe Carvalho
Nobre Cristiane Neri
Ortega José Miguel
Silva Lívia Márcia
Zárate Luis Enrique
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The accurate prediction of the initiation of translation in sequences of mRNA is an important activity for genome annotation. However, obtaining an accurate prediction is not always a simple task and can be modeled as a problem of classification between positive sequences (protein codifiers) and negative sequences (non-codifiers). The problem is highly imbalanced because each molecule of mRNA has a unique translation initiation site and various others that are not initiators. Therefore, this study focuses on the problem from the perspective of balancing classes and we present an undersampling balancing method, M-clus, which is based on clustering. The method also adds features to sequences and improves the performance of the classifier through the inclusion of knowledge obtained by the model, called InAKnow. Results: Through this methodology, the measures of performance used (accuracy, sensitivity, specificity and adjusted accuracy) are greater than 93% for the Mus musculus and Rattus norvegicus organisms, and varied between 72.97% and 97.43% for the other organisms evaluated: Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Homo sapiens, Nasonia vitripennis. The precision increases significantly by 39% and 22.9% for Mus musculus and Rattus norvegicus, respectively, when the knowledge obtained by the model is included. For the other organisms, the precision increases by between 37.10% and 59.49%. The inclusion of certain features during training, for example, the presence of ATG in the upstream region of the Translation Initiation Site, improves the rate of sensitivity by approximately 7%. Using the M-clus balancing method generates a significant increase in the rate of sensitivity from 51.39% to 91.55% (Mus musculus) and from 47.45% to 88.09% (Rattus norvegicus). 
Conclusions: In order to solve the problem of TIS prediction, the results indicate that the methodology proposed in this work is adequate, particularly when using the concept of acquired knowledge which increased the accuracy in all databases evaluated.
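
The abstract does not describe how M-clus works internally, so the following is only a generic sketch of clustering-based undersampling, not the authors' method. It assumes the majority class is clustered with plain k-means into as many clusters as samples to keep, and that the one member closest to each centroid is retained. All names (`kmeans`, `undersample_majority`, `n_keep`) are hypothetical, and samples are assumed to be numeric feature tuples.

```python
import random


def _dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns (centroids, cluster index for each point)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct sample positions.
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: _dist2(p, centroids[c]))
                  for p in points]
        # Update step: move each non-empty cluster's centroid to its mean.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, assign


def undersample_majority(majority, n_keep, seed=0):
    """Keep at most n_keep majority samples, one per k-means cluster.

    Generic cluster-based undersampling (an assumption for illustration,
    not the published M-clus procedure). Empty clusters contribute nothing,
    so fewer than n_keep samples may be returned.
    """
    centroids, assign = kmeans(majority, n_keep, seed=seed)
    kept = []
    for c, centre in enumerate(centroids):
        members = [p for p, a in zip(majority, assign) if a == c]
        if members:
            kept.append(min(members, key=lambda p: _dist2(p, centre)))
    return kept


if __name__ == "__main__":
    negatives = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(undersample_majority(negatives, n_keep=2))
```

In a TIS setting the majority class would be the non-initiating candidate sites, and `n_keep` would typically be set to the number of true initiation sites so that the two classes end up balanced. Choosing one representative per cluster, rather than sampling at random, is meant to preserve the spread of the negative examples.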

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central