1,444 research outputs found

    Learning the Regulatory Code of Gene Expression

    Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states, and mRNA and protein levels. Deep neural networks automatically learn informative sequence representations, and interpreting them improves our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then on specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology.

    The non-coding genome in Autism Spectrum Disorders

    Autism Spectrum Disorders (ASD) are a group of neurodevelopmental disorders (NDDs) characterized by difficulties in social interaction and communication, repetitive behavior, and restricted interests. While ASD have been proven to have a strong genetic component, current research largely focuses on coding regions of the genome. However, non-coding DNA, which makes up ∼99% of the human genome, has recently been recognized as an important contributor to the high heritability of ASD, and novel sequencing technologies have been a milestone in opening up new directions for the study of the gene regulatory networks embedded within the non-coding regions. Here, we summarize current progress on the contribution of non-coding alterations to the pathogenesis of ASD and provide an overview of existing methods for studying their functional relevance, discussing potential ways of unraveling ASD's “missing heritability”.

    In silico Analysis of the Entire P. glaucum Genome Identifies Regulatory Genes of the bZIP Family Modulated in Response Pathways to Water Stress

    The literature reviewed characterizes P. glaucum as a cereal with high nutritional quality and high tolerance to drought stress. However, very little is known about the fine mechanism it uses in response to water stress. To clarify this point, we analyzed the modulation of expression of regulatory genes of the bZIP transcription factor (TF) family. A whole-genome screen of P. glaucum identified 52 putative bZIP TFs, of which 9 (PgbZIP) were differentially expressed under water stress conditions, as filtered from RNA-seq data from a transcriptome deposited at the NCBI. The promoter regions of these genes contained multiple cis-acting elements, including ABRE and DRE motifs, suggesting that they participate both in the slow, adaptive response and in the rapid response of this cereal to water stress. The findings of this study provide complementary data for understanding the mechanism behind the adaptation of P. glaucum to water stress, and may be relevant for molecular applications in potential crops.
    Affiliation: Garay Farías, Laura Beatriz. Universidad Federal de la Integración Latinoamericana; Brazil
    Affiliation: Litwiñiuk, Sergio Leandro. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Nordeste. Instituto de Biología Subtropical. Instituto de Biología Subtropical - Nodo Posadas | Universidad Nacional de Misiones. Instituto de Biología Subtropical. Instituto de Biología Subtropical - Nodo Posadas; Argentina
    Affiliation: Rojas, Cristian Antonio. Universidad Federal de la Integración Latinoamericana; Brazil

    Categorization of 77 dystrophin exons into 5 groups by a decision tree using indexes of splicing regulatory factors as decision markers

    Background: Duchenne muscular dystrophy (DMD), a fatal muscle-wasting disease, is characterized by dystrophin deficiency caused by mutations in the dystrophin gene. Skipping of a target dystrophin exon during splicing with antisense oligonucleotides (AOs) is attracting much attention as the most plausible way to express dystrophin in DMD. Antisense oligonucleotides have been designed against splicing regulatory sequences such as splicing enhancer sequences of target exons. Recently, we reported that a chemical kinase inhibitor specifically enhances the skipping of mutated dystrophin exon 31, indicating the existence of exon-specific splicing regulatory systems. However, the basis for such individual regulatory systems is largely unknown. Here, we categorized the dystrophin exons in terms of their splicing regulatory factors.
    Results: Using a computer-based machine learning system, we first constructed a decision tree separating 77 authentic from 14 known cryptic exons using 25 indexes of splicing regulatory factors as decision markers. We evaluated the classification accuracy of a novel cryptic exon (exon 11a) identified in this study. However, the tree mislabeled exon 11a as a true exon. Therefore, we re-constructed the decision tree to separate all 15 cryptic exons. The revised decision tree categorized the 77 authentic exons into five groups. Furthermore, all nine disease-associated novel exons were successfully categorized as exons, validating the decision tree. One group, consisting of 30 exons, was characterized by a high density of exonic splicing enhancer sequences. This suggests that AOs targeting splicing enhancer sequences would efficiently induce skipping of exons belonging to this group.
    Conclusions: The decision tree categorized the 77 authentic exons into five groups. Our classification may help to establish the strategy for exon skipping therapy for Duchenne muscular dystrophy.