5,822 research outputs found

    The contribution of Alu exons to the human proteome.

    Background: Alu elements are major contributors to lineage-specific new exons in primate and human genomes. Recent studies indicate that some Alu exons have high transcript inclusion levels or tissue-specific splicing profiles, and may play important regulatory roles in modulating mRNA degradation or translational efficiency. However, the contribution of Alu exons to the human proteome remains unclear and controversial. The prevailing view is that exons derived from young repetitive elements, such as Alu elements, are restricted to regulatory functions and have not had adequate evolutionary time to be incorporated into stable, functional proteins. Results: We adopt a proteotranscriptomics approach to systematically assess the contribution of Alu exons to the human proteome. Using RNA sequencing, ribosome profiling, and proteomics data from human tissues and cell lines, we provide evidence for the translational activities of Alu exons and the presence of Alu exon derived peptides in human proteins. These Alu exon peptides represent species-specific protein differences between primates and other mammals, and in certain instances between humans and closely related primates. In the case of the RNA editing enzyme ADARB1, which contains an Alu exon peptide in its catalytic domain, RNA sequencing analyses of A-to-I editing demonstrate that both the Alu exon skipping and inclusion isoforms encode active enzymes. The Alu exon derived peptide may fine tune the overall editing activity and, in limited cases, the site selectivity of ADARB1 protein products. Conclusions: Our data indicate that Alu elements have contributed to the acquisition of novel protein sequences during primate and human evolution.

    Parallel Genetics of Gene Regulatory Sequences in Caenorhabditis elegans

    How regulatory sequences control gene expression is fundamental for explaining phenotypes in health and disease. The function of regulatory sequences must ultimately be understood within their genomic environment and development- or tissue-specific contexts. Because this is technically challenging, few regulatory elements have been characterized in vivo. Here, we use inducible Cas9 and multiplexed guide RNAs to create hundreds of mutations in enhancers/promoters and 3′ UTRs of 16 genes in C. elegans. We quantify the impact of mutations on expression and physiology by targeted RNA sequencing and DNA sampling. When applying our approach to the lin-41 3′ UTR, generating hundreds of mutants, we find that the two adjacent binding sites for the miRNA let-7 can regulate lin-41 expression largely independently of each other, with indications of a compensatory interaction. Finally, we map regulatory genotypes to phenotypic traits for several genes. Our approach enables parallel analysis of gene regulatory sequences directly in animals.

    Structural Variation Discovery and Genotyping from Whole Genome Sequencing: Methodology and Applications: A Dissertation

    A comprehensive understanding of how genetic variants and mutations contribute to phenotypic variation requires experimental technologies and analytical methodologies that can detect genetic variants and mutations from various biological samples in a timely and accurate manner. High-throughput sequencing technology represents the latest achievement in a series of efforts to facilitate genetic variant discovery and genotyping, and promises to transform the way we tackle healthcare and biomedical problems. The tremendous amount of data generated by this new technology, however, needs to be processed and analyzed accurately and efficiently in order to fully harness its potential. Structural variation (SV) encompasses a wide range of genetic variations of different sizes, generated by diverse mechanisms. Due to the technical difficulties of reliably detecting SVs, their characterization lags behind that of SNPs and indels. In this dissertation I presented two novel computational methods: one for detecting transposable element (TE) transpositions and the other for detecting SVs in general using a local assembly approach. Both methods are able to pinpoint breakpoint junctions at single-nucleotide resolution and estimate variant allele frequencies in the sample. I also applied those methods to study the impact of TE transpositions on genomic stability, the inheritance patterns of TE insertions in the population, and the molecular mechanisms and potential functional consequences of somatic SVs in cancer genomes.

    RECOMBINATION HOTSPOTS IN SOYBEAN [GLYCINE MAX (L.) MERR.]

    Recombination allows for the exchange of genetic material between two parents, which plant breeders exploit to make new and improved varieties. Recombination is not distributed evenly across the chromosome. In crops, it mostly occurs in the euchromatic regions of the genome, and even there it is concentrated in recombination hotspots flanked by recombination cold spots. Understanding the distribution of these hotspots, along with the sequence motifs associated with them, may lead to methods that enable breeders to better exploit recombination in breeding. Chapter 1 outlines background information on recombination, methods for detecting recombination hotspots, the landscape of recombination (recombination patterns along the genome), and the environmental influence on recombination hotspot locations. In chapter 2, recombination hotspots were mapped in two biparental soybean [Glycine max (L.) Merr.] recombinant inbred line (RIL) populations, Williams crossed by Essex (WE) and Williams 82 crossed by PI479752 (WP). These populations consist of 922 (WE) and 1,086 (WP) RILs and were genotyped with 50,000 SNP markers using the SoySNP50k Illumina Infinium assay. Chapter 3 reports the location of recombination hotspots in three populations from the USDA Soybean Germplasm Collection: wild (806), landraces (5,396), and North American cultivars (563). Genotyping was conducted using the SoySNP50k Illumina Infinium assay. Germplasm hotspot locations were compared to the results from the two biparental RIL populations in chapter 2. In chapters 2 and 3, statistical tests based on logistic regression were conducted for the association of genome features with hotspot locations, and nucleotide motifs surrounding hotspot regions across the genome were discovered. Advisor: David L. Hyte

    Pervasive isoform-specific translational regulation via alternative transcription start sites in mammals

    Transcription initiated at alternative sites can produce mRNA isoforms with different 5'UTRs, which are potentially subjected to differential translational regulation. However, the prevalence of such isoform-specific translational control across mammalian genomes is currently unknown. By combining polysome profiling with high-throughput mRNA 5' end sequencing, we directly measured the translational status of mRNA isoforms with distinct start sites. Among 9,951 genes expressed in mouse fibroblasts, we identified 4,153 that showed significant initiation at multiple sites, of which 745 genes exhibited significant isoform-divergent translation. Systematic analyses of the isoform-specific translation revealed that isoforms with longer 5'UTRs tended to translate less efficiently. Further investigation of cis-elements within 5'UTRs not only provided novel insights into the regulation by known sequence features, but also led to the discovery of novel regulatory sequence motifs. Quantitative models integrating all these features explained over half of the variance in the observed isoform-divergent translation. Overall, our study demonstrated extensive translational regulation through the use of alternative transcription start sites and offered a comprehensive understanding of translational regulation by diverse sequence features embedded in 5'UTRs.

    A Novel Mutation of the NARROW LEAF 1 Gene Adversely Affects Plant Architecture in Rice (Oryza sativa L.)

    Plant architecture is critical for enhancing the adaptability and productivity of crop plants. Mutants with an altered plant architecture allow researchers to elucidate the genetic network and the underlying mechanisms. In this study, we characterized a novel nal1 rice mutant with short height, small panicles, and narrow, thick, deep green leaves that was identified from a cross between a rice cultivar and a weedy rice accession. Bulked segregant analysis coupled with genome re-sequencing and cosegregation analysis revealed that the overall mutant phenotype was caused by a 1395-bp deletion spanning the last two exons, including the transcriptional end site, of the nal1 gene. This deletion resulted in chimeric transcripts involving nal1 and the adjacent gene, which were validated by a reference-guided assembly of transcripts followed by PCR amplification. A comparative transcriptome analysis of the mutant and the wild-type rice revealed 263 differentially expressed genes involved in cell division, cell expansion, photosynthesis, reproduction, and gibberellin (GA) and brassinosteroid (BR) signaling pathways, suggesting an important regulatory role for nal1. Our study indicated that nal1 controls plant architecture through the regulation of genes involved in the photosynthetic apparatus, cell cycle, and GA and BR signaling pathways.

    Genomic characterization of Italian and European pig populations

    Thanks to the genomic revolution, we can today take advantage of molecular and bioinformatic tools for dissecting phenotypic traits and genetic differences among modern commercial pig breeds, local populations and wild boars. This is important for European autochthonous and endangered pig breeds, whose genetic architecture remains uncharacterized and whose breeding potential remains unexploited. This project aimed to investigate genomic features of autochthonous pig breeds, focusing on candidate gene markers associated with disease resistance, coat colour and vertebral number, and on genes involved in feeding preferences. First, we used a genotyping approach to define the distribution of disease resistance marker alleles in Italian local pig populations, confirming the robustness of local pig breeds. Results from the association study between the investigated disease resistance markers and production traits suggested that it could be possible to introduce disease resistance traits in pig breeding programs without affecting productivity. Regarding the relationship between local pig populations and wild boars, we monitored the allelic distribution at two evolutionarily important loci involved in coat colour and vertebral number determination. Results suggested that the Sus scrofa genome is experiencing bidirectional introgression of wild and domestic alleles, with autochthonous breeds undergoing a “de-domestication” process and wild resources challenged by a “domestication” drift. In the last part of this project we evaluated the genetic variability of taste receptor genes across European pig populations. We performed a SNP discovery study to identify similarities and differences in the taste sensing system among local breeds. Taste perception is connected to diet and environment, and comparing these genes between pig breeds makes it possible to reconstruct the history of the breeds and the impact of ecology on their biodiversity. Our results can be considered a basis for the use of genetic variability among local pig populations and for further studies regarding their characterization.

    Poly(A) Tail Regulation in the Nucleus

    RNA metabolism involves different steps, from transcription to translation and decay of messenger RNAs (mRNAs). Most mRNAs have a poly(A) tail attached to their 3’-end, which protects them from degradation and stimulates translation. Removal of the poly(A) tail is the rate-limiting step in RNA decay, controlling stability and translation. It is as yet unclear if and to what extent RNA deadenylation occurs in the mammalian nucleus. A novel method for genome-wide determination of poly(A) tail length, termed FLAM-Seq, was developed to overcome current challenges in sequencing mRNAs, enabling genome-wide analysis of complete RNAs, including their poly(A) tail sequence. FLAM-Seq analysis of different model systems uncovered a strong correlation between poly(A) tail length and 3’-UTR length or alternative polyadenylation. Cytosine nucleotides were further significantly enriched in poly(A) tails. Analyzing poly(A) tails of unspliced RNAs from FLAM-Seq data revealed the genome-wide synthesis of poly(A) tails with a length of more than 200 nt. This could be validated by splicing inhibition experiments, which uncovered potential links between the completion of splicing and poly(A) tail shortening. Measuring RNA deadenylation kinetics using metabolic labeling experiments hinted at a rapid shortening of tails within minutes. The analysis of subcellular fractions obtained from HeLa cells and a mouse brain showed that initial deadenylation is a nuclear process. Nuclear deadenylation is gene specific, and poly(A) tails of lncRNAs retained in the nucleus were not shortened. To identify enzymes responsible for nuclear deadenylation, RNA-targeting Cas systems, siRNAs and shRNA cell lines were used to knock down different deadenylase complexes. Despite efficient mRNA knockdown, subcellular analysis of poly(A) tail length did not yield molecular phenotypes of changing nuclear poly(A) tail length.