10 research outputs found

    Identification and large-scale analysis of retrogenes in animal genomes

    Faculty of Biology. In my PhD thesis I focused on the large-scale identification of retrocopies in animal genomes and the analysis of their potential functionality, as well as on the evolutionary events affecting the retrocopy repertoire in human genomes. In the first publication I present RetrogeneDB, a new database containing retrocopy annotations. Unlike previous similar databases, such as HOPPSIGEN or RCPedia, RetrogeneDB is not limited to selected model organisms and contains retrocopy predictions for 62 animal genomes downloaded from the Ensembl database (release 73). RetrogeneDB is currently available at http://retrogenedb.amu.edu.pl and contains 84 808 predicted retrocopies in total, 64 225 of which are not annotated in Ensembl. The second publication forming the basis of my PhD thesis describes the analysis of inter-population differences in retrocopy repertoire and expression in the human genome. In contrast to previous studies, mostly focused on the detection of novel retroposition events, my primary goal was to detect the loss of ancestral, functional retrocopies (retrogenes) in different human populations. This work was supported financially by: 1. the National Science Centre (Narodowe Centrum Nauki; grant 2013/09/N/NZ2/01221 to M.K.); 2. KNOW Poznańskie Konsorcjum RN

    Multiple FGF4 retrocopies recently derived within canids

    Two transcribed retrocopies of the fibroblast growth factor 4 (FGF4) gene have previously been described in the domestic dog. An FGF4 retrocopy on chr18 is associated with disproportionate dwarfism, while an FGF4 retrocopy on chr12 is associated with both disproportionate dwarfism and intervertebral disc disease (IVDD). In this study, whole-genome sequencing data were queried to identify other FGF4 retrocopies that could be contributing to phenotypic diversity in canids. Additionally, dogs with surgically confirmed IVDD were assayed for novel FGF4 retrocopies. Five additional and distinct FGF4 retrocopies were identified in canids, including a copy unique to red wolves (Canis rufus). The FGF4 retrocopies identified in domestic dogs were identical to domestic dog FGF4 haplotypes, which are distinct from modern wolf FGF4 haplotypes, indicating that these retrotransposition events likely occurred after domestication. The identification of multiple full-length FGF4 retrocopies with open reading frames in canids indicates that gene retrotransposition events occur much more frequently than previously thought and provide a mechanism for continued genetic and phenotypic diversity in canids.

    Network analysis of pseudogene-gene relationships: from pseudogene evolution to their functional potentials

    Pseudogenes are fossil relatives of genes. Pseudogenes have long been thought of as "junk DNA", since they do not code for proteins in normal tissues. Although most human pseudogenes do not have noticeable functions, ∼20% of them exhibit transcriptional activity. There is evidence that some pseudogenes have adopted functions as lncRNAs and work as regulators of gene expression. Furthermore, pseudogenes can even be "reactivated" in some conditions, such as cancer initiation. Some pseudogenes are transcribed in specific cancer types, and some are even translated into proteins, as observed in several cancer cell lines. Together, these findings show that pseudogenes could have functional roles or potential in the genome. Evaluating the relationships between pseudogenes and their gene counterparts could help us reveal the evolutionary path of pseudogenes and associate pseudogenes with functional potential. It also provides insight into the regulatory networks involving pseudogenes with transcriptional and even translational activity. In this study, we develop a novel approach integrating graph analysis, sequence alignment and functional analysis to evaluate pseudogene-gene relationships, and apply it to human gene homologs and pseudogenes. We generated a comprehensive set of 445 pseudogene-gene (PGG) families from the original 3,281 gene families (13.56%). Of these, 438 (98.4% of PGG families, 13.3% of all families) were non-trivial (containing more than one pseudogene). Each PGG family contains multiple genes and pseudogenes with high sequence similarity. For each family, we generate a sequence alignment network and phylogenetic trees recapitulating the evolutionary paths. We find evidence supporting the evolutionary history of the olfactory family (both genes and pseudogenes) in human, which also supports the validity of our analysis method. Next, we evaluate these networks with respect to the gene ontology, from which we identify functions enriched in these pseudogene-gene families and infer the functional impact of pseudogenes involved in the networks. This demonstrates the application of our PGG network database in the study of pseudogene function in a disease context.

    mRNA Vaccines: Why Is the Biology of Retroposition Ignored?

    The major advantage of mRNA vaccines over more conventional approaches is their potential for rapid development and large-scale deployment in pandemic situations. In the current COVID-19 crisis, two mRNA COVID-19 vaccines have been conditionally approved and broadly applied, while others are still in clinical trials. However, there is no previous experience with the use of mRNA vaccines on a large scale in the general population. This warrants a careful evaluation of mRNA vaccine safety properties by considering all available knowledge about mRNA molecular biology and evolution. Here, I discuss the pervasive claim that mRNA-based vaccines cannot alter genomes. Surprisingly, this notion is widely stated in the mRNA vaccine literature but never supported by referencing any primary scientific papers that would specifically address this question. This discrepancy becomes even more puzzling if one considers previous work on the molecular and evolutionary aspects of retroposition in murine and human populations that clearly documents the frequent integration of mRNA molecules into genomes, including in clinical contexts. By performing basic comparisons, I show that the sequence features of mRNA vaccines meet all known requirements for retroposition using L1 elements, the most abundant autonomously active retrotransposons in the human genome. In fact, many factors associated with mRNA vaccines increase the possibility of their L1-mediated retroposition. I conclude that it is unfounded to assume a priori that mRNA-based therapeutics do not impact genomes and that the route to genome integration of vaccine mRNAs via endogenous L1 retroelements is easily conceivable. This implies that we urgently need experimental studies that would rigorously test for the potential retroposition of vaccine mRNAs. At present, the insertional mutagenesis safety of mRNA-based vaccines should be considered unresolved.

    Contribution of retrotransposition to developmental disorders.

    Mobile genetic elements (MEs) are segments of DNA which can copy themselves and other transcribed sequences through the process of retrotransposition (RT). In humans, several disorders have been attributed to RT, but the role of RT in severe developmental disorders (DD) has not yet been explored. Here we identify RT-derived events in 9738 exome-sequenced trios with DD-affected probands. We ascertain 9 de novo MEs, 4 of which are likely causative of the patient's symptoms (0.04%), as well as 2 de novo gene retroduplications. Beyond identifying likely diagnostic RT events, we estimate the genome-wide germline ME mutation rate and selective constraint and demonstrate that coding RT events have signatures of purifying selection equivalent to those of truncating mutations. Overall, our analysis represents a comprehensive interrogation of the impact of retrotransposition on protein-coding genes and a framework for future evolutionary and disease studies.

    Landscape and variation of novel retroduplications in 26 human populations

    Retroduplications come from reverse transcription of mRNAs and their insertion back into the genome. Here, we performed comprehensive discovery and analysis of retroduplications in a large cohort of 2,535 individuals from 26 human populations, as part of 1000 Genomes Phase 3. We developed an integrated approach to discover novel retroduplications combining high-coverage exome and low-coverage whole-genome sequencing data, utilizing information from both exon-exon junctions and discordant paired-end reads. We found 503 parent genes having novel retroduplications absent from the reference genome. Based solely on retroduplication variation, we built phylogenetic trees of human populations; these represent superpopulation structure well and indicate that variable retroduplications are effective population markers. We further identified 43 retroduplication parent genes differentiating superpopulations. This group contains several interesting insertion events, including a SLMO2 retroduplication and insertion into CAV3, which has a potential disease association. We also found retroduplications to be associated with a variety of genomic features: (1) Insertion sites were correlated with regular nucleosome positioning. (2) They, predictably, tend to avoid conserved functional regions, such as exons, but, somewhat surprisingly, also avoid introns. (3) Retroduplications tend to be co-inserted with young L1 elements, indicating recent retrotranspositional activity, and (4) they have a weak tendency to originate from highly expressed parent genes. Our investigation provides insight into the functional impact and association with genomic elements of retroduplications. We anticipate our approach and analytical methodology to have application in a more clinical context, where exome sequencing data is abundant and the discovery of retroduplications can potentially improve the accuracy of SNP calling.