32 research outputs found

    A comprehensive collection of annotations to interpret sequence variation in human mitochondrial transfer RNAs

    Background: The abundance of biological data characterizing the genomics era is contributing to a comprehensive understanding of human mitochondrial genetics. Nevertheless, many aspects are still unclear, specifically the variability of the 22 human mitochondrial transfer RNA (tRNA) genes and their involvement in diseases. Because enriching and isolating tRNAs in vitro is complex, knowledge of their post-transcriptional modifications and three-dimensional folding, both essential for correct tRNA functioning, remains incomplete. An accurate annotation of mitochondrial tRNA variants would be useful to mitochondrial researchers and clinicians, since most bioinformatics tools for variant annotation and prioritization available so far cannot shed light on the functional role of tRNA variations. Results: To this aim, we updated our MToolBox pipeline for mitochondrial DNA analysis of high-throughput and Sanger sequencing data by integrating tRNA variant annotations in order to identify and characterize relevant variants not only in protein-coding regions, but also in tRNA genes. The annotation step in the pipeline now provides detailed information for variants mapping onto the 22 mitochondrial tRNAs. For each mt-tRNA position along the entire genome, the relative tRNA numbering, tRNA type, cloverleaf secondary domains (loops and stems), mature nucleotide and interactions in the three-dimensional folding are reported. Moreover, pathogenicity predictions for tRNA and rRNA variants were retrieved from the literature and integrated within the annotations provided by MToolBox, both in the stand-alone version and in the web-based tool at the Mitochondrial Disease Sequence Data Resource (MSeqDR) website. All the information available in the annotation step of MToolBox was exploited to generate custom tracks, which can be displayed in the GBrowse instance at the MSeqDR website.
Conclusions: To the best of our knowledge, specific data regarding mitochondrial variants in tRNA genes were introduced for the first time in a tool for mitochondrial genome analysis, supporting the interpretation of genetic variants in specific genomic contexts.

    New Insights Into Mitochondrial DNA Reconstruction and Variant Detection in Ancient Samples.

    Ancient DNA (aDNA) studies frequently focus on the analysis of mitochondrial DNA (mtDNA), which is much more abundant than the nuclear genome and can therefore be retrieved more readily from ancient remains. However, postmortem DNA damage and contamination make the data analysis difficult because of DNA fragmentation and nucleotide alterations. In this regard, the assessment of the heteroplasmic fraction in ancient mtDNA has always been considered an unachievable goal due to the complexity of distinguishing true endogenous variants from artifacts. We implemented a computational pipeline for mtDNA analysis and applied it to a dataset of 30 ancient human samples from an Iron Age necropolis in Polizzello (Sicily, Italy). The pipeline includes several modules from well-established tools for aDNA analysis and a recently released variant caller, specifically conceived for mtDNA and applied here for the first time to aDNA data. Through fine-tuned filtering on variant allele sequencing features, we were able to accurately reconstruct nearly complete (>88%) mtDNA genomes for almost all the analyzed samples (27 out of 30), depending on the degree of preservation and the sequencing throughput, and to obtain a reliable set of variants allowing haplogroup prediction. Additionally, we provide guidelines for dealing with possible artifact sources, including nuclear mitochondrial sequence (NumtS) contamination, an often-neglected issue in ancient mtDNA surveys. Potential heteroplasmy levels were also estimated, although most variants were likely homoplasmic, and validated by data simulations, proving that new sequencing technologies and software are sensitive enough to detect partially mutated sites in ancient genomes and discriminate true variants from artifacts. A thorough functional annotation of detected and filtered mtDNA variants was also performed for a comprehensive evaluation of these ancient samples.
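As a rough illustration of the heteroplasmic fraction and read-based filtering mentioned in this abstract, the sketch below computes the fraction of reads supporting a variant allele and applies simple thresholds. The function names and all threshold values (`min_depth`, `min_alt_reads`, `min_hf`) are hypothetical and chosen only for illustration; the abstract does not state which sequencing features or cutoffs the authors used.

```python
def heteroplasmic_fraction(alt_reads: int, depth: int) -> float:
    """Fraction of reads at a site that support the variant allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must be between 0 and depth")
    return alt_reads / depth


def keep_variant(
    alt_reads: int,
    depth: int,
    min_depth: int = 10,      # hypothetical cutoff, not from the paper
    min_alt_reads: int = 3,   # hypothetical cutoff, not from the paper
    min_hf: float = 0.05,     # hypothetical cutoff, not from the paper
) -> bool:
    """Return True if the site passes the illustrative depth and fraction filters."""
    # Reject poorly covered sites and variants seen in too few reads.
    if depth < min_depth or alt_reads < min_alt_reads:
        return False
    # Reject variants whose supporting fraction is below the minimum.
    return heteroplasmic_fraction(alt_reads, depth) >= min_hf
```

In practice, damage-aware filters (for example, on read position or strand) would matter for ancient DNA; they are left out here because the abstract does not describe them.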

    Life history and ancestry of the late Upper Palaeolithic infant from Grotta delle Mura, Italy

    The biological aspects of infancy within late Upper Palaeolithic populations and the role of southern refugia at the end of the Last Glacial Maximum are not yet fully understood. This study presents a multidisciplinary, high temporal resolution investigation of an Upper Palaeolithic infant from Grotta delle Mura (Apulia, southern Italy) combining palaeogenomics, dental palaeohistology, spatially-resolved geochemical analyses, direct radiocarbon dating, and traditional anthropological studies. The skeletal remains of the infant – Le Mura 1 – were directly dated to 17,320-16,910 cal BP. The results portray a biological history of the infant’s development, early life, health and death (estimated at ~72 weeks). They identify several phenotypic traits and a potential congenital disease in the infant, the mother’s low mobility during gestation, and a high level of endogamy. Furthermore, the genomic data indicate an early spread of the Villabruna-like components along the Italian peninsula, confirming a population turnover around the time of the Last Glacial Maximum, and highlighting a general reduction in genetic variability from northern to southern Italy. Overall, Le Mura 1 contributes to a better understanding of the early stages of life and the genetic puzzle in the Italian peninsula at the end of the Last Glacial Maximum. © The Author(s) 2024

    Building a Portuguese Coalition for Biodiversity Genomics

    The diverse physiography of the Portuguese land and marine territory, spanning from continental Europe to the Atlantic archipelagos, has made it an important repository of biodiversity throughout the Pleistocene glacial cycles, leading to a remarkable diversity of species and ecosystems. This rich biodiversity is under threat from anthropogenic drivers, such as climate change, invasive species, land use changes, overexploitation or pathogen (re)emergence. The inventory, characterization and study of biodiversity at inter- and intra-specific levels using genomics is crucial to promote its preservation and recovery by informing biodiversity conservation policies, management measures and research. The participation of researchers from Portuguese institutions in the European Reference Genome Atlas (ERGA) initiative, and its pilot effort to generate reference genomes for European biodiversity, has reinforced the establishment of Biogenome Portugal. This nascent institutional network will connect the national community of researchers in genomics. Here, we describe the Portuguese contribution to ERGA’s pilot effort, which will generate high-quality reference genomes of six species from Portugal that are endemic, iconic and/or endangered, and include plants, insects and vertebrates (fish, birds and mammals) from mainland Portugal or the Azores islands. In addition, we outline the objectives of Biogenome Portugal, which aims to (i) promote scientific collaboration, (ii) contribute to advanced training, (iii) stimulate the participation of institutions and researchers based in Portugal in international biodiversity genomics initiatives, and (iv) contribute to the transfer of knowledge to stakeholders and engage the public to preserve biodiversity. This initiative will strengthen biodiversity genomics research in Portugal and fuel the genomic inventory of Portuguese eukaryotic species.
Such efforts will be critical to the conservation of the country’s rich biodiversity and will contribute to ERGA’s goal of generating reference genomes for European species.

    The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics

    A genomic database of all Earth’s eukaryotic species could contribute to many scientific discoveries; however, only a tiny fraction of species have genomic information available. In 2018, scientists across the world united under the Earth BioGenome Project (EBP), aiming to produce a database of high-quality reference genomes covering all ~1.5 million recognized eukaryotic species. As the European node of the EBP, the European Reference Genome Atlas (ERGA) sought to implement a new decentralised, equitable and inclusive model for producing reference genomes. For this, ERGA launched a Pilot Project establishing the first distributed reference genome production infrastructure and testing it on 98 eukaryotic species from 33 European countries. Here we outline the infrastructure and explore its effectiveness for scaling high-quality reference genome production, whilst considering equity and inclusion. The outcomes and lessons learned provide a solid foundation for ERGA while offering key learnings to other transnational and national genomic resource projects and the EBP.

    The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics.

    A global genome database of all of Earth’s species diversity could be a treasure trove of scientific discoveries. However, despite major advances in genome sequencing technologies, only a tiny fraction of species have genomic information available. To contribute to a more complete planetary genomic database, scientists and institutions across the world have united under the Earth BioGenome Project (EBP), which plans to sequence and assemble high-quality reference genomes for all ∼1.5 million recognized eukaryotic species through a stepwise phased approach. As the initiative transitions into Phase II, where 150,000 species are to be sequenced in just four years, worldwide participation in the project will be fundamental to success. As the European node of the EBP, the European Reference Genome Atlas (ERGA) seeks to implement a new decentralised, accessible, equitable and inclusive model for producing high-quality reference genomes, which will inform EBP as it scales. To embark on this mission, ERGA launched a Pilot Project to establish a network across Europe to develop and test the first infrastructure of its kind for coordinated and distributed reference genome production on 98 European eukaryotic species from sample providers across 33 European countries. Here we outline the process and challenges faced during the development of a pilot infrastructure for the production of reference genome resources, and explore the effectiveness of this approach in terms of high-quality reference genome production, considering also equity and inclusion. The outcomes and lessons learned during this pilot provide a solid foundation for ERGA while offering key learnings to other transnational and national genomic resource projects.

    Elucidating the editome: bioinformatics approaches for RNA editing detection

    RNA editing is a widespread co/posttranscriptional mechanism that affects primary RNAs through specific nucleotide modifications and plays relevant roles in molecular processes including the regulation of gene expression and/or the processing of noncoding RNAs. In recent years, the detection of editing sites has been improved through the availability of high-throughput RNA sequencing (RNA-Seq) technologies. Accurate bioinformatics pipelines are essential for the analysis of next-generation sequencing (NGS) data to ensure the correct identification of edited sites. Several pipelines, using various read mappers and variant callers with a wide range of adjustable parameters, are available for the detection of RNA editing events. In this review, we discuss some of the most recent and popular tools and provide guidelines for RNA-Seq data generation and analysis for the detection of RNA editing in massive transcriptome data. Using simulated and real data sets, we provide an overview of their behavior, emphasizing that RNA editing detection in NGS data sets remains a challenging task.

    Investigating Human Mitochondrial Genomes in Single Cells

    Mitochondria host multiple copies of their own small circular genome, which has been extensively studied to trace the evolution of the modern eukaryotic cell and to discover important mutations linked to inherited diseases. Whole genome and exome sequencing have enabled the study of mtDNA in a large number of samples and experimental conditions at single nucleotide resolution, allowing the deciphering of the relationship between inherited mutations and phenotypes and the identification of acquired mtDNA mutations in classical mitochondrial diseases as well as in chronic disorders, ageing and cancer. By applying an ad hoc computational pipeline based on our MToolBox software, we reconstructed mtDNA genomes in single cells using whole genome and exome sequencing data obtained by different amplification methodologies (eWGA, DOP-PCR, MALBAC, MDA) as well as data from single cell Assay for Transposase Accessible Chromatin with high-throughput sequencing (scATAC-seq), in which mtDNA sequences are expected as a byproduct of the technology. We show that assembled mtDNAs, with the exception of those reconstructed by MALBAC and DOP-PCR methods, are quite uniform and suitable for genomic investigations, enabling the study of various biological processes related to cellular heterogeneity such as tumor evolution, neural somatic mosaicism and embryonic development.

    A multi-parametric workflow for the prioritization of mitochondrial DNA variants of clinical interest

    Assigning a pathogenic role to mitochondrial DNA (mtDNA) variants and unveiling the potential involvement of the mitochondrial genome in diseases are challenging tasks in human medicine. Assuming that rare variants are more likely to be damaging, we designed a phylogeny-based prioritization workflow to obtain a reliable pool of candidate variants for further investigations. The prioritization workflow relies on an exhaustive functional annotation through the mtDNA extraction pipeline MToolBox and includes Macro Haplogroup Consensus Sequences to filter out fixed evolutionary variants and report rare or private variants, the nucleotide variability as reported in HmtDB, and the disease score based on several predictors of pathogenicity for non-synonymous variants. Cutoffs for both the disease score and the nucleotide variability index were established with the aim of discriminating sequence variants contributing to defective phenotypes. The workflow was validated on mitochondrial sequences from individuals affected by Leber's Hereditary Optic Neuropathy, successfully identifying 23 variants including the majority of the known causative ones. Applying the prioritization workflow to cancer datasets made it possible to trim down the number of candidates for subsequent functional analyses, unveiling among these a high percentage of somatic variants. Prioritization criteria were implemented in both the standalone ( http://sourceforge.net/projects/mtoolbox/ ) and web ( https://mseqdr.org/mtoolbox.php ) versions of MToolBox.
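The three filters described in this abstract (haplogroup consensus, nucleotide variability, disease score) can be sketched as a simple sequential filter. This is a minimal illustration only, not the MToolBox implementation: the `Variant` class, the `prioritize` function, and both cutoff values are hypothetical, since the abstract does not report the actual thresholds or data structures.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple

# Hypothetical cutoffs for illustration; the abstract does not give the real values.
DISEASE_SCORE_CUTOFF = 0.5
VARIABILITY_CUTOFF = 0.01


@dataclass(frozen=True)
class Variant:
    position: int                   # mtDNA position
    allele: str                     # variant allele
    disease_score: Optional[float]  # None when no score is available
    nt_variability: float           # nucleotide variability of the site


def prioritize(
    variants: Iterable[Variant],
    haplogroup_consensus: Set[Tuple[int, str]],
    disease_cutoff: float = DISEASE_SCORE_CUTOFF,
    variability_cutoff: float = VARIABILITY_CUTOFF,
) -> List[Variant]:
    """Keep variants that are not fixed in the haplogroup consensus,
    sit at low-variability sites, and have a high disease score."""
    kept = []
    for v in variants:
        # 1. Drop alleles fixed in the macro-haplogroup consensus.
        if (v.position, v.allele) in haplogroup_consensus:
            continue
        # 2. Drop variants at highly variable (likely tolerated) sites.
        if v.nt_variability > variability_cutoff:
            continue
        # 3. Drop variants with no score or a score below the cutoff.
        if v.disease_score is None or v.disease_score < disease_cutoff:
            continue
        kept.append(v)
    return kept
```

Applying the cheap set-membership filter first keeps the sketch easy to follow; the order does not change the result, because a variant must pass all three checks to be retained.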