23 research outputs found

    Reconstruction of full-length 16S rRNA sequences for taxonomic assignment in metagenomics

    National audience. Advances in the sequencing of uncultured environmental samples raise a growing need for accurate taxonomic assignment. Accurate identification of organisms present within a community is essential to understanding even the most elementary ecosystems. However, current high-throughput sequencing technologies generate short reads that only partially cover full-length marker genes, which poses difficult bioinformatic challenges for high-resolution taxonomic identification. We designed MATAM, a software tool dedicated to the fast and accurate targeted assembly of short reads sequenced from a genomic marker of interest. The method implements a stepwise process based on the construction and analysis of a read overlap graph. It is applied to the assembly of 16S rRNA markers and is validated on simulated, synthetic and genuine metagenomes. We show that MATAM outperforms other available methods in terms of low error rates and recovered genome fractions, and is suitable for providing improved assemblies for precise taxonomic assignments.

    New insights into the genetic etiology of Alzheimer's disease and related dementias

    Characterization of the genetic landscape of Alzheimer's disease (AD) and related dementias (ADD) provides a unique opportunity for a better understanding of the associated pathophysiological processes. We performed a two-stage genome-wide association study totaling 111,326 clinically diagnosed/'proxy' AD cases and 677,663 controls. We found 75 risk loci, of which 42 were new at the time of analysis. Pathway enrichment analyses confirmed the involvement of amyloid/tau pathways and highlighted microglia implication. Gene prioritization in the new loci identified 31 genes that were suggestive of new genetically associated processes, including the tumor necrosis factor alpha pathway through the linear ubiquitin chain assembly complex. We also built a new genetic risk score associated with the risk of future AD/dementia or progression from mild cognitive impairment to AD/dementia. The improvement in prediction led to a 1.6- to 1.9-fold increase in AD risk from the lowest to the highest decile, in addition to effects of age and the APOE ε4 allele.

    Multiancestry analysis of the HLA locus in Alzheimer’s and Parkinson’s diseases uncovers a shared adaptive immune response mediated by HLA-DRB1*04 subtypes

    Across multiancestry groups, we analyzed Human Leukocyte Antigen (HLA) associations in over 176,000 individuals with Parkinson’s disease (PD) and Alzheimer’s disease (AD) versus controls. We demonstrate that the two diseases share the same protective association at the HLA locus. HLA-specific fine-mapping showed that hierarchical protective effects of HLA-DRB1*04 subtypes best accounted for the association, strongest with HLA-DRB1*04:04 and HLA-DRB1*04:07, and intermediary with HLA-DRB1*04:01 and HLA-DRB1*04:03. The same signal was associated with decreased neurofibrillary tangles in postmortem brains, with reduced tau levels in cerebrospinal fluid and, to a lesser extent, with increased Aβ42. Protective HLA-DRB1*04 subtypes strongly bound the aggregation-prone tau PHF6 sequence, but only when it was acetylated at a lysine (K311), a common posttranslational modification central to tau aggregation. An HLA-DRB1*04-mediated adaptive immune response decreases PD and AD risks, potentially by acting against tau, offering the possibility of therapeutic avenues.

    Algorithms for conserved markers sequences reconstruction in metagenomics data

    Recent advances in DNA sequencing now make it possible to study the genetic material of microbial communities extracted directly from natural environmental samples. This new research field, called metagenomics, is driving innovation in many areas such as human health, agriculture, and ecology. Analysing such samples still requires new bioinformatics methods to determine the taxonomic composition of the community under study, because accurate identification of the organisms present is a necessary step towards understanding even the simplest ecosystems. However, current sequencing technologies generate short and noisy DNA fragments that only partially cover complete gene sequences, which is a major challenge for high-resolution taxonomic analysis. We developed MATAM, a new bioinformatics method dedicated to the fast reconstruction of low-error complete sequences of conserved phylogenetic markers, starting from raw sequencing data. The method is a multi-step process that builds and analyses a read overlap graph. We applied MATAM to the reconstruction of the small subunit ribosomal RNA in simulated, synthetic and genuine metagenomes. The results are of high quality and improve on the state of the art.

    Algorithmes pour la reconstruction de séquences de marqueurs conservés dans des données de métagénomique

    Recent advances in DNA sequencing now make it possible to study the genetic material of microbial communities extracted directly from natural environmental samples. This new research field, called metagenomics, is driving innovation in many areas such as human health, agriculture, and ecology. Analysing such samples still requires new bioinformatics methods to determine the taxonomic composition of the community under study, because accurate identification of the organisms present is a necessary step towards understanding even the simplest ecosystems. However, current sequencing technologies generate short and noisy DNA fragments that only partially cover complete gene sequences, which is a major challenge for high-resolution taxonomic analysis. We developed MATAM, a new bioinformatics method dedicated to the fast reconstruction of low-error complete sequences of conserved phylogenetic markers, starting from raw sequencing data. The method is a multi-step process that builds and analyses a read overlap graph. We applied MATAM to the reconstruction of the small subunit ribosomal RNA in simulated, synthetic and genuine metagenomes. The results are of high quality and improve on the state of the art.

    MATAM: reconstruction of phylogenetic marker genes from short sequencing reads in metagenomes

    International audience. Motivation: Advances in the sequencing of uncultured environmental samples, dubbed metagenomics, raise a growing need for accurate taxonomic assignment. Accurate identification of organisms present within a community is essential to understanding even the most elementary ecosystems. However, current high-throughput sequencing technologies generate short reads that only partially cover full-length marker genes, which poses difficult bioinformatic challenges for high-resolution taxonomic identification. Results: We designed MATAM, a software tool dedicated to the fast and accurate targeted assembly of short reads sequenced from a genomic marker of interest. The method implements a stepwise process based on the construction and analysis of a read overlap graph. It is applied to the assembly of 16S rRNA markers and is validated on simulated, synthetic and genuine metagenomes. We show that MATAM outperforms other available methods in terms of low error rates and recovered fractions, and is suitable for providing improved assemblies for precise taxonomic assignments.

    SortMeRNA 2: ribosomal RNA classification for taxonomic assignation

    International audience. We have developed SortMeRNA, a software tool designed to filter ribosomal reads from metatranscriptomic or metagenomic data. It can handle large data sets and sort out all fragments matching a database of annotated ribosomal RNA sequences from the three-domain system, with high sensitivity and a low running time. We propose a new version, SortMeRNA2, with extended functionality for improved data analysis. Most importantly, it can now perform sequence alignments against any ribosomal RNA database, which allows the user to study the taxonomic content of a microbial sample. To this end, we have developed an alignment strategy based on approximate seeds and seed extension using a variant of the Longest Increasing Subsequence. SortMeRNA2 also applies statistical analysis, based on the E-value, to evaluate the significance of an alignment, which gives the program high accuracy.