
    Analytical “Bake-Off” of Whole Genome Sequencing Quality for the Genome Russia Project Using a Small Cohort for Autoimmune Hepatitis

    A comparative analysis of whole genome sequencing (WGS) and genotype calling was initiated for ten human genome samples sequenced by the St. Petersburg State University Peterhof Sequencing Center and by three commercial sequencing centers outside of Russia. Sequence quality and the efficiency of DNA variant and genotype calling were compared with each other and with DNA microarrays for each of the ten study subjects. We assessed calling of SNPs, indels, and copy number variation, as well as the promised speed of WGS throughput. Twenty separate QC analyses showed high similarity in sequence quality and called genotypes. The ten genomes tested by the centers included eight American patients afflicted with autoimmune hepatitis (AIH), plus one case’s unaffected parents, in a prelude to discovering genetic influences in this rare disease of unknown etiology. The detailed internal replication and parallel analyses allowed the observation of two of eight AIH cases carrying a rare allele genotype for a previously described AIH-associated gene (FTCD), plus multiple occurrences of known HLA-DRB1 alleles associated with AIH (HLA-DRB1-03:01:01, 13:01:01 and 7:01:01). We also list putative SNVs in other genes as suggestive of an influence on AIH.

    Genome-Wide Mycobacterium tuberculosis Variation (GMTV) Database: A New Tool for Integrating Sequence Variations and Epidemiology

    Background: Tuberculosis (TB) poses a worldwide threat due to advancing multidrug-resistant strains and deadly co-infections with human immunodeficiency virus. Today large amounts of Mycobacterium tuberculosis whole genome sequencing data are being assessed broadly, and yet there exists no comprehensive online resource that connects M. tuberculosis genome variants with geographic origin, with drug resistance or with clinical outcome. Description: Here we describe a broadly inclusive, unifying Genome-wide Mycobacterium tuberculosis Variation (GMTV) database (http://mtb.dobzhanskycenter.org) that catalogues genome variations of M. tuberculosis strains collected across Russia. GMTV contains a broad spectrum of data derived from different sources and related to M. tuberculosis molecular biology, epidemiology, TB clinical outcome, year and place of isolation, and drug resistance profiles, and it displays the variants across the genome using a dedicated genome browser. The GMTV database, which includes 1084 genomes and over 69,000 SNP or indel variants, can be queried about M. tuberculosis genome variation and putative associations with drug resistance, geographical origin, and clinical stages and outcomes. Conclusions: Implementation of GMTV tracks the pattern of changes of M. tuberculosis strains in different geographical areas, facilitates disease gene discoveries associated with drug resistance or different clinical sequelae, and automates comparative genomic analyses among M. tuberculosis strains.

    Genome-wide sequence analyses of ethnic populations across Russia

    The Russian Federation is the largest and one of the most ethnically diverse countries in the world; however, no centralized reference database of genetic variation exists to date. Such data are crucial for medical genetics and essential for studying population history. The Genome Russia Project aims to fill this gap by performing whole genome sequencing and analysis of peoples of the Russian Federation. Here we report the characterization of genome-wide variation of 264 healthy adults, including 60 newly sequenced samples. People of Russia carry known and novel genetic variants of adaptive, clinical and functional consequence that in many cases show allele frequency divergence from neighboring populations. Population genetics analyses revealed six phylogeographic partitions among indigenous ethnicities corresponding to their geographic locales. This study presents a characterization of population-specific genomic variation in Russia with results important for medical genetics and for understanding the dynamic population history of the world's largest country.

    Genotype comparison.

    (A) Concordance of WGS genotypes with microarray genotypes. The concordance was estimated based on the trio data as the ratio of microarray SNPs with identical genotypes in WGS results. (B) Comparison of the three WGS datasets between each other in terms of precision, sensitivity and F-measure for pairwise comparisons. Color legend is given on the top right. (C) Concordance of genotypes in the three WGS datasets for all variants, SNPs and indels. Color legend is given on the top right.
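The metrics named in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the `(chromosome, position)` site key, and the genotype strings are hypothetical. The sketch assumes that genotypes are already normalized (for example, alleles sorted within each call) and that a microarray SNP absent from the WGS calls counts as discordant. Precision, sensitivity and F-measure follow their standard definitions, with one call set treated as the reference in each pairwise comparison.

```python
from typing import Dict, Hashable, Set, Tuple

# Hypothetical representation: (chromosome, position) -> normalized genotype string
Genotypes = Dict[Tuple[str, int], str]


def concordance(array_calls: Genotypes, wgs_calls: Genotypes) -> float:
    """Fraction of microarray SNPs whose genotype is identical in the WGS calls.

    Assumption: a site missing from wgs_calls is counted as discordant.
    """
    if not array_calls:
        raise ValueError("no microarray genotypes supplied")
    matches = sum(
        1 for site, genotype in array_calls.items()
        if wgs_calls.get(site) == genotype
    )
    return matches / len(array_calls)


def pairwise_metrics(
    reference: Set[Hashable], query: Set[Hashable]
) -> Tuple[float, float, float]:
    """Precision, sensitivity and F-measure of `query` against `reference`.

    Each set holds variant calls, e.g. (chromosome, position, genotype) tuples.
    """
    true_positives = len(reference & query)
    precision = true_positives / len(query) if query else 0.0
    sensitivity = true_positives / len(reference) if reference else 0.0
    total = precision + sensitivity
    # F-measure is the harmonic mean of precision and sensitivity
    f_measure = 2 * precision * sensitivity / total if total else 0.0
    return precision, sensitivity, f_measure


if __name__ == "__main__":
    # Toy data, invented purely for illustration
    array = {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr2", 50): "T/T"}
    wgs = {("chr1", 100): "A/G", ("chr1", 200): "C/T", ("chr2", 50): "T/T"}
    print(concordance(array, wgs))
    print(pairwise_metrics(set(array.items()), set(wgs.items())))
```

Because precision and sensitivity swap when the reference and query sets are exchanged, a pairwise comparison of three datasets would typically report each ordered pair, while the F-measure is the same in both directions.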