24 research outputs found

    Systematic dissection of biases in whole-exome and whole-genome sequencing reveals major determinants of coding sequence coverage

    Advantages and diagnostic effectiveness of the two most widely used resequencing approaches, whole exome (WES) and whole genome (WGS) sequencing, are often debated. WES dominated large-scale resequencing projects because of lower cost and easier data storage and processing. Rapid development of 3rd-generation sequencing methods and novel exome sequencing kits calls for a robust statistical framework allowing informative and easy performance comparison of the emerging methods. In our study we developed a set of statistical tools to systematically assess coverage of coding regions provided by several modern WES platforms, as well as PCR-free WGS. We identified a substantial problem in most previously published comparisons, which did not account for mappability limitations of short reads. Using regression analysis and simple machine learning, as well as several novel metrics of coverage evenness, we analyzed the contribution from the major determinants of CDS coverage. Contrary to a common view, most of the observed bias in modern WES stems from mappability limitations of short reads and exome probe design rather than sequence composition. We also identified a ~500 kb region of the human exome that could not be effectively characterized using short-read technology and should receive special attention during variant analysis. Using our novel metrics of sequencing coverage, we identified the main determinants of WES and WGS performance. Overall, our study points out avenues for improvement of enrichment-based methods and development of novel approaches that would maximize variant discovery at optimal cost.

    Identification of Novel Candidate Markers of Type 2 Diabetes and Obesity in Russia by Exome Sequencing with a Limited Sample Size

    Type 2 diabetes (T2D) and obesity are common chronic disorders with multifactorial etiology. In our study, we performed an exome sequencing analysis of 110 patients of Russian ethnicity together with a multi-perspective approach based on biologically meaningful filtering criteria to detect novel candidate variants and loci for T2D and obesity. We have identified several known single nucleotide polymorphisms (SNPs) as markers for obesity (rs11960429), T2D (rs9379084, rs1126930), and body mass index (BMI) (rs11553746, rs1956549 and rs7195386) (p < 0.05). We show that a method based on scoring of case-specific variants together with selection of protein-altering variants can allow for the interrogation of novel and known candidate markers of T2D and obesity in small samples. Using this method, we identified rs328 in LPL (p = 0.023), rs11863726 in HBQ1 (p = 8 × 10⁻⁵), rs112984085 in VAV3 (p = 4.8 × 10⁻⁴) for T2D and obesity; rs6271 in DBH (p = 0.043), rs62618693 in QSER1 (p = 0.021), rs61758785 in RAD51B (p = 1.7 × 10⁻⁴), rs34042554 in PCDHA1 (p = 1 × 10⁻⁴), and rs144183813 in PLEKHA5 (p = 1.7 × 10⁻⁴) for obesity; and rs9379084 in RREB1 (p = 0.042), rs2233984 in C6orf15 (p = 0.030), rs61737764 in ITGB6 (p = 0.035), rs17801742 in COL2A1 (p = 8.5 × 10⁻⁵), and rs685523 in ADAMTS13 (p = 1 × 10⁻⁶) for T2D as important susceptibility loci in the Russian population. Our results demonstrate the effectiveness of whole exome sequencing (WES) technologies for searching for novel markers of multifactorial diseases in cohorts of limited size in poorly studied populations.

    Genome-wide sequence analyses of ethnic populations across Russia

    The Russian Federation is the largest and one of the most ethnically diverse countries in the world; however, no centralized reference database of genetic variation exists to date. Such data are crucial for medical genetics and essential for studying population history. The Genome Russia Project aims at filling this gap by performing whole genome sequencing and analysis of peoples of the Russian Federation. Here we report the characterization of genome-wide variation of 264 healthy adults, including 60 newly sequenced samples. People of Russia carry known and novel genetic variants of adaptive, clinical and functional consequence that in many cases show allele frequency divergence from neighboring populations. Population genetics analyses revealed six phylogeographic partitions among indigenous ethnicities corresponding to their geographic locales. This study presents a characterization of population-specific genomic variation in Russia with results important for medical genetics and for understanding the dynamic population history of the world's largest country.

    Description of the First Registered Case of Lopes–Maciel–Rodan Syndrome in Russia

    Lopes–Maciel–Rodan syndrome (LOMARS) is an extremely rare disorder, with only a few cases reported worldwide. LOMARS is caused by a compound heterozygous mutation in the HTT gene. Little is known about LOMARS pathogenesis and clinical manifestations. Whole exome sequencing (WES) was performed to achieve a definitive molecular diagnosis of the disorder. All NGS-identified variants underwent Sanger confirmation. In addition, a literature review on genetic variations in the HTT gene was conducted. The paper reports a case of LOMARS in a pediatric patient in Russia. A preterm girl of non-consanguineous parents demonstrated severe psychomotor developmental delays in her first 12 months. By the age of 6 years, she had failed to develop speech but was able to understand everyday phrases and perform simple commands. Autism-like behaviors, stereotypies, and bruxism were noted during the examination. WES revealed two undescribed variants of unknown clinical significance in the HTT gene, presumably associated with the patient’s phenotype (c.2350C>T and c.8440C>A). Medical re-examination of the parents revealed that the patient inherited these variants from her father and mother. Lopes–Maciel–Rodan syndrome was diagnosed based on overlapping clinical findings and the follow-up genetic examination of the parents. Our finding expands the number of reported LOMARS cases and provides new insights into the genetic basis of the disease.