605 research outputs found

    Analysis of five deep-sequenced trio-genomes of the Peninsular Malaysia Orang Asli and North Borneo populations

    Background: Recent advances in genomic technologies have facilitated genome-wide investigation of human genetic variation. However, most efforts have focused on major populations, and trio genomes of indigenous populations from Southeast Asia remain under-investigated. Results: We analyzed whole-genome deep sequencing data (30x) of five native trios from Peninsular Malaysia and North Borneo, and characterized the genomic variants, including single nucleotide variants (SNVs), small insertions and deletions (indels) and copy number variants (CNVs). We discovered approximately 6.9 million SNVs, 1.2 million indels, and 9000 CNVs in the 15 samples, of which 2.7% of SNVs, 2.3% of indels and 22% of CNVs were novel, implying insufficient coverage of population diversity in existing databases. We identified a higher proportion of novel variants in the Orang Asli (OA) samples, i.e., the indigenous people from Peninsular Malaysia, than in the North Bornean (NB) samples, likely due to the more complex demographic history and long-term isolation of the OA groups. We used the pedigree information to identify de novo variants and estimated the autosomal mutation rates to be 0.81 × 10^-8 to 1.33 × 10^-8, 1.0 × 10^-9 to 2.9 × 10^-9, and 0.001 per site per generation for SNVs, indels, and CNVs, respectively. The trio genomes also allowed for haplotype phasing with high accuracy, which provides a reference for future genomic studies of OA and NB populations. In addition, high-frequency inherited CNVs specific to OA or NB were identified. One example is a 50-kb duplication in DEFA1B detected only in the Negrito trios, implying plausible effects on host defense against exposure to diverse microbes in the tropical rainforest environment of these hunter-gatherers. The CNVs shared between the OA and NB groups were much fewer than those specific to each group. Nevertheless, we identified a 142-kb duplication in AMY1A in all 15 samples, and this gene is associated with a high-starch diet. Moreover, novel insertions shared with archaic hominids were identified in our samples. Conclusion: Our study presents a full catalogue of the genome variants of the native Malaysian populations, which complements the known genome diversity of Southeast Asians. It implies a specific population history of the native inhabitants, and demonstrates the necessity of more genome sequencing efforts on the multi-ethnic native groups of Malaysia and Southeast Asia.
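
The per-site, per-generation rates quoted in this abstract can be illustrated with a minimal sketch. It assumes the common estimator in which the de novo count is divided by twice the number of callable sites (two haploid genomes per child); the abstract does not state which estimator the authors used, and the function name and input numbers below are hypothetical.

```python
def mutation_rate(n_de_novo, callable_sites, n_trios=1):
    """Estimate a per-site, per-generation mutation rate.

    Assumption: each child contributes two haploid genomes, so the
    denominator is 2 * callable_sites * n_trios. This is a common
    estimator, not necessarily the one used in the study.
    """
    if callable_sites <= 0 or n_trios <= 0:
        raise ValueError("callable_sites and n_trios must be positive")
    return n_de_novo / (2 * callable_sites * n_trios)


# Hypothetical inputs: 60 de novo SNVs in one trio over 2.5 Gb of callable sequence.
rate = mutation_rate(n_de_novo=60, callable_sites=2_500_000_000)
print(f"{rate:.2e}")
```

With these made-up inputs the estimate falls inside the SNV range reported above, which is only meant to show the order of magnitude involved.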

    Sacral agenesis: a pilot whole exome sequencing and copy number study

    Background: Caudal regression syndrome (CRS) or sacral agenesis is a rare congenital disorder characterized by a constellation of congenital caudal anomalies affecting the caudal spine and spinal cord, the hindgut, the urogenital system, and the lower limbs. CRS is a complex condition, attributed to an abnormal development of the caudal mesoderm, likely caused by the effect of interacting genetic and environmental factors. A well-known risk factor is maternal type 1 diabetes. Method: Whole exome sequencing and copy number variation (CNV) analyses were conducted on 4 Caucasian trios to identify de novo and inherited rare mutations. Results: In this pilot study, exome sequencing and copy number variation (CNV) analyses implicate a number of candidate genes, including SPTBN5, MORN1, ZNF330, CLTCL1 and PDZD2. De novo mutations were found in SPTBN5, MORN1 and ZNF330 and inherited predicted damaging mutations in PDZD2 (homozygous) and CLTCL1 (compound heterozygous). Importantly, predicted damaging mutations in PTEN (heterozygous), in its direct regulator GLTSCR2 (compound heterozygous) and in VANGL1 (heterozygous) were identified. These genes had previously been linked with the CRS phenotype. Two CNV deletions, one de novo (chr3q13.13) and one homozygous (chr8p23.2), were detected in one of our CRS patients. These deletions overlapped with CNVs previously reported in patients with similar phenotype. Conclusion: Despite the genetic diversity and the complexity of the phenotype, this pilot study identified genetic features common across CRS patients
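
Trio analyses like the one described here typically separate de novo from inherited variants by comparing genotypes. The following is a minimal sketch of that comparison, assuming VCF-style diploid genotype strings such as "0/1" with no missing calls. The function names and labels are hypothetical and do not reproduce the study's actual pipeline, which would also apply quality, depth and frequency filters.

```python
def alt_count(genotype):
    """Count non-reference alleles in a VCF-style genotype such as '0/1' or '1|1'."""
    alleles = genotype.replace("|", "/").split("/")
    return sum(allele != "0" for allele in alleles)


def classify_trio_variant(child, father, mother):
    """Label a variant by comparing the child's genotype with both parents'."""
    c, f, m = alt_count(child), alt_count(father), alt_count(mother)
    if c == 0:
        return "reference"
    if f == 0 and m == 0:
        # Child carries an allele seen in neither parent.
        return "de_novo_candidate"
    if c == 2 and f >= 1 and m >= 1:
        # One alternate allele could have come from each parent.
        return "inherited_homozygous"
    return "inherited"


print(classify_trio_variant("0/1", "0/0", "0/0"))
print(classify_trio_variant("1/1", "0/1", "0/1"))
```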

    Detecting cryptic clinically relevant structural variation in exome-sequencing data increases diagnostic yield for developmental disorders.

    Structural variation (SV) describes a broad class of genetic variation greater than 50 bp in size. SVs can cause a wide range of genetic diseases and are prevalent in rare developmental disorders (DDs). Individuals presenting with DDs are often referred for diagnostic testing with chromosomal microarrays (CMAs) to identify large copy-number variants (CNVs) and/or with single-gene, gene-panel, or exome sequencing (ES) to identify single-nucleotide variants, small insertions/deletions, and CNVs. However, individuals with pathogenic SVs undetectable by conventional analysis often remain undiagnosed. Consequently, we have developed the tool InDelible, which interrogates short-read sequencing data for split-read clusters characteristic of SV breakpoints. We applied InDelible to 13,438 probands with severe DDs recruited as part of the Deciphering Developmental Disorders (DDD) study and discovered 63 rare, damaging variants in genes previously associated with DDs missed by standard SNV, indel, or CNV discovery approaches. Clinical review of these 63 variants determined that about half (30/63) were plausibly pathogenic. InDelible was particularly effective at ascertaining variants between 21 and 500 bp in size and increased the total number of potentially pathogenic variants identified by DDD in this size range by 42.9%. Of particular interest were seven confirmed de novo variants in MECP2, which represent 35.0% of all de novo protein-truncating variants in MECP2 among DDD study participants. InDelible provides a framework for the discovery of pathogenic SVs that are most likely missed by standard analytical workflows and has the potential to improve the diagnostic yield of ES across a broad range of genetic diseases
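
The idea of looking for split-read clusters at SV breakpoints can be sketched as follows. This is not InDelible's algorithm, which the abstract does not describe in detail. It is a simplified illustration that assumes the clipping coordinates of split reads on one chromosome have already been extracted, and the function name, `max_gap` and `min_reads` parameters are hypothetical.

```python
def cluster_breakpoints(positions, max_gap=5, min_reads=3):
    """Group nearby split-read clip positions into candidate breakpoint clusters.

    Positions closer than or equal to max_gap to the previous one join the
    same cluster; clusters supported by fewer than min_reads reads are dropped.
    Returns a list of (start, end, read_count) tuples.
    """
    clusters = []
    current = []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_reads:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_reads:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Hypothetical clip positions: two supported clusters and one isolated read.
clips = [100, 101, 101, 103, 500, 900, 902, 903]
print(cluster_breakpoints(clips))
```

The isolated position at 500 is discarded because a single clipped read is more likely to reflect an alignment artifact than a real breakpoint.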

    Novel variation and de novo mutation rates in population-wide de novo assembled Danish trios

    Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates. We identify a higher proportion of indels and SVs than previous efforts, showing the merits of high coverage and de novo assembly approaches. In addition, we use trio information to identify de novo mutations and use a probabilistic method to provide direct estimates of 1.27e−8 and 1.5e−9 per nucleotide per generation for SNVs and indels, respectively.