45 research outputs found

    Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    BACKGROUND: Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100 fold) than point mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger than 1 kb. Excluding the 59 SVs (54 insertions/deletions, 5 inversions) that overlap with N-base gaps in the reference assembly hg19, 666 non-gap SVs remained, and 396 of them (60%) were verified by paired-end data from whole-genome sequencing-based re-sequencing or de novo assembly sequence from fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides valuable information for complex regions with haplotypes in a straightforward fashion. In addition, with long single-molecule labeling patterns, exogenous viral sequences were mapped on a whole-genome scale, and sample heterogeneity was analyzed at a new level. 
CONCLUSION: Our study highlights genome mapping technology as a comprehensive and cost-effective method for detecting structural variation and studying complex regions in the human genome, as well as deciphering viral integration into the host genome. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2047-217X-3-34) contains supplementary material, which is available to authorized users

    Analysis of geothermal potential in Hangjiahu area based on remote sensing and geographic information system

    Geothermal resources are one of the most valuable renewable energy sources because of their stability, reliability, cleanliness, safety and abundant reserves. Efficient and economical remote sensing and GIS (Geographic Information System) technology has high practical value in geothermal resources exploration. However, different study areas have different geothermal formation mechanisms. In the process of establishing the model, which factors are used for modeling and how to quantify the factors reasonably are still problems to be analyzed and studied. Taking Hangjiahu Plain of Zhejiang Province as an example, based on geothermal exploration and remote sensing interpretation data, the correlation between the existing geothermal hot spots and geothermal related factors was evaluated in this paper, such as lithology, fault zone distance, surface water system and its distance, seismic point distance, magmatic rock and volcanic rock distance, surface water, farmland, woodland temperature and so on. The relationship between geothermal potential and distribution characteristics of surface thermal environment, fault activity, surface water system and other factors was explored. AHP (Analytic Hierarchy Process) and BP (Back Propagation) neural network were used for establishing geothermal potential target evaluation models. The potential geothermal areas of Hangjiahu Plain were divided into five grades using geothermal exploration model, and most geothermal drilling sites were distributed in extremely high potential areas and high potential areas. 
The results show that it is feasible to analyze geothermal potential targets using remote sensing interpretation data and geographic information system analysis data based on the analytic hierarchy process and a back propagation neural network, and that the distribution characteristics of surface thermal environment, fault activity, surface water system and other related factors are also related to geothermal distribution. The prediction results of the model coincide with the existing geothermal drilling sites, which provides a new idea for geothermal exploration

    An atlas of DNA methylomes in porcine adipose and muscle tissues

    It is evident that epigenetic factors, especially DNA methylation, have essential roles in obesity development. Here, using pig as a model, we investigate the systematic association between DNA methylation and obesity. We sample eight variant adipose and two distinct skeletal muscle tissues from three pig breeds living within comparable environments but displaying distinct fat levels. We generate 1,381 Gb of sequence data from 180 methylated DNA immunoprecipitation libraries, and provide a genome-wide DNA methylation map as well as a gene expression map for adipose and muscle studies. The analysis shows global similarities and differences among breeds, sexes and anatomic locations, and identifies the differentially methylated regions. The differentially methylated regions in promoters are highly associated with obesity development via expression repression of both known obesity-related genes and novel genes. This comprehensive map provides a solid basis for exploring epigenetic mechanisms of adipose deposition and muscle growth

    Novel variation and de novo mutation rates in population-wide de novo assembled Danish trios

    Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates. We identify a higher proportion of indels and SVs than previous efforts showing the merits of high coverage and de novo assembly approaches. In addition, we use trio information to identify de novo mutations and use a probabilistic method to provide direct estimates of 1.27e−8 and 1.5e−9 per nucleotide per generation for SNVs and indels, respectively

    Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

    Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set of structural variants including many novel insertions and demonstrate how this variant catalogue enables further deciphering of known association mapping signals. We leverage the assemblies to provide 100 completely resolved major histocompatibility complex haplotypes and to resolve major parts of the Y chromosome. Our study provides a regional reference genome that we expect will improve the power of future association mapping studies and hence pave the way for precision medicine initiatives, which now are being launched in many countries including Denmark

    ShujiaHuang/qmplot

    A Python package for creating high-quality manhattan and Q-Q plots from GWAS results