190 research outputs found

BoostMe accurately predicts DNA methylation values in whole-genome bisulfite sequencing of multiple human tissues

Author: Arushi Varshney
D. Leland Taylor
Francis S. Collins
John P. Didion
Luli S. Zou
Michael R. Erdos
Peter S. Chines
Stephen C. J. Parker
The McDonnell Genome Institute
Publication venue: Springer Science and Business Media LLC
Publication date: 01/05/2018

Background: Bisulfite sequencing is widely employed to study the role of DNA methylation in disease; however, the data suffer from biases due to coverage depth variability. Imputation of methylation values at low-coverage sites may mitigate these biases while also identifying important genomic features associated with predictive power. Results: Here we describe BoostMe, a method for imputing low-quality DNA methylation estimates within whole-genome bisulfite sequencing (WGBS) data. BoostMe uses a gradient boosting algorithm, XGBoost, and leverages information from multiple samples for prediction. We find that BoostMe outperforms existing algorithms in speed and accuracy when applied to WGBS of human tissues. Furthermore, we show that imputation improves concordance between WGBS and the MethylationEPIC array at low WGBS depth, suggesting improved WGBS accuracy after imputation. Conclusions: Our findings support the use of BoostMe as a preprocessing step for WGBS analysis.
https://deepblue.lib.umich.edu/bitstream/2027.42/143848/1/12864_2018_Article_4766.pd

Crossref

Directory of Open Access Journals

Deep Blue Documents

The Metabochip, a Custom Genotyping Array for Genetic Studies of Metabolic, Cardiovascular, and Anthropometric Traits

Author: Andrew P. Morris
Anne U. Jackson
Antonella Mulas
Arne Pfeufer
Benjamin F. Voight
Cameron D. Palmer
Carlo Sidore
Cecilia M. Lindgren
Christian Fuchsberger
Christopher Newton-Cheh
David Altshuler
Francesco Cucca
Gonçalo R. Abecasis
Heribert Schunkert
Hyun Min Kang
Inga Prokopenko
Iris M. Heid
Jeanette Erdmann
Joel N. Hirschhorn
Joshua C. R
Jun Ding
Kathleen Stirrups
Mark I. McCarthy
Melissa Parkin
N. William Rayner
Neil Robertson
Nicole Soranzo
Elizabeth K. Speliotes
Nilesh J. Samani
Noël P. Burtt
Panos Deloukas
Patricia B. Munroe
Peter S. Chines
Ramaiah Nagaraja
Richa Saxena
Ruth J. F. Loos
Sekar Kathiresan
Serena Sanna
Simon Potter
Timothy M. Frayling
Toby Johnson
Tuomas O. Kilpeläinen
Wendy Winckler
Yanming Li
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2012

PMCID: PMC3410907. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Directory of Open Access Journals

Spiral - Imperial College Digital Repository

PuSH

Queen Mary Research Online

Public Library of Science (PLOS)

CiteSeerX

Crossref

Harvard University - DASH

PubMed Central

Copenhagen University Research Information System

Oxford University Research Archive

Leicester Research Archive

Identification of tag single-nucleotide polymorphisms in regions with varying linkage disequilibrium

Author: Agnes B Baffoe-Bonnie
Alison P Klein
Betty Q Doan
Elizabeth M Gillanders
Grace P Ibay
Ian P Dusenberry
Joan E Bailey-Wilson
Liang Ou
Peter S Chines
Priya Duggal
Rasika A Mathias
Ya-Yu Tsai
Publication venue: Springer Nature
Publication date: 30/12/2005

We compared seven different tagging single-nucleotide polymorphism (SNP) programs in 10 regions with varied amounts of linkage disequilibrium (LD) and physical distance. We used the Collaborative Studies on the Genetics of Alcoholism dataset, part of the Genetic Analysis Workshop 14. We show that in regions with moderate to strong LD these programs are relatively consistent, despite different parameters and methods. In addition, we compared the selected SNPs in a multipoint linkage analysis for one region with strong LD. As the number of selected SNPs increased, the LOD score, mean information content, and type I error also increased.

Springer - Publisher Connector

PubMed Central

Investigation of altering single-nucleotide polymorphism density on the power to detect trait loci and frequency of false positive in nonparametric linkage analyses of qualitative traits

Author: Bailey-Wilson Joan E
Barnhart Michael
Chines Peter S
Duggal Priya
Dusenberry Ian P
Gillanders Elizabeth M
Goldstein Janet
Hening Wayne
Klein Alison P
Mathias Rasika A
Pugh Elizabeth W
Tsai Ya-Yu
Turiff Amy
Wojciechowski Robert
Publication venue: BioMed Central
Publication date: 01/01/2005

Genome-wide linkage analysis using microsatellite markers has been successful in the identification of numerous Mendelian and complex disease loci. The recent availability of high-density single-nucleotide polymorphism (SNP) maps provides a potentially more powerful option. Using the simulated and Collaborative Study on the Genetics of Alcoholism (COGA) datasets from the Genetic Analysis Workshop 14 (GAW14), we examined how altering the density of SNP marker sets impacted the overall information content, the power to detect trait loci, and the number of false positive results. For the simulated data we used SNP maps with density of 0.3 cM, 1 cM, 2 cM, and 3 cM. For the COGA data we combined the marker sets from Illumina and Affymetrix to create a map with average density of 0.25 cM and then, using a sub-sample of these markers, created maps with density of 0.3 cM, 0.6 cM, 1 cM, 2 cM, and 3 cM. For each marker set, multipoint linkage analysis using MERLIN was performed for both dominant and recessive traits derived from marker loci. Our results showed that information content increased with increased map density. For the homogeneous, completely penetrant traits we created, there was only a modest difference in ability to detect trait loci. Additionally, as map density increased there was only a slight increase in the number of false positive results when there was linkage disequilibrium (LD) between markers. The presence of LD between markers may have led to an increased number of false positive regions but no clear relationship between regions of high LD and locations of false positive linkage signals was observed.

Crossref

Springer - Publisher Connector

PubMed Central

Addressing Bias in Small RNA Library Preparation for Sequencing: A New Protocol Recovers MicroRNAs that Evade Capture by Current Methods

Author: Alice Young
C. Lisa Kurtz
Christina Sison
Emily E Fannin
Jeanette Baran-Gale
Michael Erdos
Peter S Chines
Praveen Sethupathy
Publication date: 01/01/2015

Recent advances in sequencing technology have helped unveil the unexpected complexity and diversity of small RNAs. A critical step in small RNA library preparation for sequencing is the ligation of adapter sequences to both the 5′ and 3′ ends of small RNAs. Studies have shown that adapter ligation introduces a significant but widely unappreciated bias in the results of high-throughput small RNA sequencing. We show that due to this bias the two widely used Illumina library preparation protocols produce strikingly different microRNA (miRNA) expression profiles in the same batch of cells. There are 102 highly expressed miRNAs that are >5-fold differentially detected and some miRNAs, such as miR-24-3p, are over 30-fold differentially detected. While some level of bias in library preparation is not surprising, the apparent massive differential bias between these two widely used adapter sets is not well appreciated. In an attempt to mitigate this bias, the new Bioo Scientific NEXTflex V2 protocol utilizes a pool of adapters with random nucleotides at the ligation boundary. We show that this protocol robustly detects several miRNAs that evade capture by the Illumina-based methods. While these analyses do not indicate a definitive gold standard for small RNA library preparation, the results of the NEXTflex protocol do correlate best with RT-qPCR. As more laboratories seek to study small RNAs, researchers should be aware of the extent to which the results may differ with different protocols, and should make an informed decision about the protocol that best fits their study.

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

Carolina Digital Repository

Genetic regulatory signatures underlying islet gene expression and type 2 diabetes

The majority of genetic variants associated with type 2 diabetes (T2D) are located outside of genes in noncoding regions that may regulate gene expression in disease-relevant tissues, like pancreatic islets. Here, we present the largest integrated analysis to date of high-resolution, high-throughput human islet molecular profiling data to characterize the genome (DNA), epigenome (DNA packaging), and transcriptome (gene expression). We find that T2D genetic variants are enriched in regions of the genome where the transcription factor Regulatory Factor X (RFX) is predicted to bind in an islet-specific manner. Genetic variants that increase T2D risk are predicted to disrupt RFX binding, providing a molecular mechanism to explain how the genome can influence the epigenome, modulating gene expression and ultimately T2D risk.

Carolina Digital Repository

Integrative analysis of gene expression, DNA methylation, physiological traits, and genetic variation in human skeletal muscle

Author: Birney Ewan
Boehnke Michael
Chines Peter S.
Collins Francis S.
Didion John P.
Erdos Michael R.
Hemani Gibran
Idol Jackie
Jackson Anne U.
Kinnunen Leena
Koistinen Heikki A.
Laakso Markku
Lakka Timo A.
Narisu Narisu
Parker Stephen C. J.
Saramies Jouko
Scott Laura J.
Smith George Davey
Swift Amy
Taylor D. Leland
Tuomilehto Jaakko
Welch Ryan P.
Publication date: 28/05/2019

We integrate comeasured gene expression and DNA methylation (DNAme) in 265 human skeletal muscle biopsies from the FUSION study with >7 million genetic variants and eight physiological traits: height, waist, weight, waist-hip ratio, body mass index, fasting serum insulin, fasting plasma glucose, and type 2 diabetes. We find hundreds of genes and DNAme sites associated with fasting insulin, waist, and body mass index, as well as thousands of DNAme sites associated with gene expression (eQTM). We find that controlling for heterogeneity in tissue/muscle fiber type reduces the number of physiological trait associations, and that long-range eQTMs (>1 Mb) are reduced when controlling for tissue/muscle fiber type or latent factors. We map genetic regulators (quantitative trait loci; QTLs) of expression (eQTLs) and DNAme (mQTLs). Using Mendelian randomization (MR) and mediation techniques, we leverage these genetic maps to predict 213 causal relationships between expression and DNAme, approximately two-thirds of which predict methylation to causally influence expression. We use MR to integrate FUSION mQTLs, FUSION eQTLs, and GTEx eQTLs for 48 tissues with genetic associations for 534 diseases and quantitative traits. We identify hundreds of genes and thousands of DNAme sites that may drive the reported disease/quantitative trait genetic associations. We identify 300 gene expression MR associations that are present in both FUSION and GTEx skeletal muscle and that show stronger evidence of MR association in skeletal muscle than other tissues, which may partially reflect differences in power across tissues. As one example, we find that increased RXRA muscle expression may decrease lean tissue mass.

Helsingin yliopiston digitaalinen arkisto

Explore Bristol Research

Multiomic Profiling Identifies cis-Regulatory Networks Underlying Human Pancreatic β Cell Identity and Function.

Author: Aiden Aviva Presser
Aiden Erez Lieberman
NIH Intramural Sequencing Center
Chines Peter S
Collins Francis S
Dutra Amalia
Erdos Michael R
Fuchsberger Christian
Gu Huiya
Kanke Matt
Kursawe Romy
Lawlor Nathan
Li Xingwang
Luo Oscar Junhong
Marquez Eladio J
Narisu Narisu
Orchard Peter
Pak Evgenia
Parker Stephen C J
Piecuch Emaly
Ruan Yijun
Russell Sheikh
Sethupathy Praveen
Shamim Muhammad Saad
Stitzel Michael L
Thibodeau Asa
Ucar Duygu
Varshney Arushi
Publication venue: The Mouseion at the JAXlibrary
Publication date: 15/01/2019

EndoC-βH1 is emerging as a critical human β cell model to study the genetic and environmental etiologies of β cell (dys)function and diabetes. Comprehensive knowledge of its molecular landscape is lacking, yet required, for effective use of this model. Here, we report chromosomal (spectral karyotyping), genetic (genotyping), epigenomic (ChIP-seq and ATAC-seq), chromatin interaction (Hi-C and Pol2 ChIA-PET), and transcriptomic (RNA-seq and miRNA-seq) maps of EndoC-βH1. Analyses of these maps define known (e.g., PDX1 and ISL1) and putative (e.g., PCSK1 and mir-375) β cell-specific transcriptional cis-regulatory networks and identify allelic effects on cis-regulatory element use. Importantly, comparison with maps generated in primary human islets and/or β cells indicates preservation of chromatin looping but also highlights chromosomal aberrations and fetal genomic signatures in EndoC-βH1. Together, these maps, and a web application we created for their exploration, provide important tools for the design of experiments to probe and manipulate the genetic programs governing β cell identity and (dys)function in diabetes.

The Jackson Laboratory: The Mouseion at the JAXlibrary

Interactions between genetic variation and cellular environment in skeletal muscle gene expression

From whole organisms to individual cells, responses to environmental conditions are influenced by genetic makeup, where the effect of genetic variation on a trait depends on the environmental context. RNA-sequencing quantifies gene expression as a molecular trait, and is capable of capturing both genetic and environmental effects. In this study, we explore opportunities of using allele-specific expression (ASE) to discover cis-acting genotype-environment interactions (GxE), that is, genetic effects on gene expression that depend on an environmental condition. Treating 17 common, clinical traits as approximations of the cellular environment of 267 skeletal muscle biopsies, we identify 10 candidate environmental response expression quantitative trait loci (reQTLs) across 6 traits (12 unique gene-environment trait pairs; 10% FDR per trait) including sex, systolic blood pressure, and low-density lipoprotein cholesterol. Although using ASE is in principle a promising approach to detect GxE effects, replication of such signals can be challenging as validation requires harmonization of environmental traits across cohorts and a sufficient sampling of heterozygotes for a transcribed SNP. Comprehensive discovery and replication will require large human transcriptome datasets, or the integration of multiple transcribed SNPs, coupled with standardized clinical phenotyping.

Crossref

Directory of Open Access Journals

Helsingin yliopiston digitaalinen arkisto

The Francis Crick Institute