46 research outputs found

Conflation of short identity-by-descent segments bias their inferred length distribution

Author: Chiang Charleston W. K.
Novembre John
Ralph Peter
Publication venue: Genetics Society of America
Publication date: 17/08/2015

Identity-by-descent (IBD) is a fundamental concept in genetics with many applications. In a common definition, two haplotypes are said to contain an IBD segment if they share a segment that is inherited from a recent shared common ancestor without intervening recombination. Long IBD segments (> 1 cM) can be efficiently detected by a number of algorithms using high-density SNP array data from a population sample. However, these approaches detect IBD based on contiguous segments of identity-by-state, and such segments may exist due to the conflation of smaller, nearby IBD segments. We quantified this effect using coalescent simulations, finding that nearly 40% of inferred segments 1-2 cM long result from the conflation of two or more shorter segments, under demographic scenarios typical for modern humans. This biases the inferred IBD segment length distribution, and so can affect downstream inferences. We observed this conflation effect universally across different IBD detection programs and human demographic histories, and found inference of segments longer than 2 cM to be much more reliable (less than 5% conflation rate). As an example of how this can negatively affect downstream analyses, we present and analyze a novel estimator of the de novo mutation rate using IBD segments, and demonstrate that the biased length distribution of the IBD segments due to conflation can lead to inflated estimates if the conflation is not modeled. Understanding the conflation effect in detail will make its correction in future methods more tractable.

arXiv.org e-Print Archive

Directory of Open Access Journals

KLFDAPC : a supervised machine learning approach for spatial genetic structure analysis

Author: Chiang Charleston W K
Gaggiotti Oscar E
Qin Xinghu
Publication venue: Oxford University Press (OUP)
Publication date: 02/06/2022

Funding: CSC-University of St Andrews Joint Scholarship (to X.Q.); International Postdoctoral Exchange Fellowship Program (Talent-Introduction Program) from China Postdoc Council (to X.Q.); National Institute of General Medical Sciences (NIGMS) of the National Institutes of Health (grant R35GM142783 to C.W.K.C.). Part of the computation for this work is supported by USC’s Center for Advanced Research Computing (https://carc.usc.edu).

Geographic patterns of human genetic variation provide important insights into human evolution and disease. A commonly used tool to detect and describe them is principal component analysis (PCA) or the supervised linear discriminant analysis of principal components (DAPC). However, genetic features produced from both approaches could fail to correctly characterize population structure for complex scenarios involving admixture. In this study, we introduce Kernel Local Fisher Discriminant Analysis of Principal Components (KLFDAPC), a supervised non-linear approach for inferring individual geographic genetic structure that could rectify the limitations of these approaches by preserving the multimodal space of samples. We tested the power of KLFDAPC to infer population structure and to predict individual geographic origin using neural networks. Simulation results showed that KLFDAPC has higher discriminatory power than PCA and DAPC. The application of our method to empirical European and East Asian genome-wide genetic datasets indicated that the first two reduced features of KLFDAPC correctly recapitulated the geography of individuals and significantly improved the accuracy of predicting individual geographic origin when compared to PCA and DAPC. Therefore, KLFDAPC can be useful for geographic ancestry inference, design of genome scans and correction for spatial stratification in GWAS that link genes to adaptation or disease susceptibility.

PubMed Central

University of St. Andrews - Pure

St Andrews Research Repository

Evidence of Widespread Selection on Standing Variation in Europe at Height-Associated SNPs

Author: Chiang Charleston W. K.
GIANT Consortium
Hirschhorn Joel Naom
Palmer Cameron Douglas
Reich David Emil
Sankararaman Sriram
Turchin Michael C.
Publication venue: Springer Science and Business Media LLC
Publication date: 13/05/2013

Strong signatures of positive selection at newly arising genetic variants are well-documented in humans, but this form of selection may not be widespread in recent human evolution. Because many human traits are highly polygenic and partly determined by common, ancient genetic variation, an alternative model for rapid genetic adaptation has been proposed: weak selection acting on many pre-existing (standing) genetic variants, or polygenic adaptation. By studying height, a classic polygenic trait, we demonstrate the first human signature of widespread selection on standing variation. We show that frequencies of alleles associated with increased height, both at known loci and genome-wide, are systematically elevated in Northern Europeans compared with Southern Europeans (p < 4.3×10^−4). This pattern mirrors intra-European height differences and is not confounded by ancestry or other ascertainment biases. The systematic frequency differences are consistent with the presence of widespread weak selection (selection coefficients ~10^−3 to 10^−5 per allele) rather than genetic drift alone (p < 10^−15).

Harvard University - DASH

Mitochondrial genome copy number measured by DNA sequencing in human blood is strongly associated with metabolic traits via cell-type composition differences

Author: Abel Haley
Boehnke Michael
Chen Lei
Chiang Charleston W. K.
Christ Ryan
Das Indraniel
Freimer Nelson
Ganel Liron
Hall Ira M.
Havulinna Aki
Kanchi Krishna
Kang Chul Joo
Kuusisto Johanna
Laakso Markku
Larson David
Locke Adam
Palotie Aarno
Regier Allison
Ripatti Samuli
Scott Alexandra
Service Susan
Stitziel Nathan O.
Vangipurapu Jagadish
Young Erica
Publication date: 01/06/2021

Background: Mitochondrial genome copy number (MT-CN) varies among humans and across tissues and is highly heritable, but its causes and consequences are not well understood. When measured by bulk DNA sequencing in blood, MT-CN may reflect a combination of the number of mitochondria per cell and cell-type composition. Here, we studied MT-CN variation in blood-derived DNA from 19184 Finnish individuals using a combination of genome (N = 4163) and exome sequencing (N = 19034) data as well as imputed genotypes (N = 17718). Results: We identified two loci significantly associated with MT-CN variation: a common variant at the MYB-HBS1L locus (P = 1.6 × 10^−8), which has previously been associated with numerous hematological parameters; and a burden of rare variants in the TMBIM1 gene (P = 3.0 × 10^−8), which has been reported to protect against non-alcoholic fatty liver disease. We also found that MT-CN is strongly associated with insulin levels (P = 2.0 × 10^−21) and other metabolic syndrome (metS)-related traits. Using a Mendelian randomization framework, we show evidence that MT-CN measured in blood is causally related to insulin levels. We then applied an MT-CN polygenic risk score (PRS) derived from Finnish data to the UK Biobank, where the association between the PRS and metS traits was replicated. Adjusting for cell counts largely eliminated these signals, suggesting that MT-CN affects metS via cell-type composition. Conclusion: These results suggest that measurements of MT-CN in blood-derived DNA partially reflect differences in cell-type composition and that these differences are causally linked to insulin and related traits.

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Helsingin yliopiston digitaalinen arkisto

Deep Blue Documents at the University of Michigan

Ultraconserved Elements in the Human Genome: Association and Transmission Analyses of Highly Constrained Single-Nucleotide Polymorphisms

Author: Boerwinkle Eric
Chiang Charleston W. K.
Cupples L. Adrienne
Demerath Ellen W.
Franceschini Nora
Hirschhorn Joel N.
Jorgensen Neal W.
Keating Brendan J.
Lange Leslie A.
Lettre Guillaume
Liu Ching-Ti
Murabito Joanne M.
Nock Nora L.
North Kari E.
Papanicolaou George J.
Reiner Alex P.
Rotter Jerome I.
Vedantam Sailaja
Wilson James G.
Publication date: 01/01/2012

Ultraconserved elements in the human genome likely harbor important biological functions as they are dosage sensitive and are able to direct tissue-specific expression. Because they are under purifying selection, variants in these elements may have a lower frequency in the population but a higher likelihood of association with complex traits. We tested a set of highly constrained SNPs (hcSNPs) distributed genome-wide among ultraconserved and nearly ultraconserved elements for association with seven traits related to reproductive (age at natural menopause, number of children, age at first child, and age at last child) and overall [longevity, body mass index (BMI), and height] fitness. Using up to 24,047 European-American samples from the National Heart, Lung, and Blood Institute Candidate Gene Association Resource (CARe), we observed an excess of associations with BMI and height. In an independent replication panel, the most strongly associated SNPs showed an 8.4-fold enrichment of associations at the nominal level, including three variants in previously identified loci and one in a locus (DENND1A) previously shown to be associated with polycystic ovary syndrome. Finally, using 1430 family trios, we showed that the transmissions from heterozygous parents to offspring of the derived alleles of rare (frequency ≤0.5%) hcSNPs are not biased, particularly after adjusting for the rates of genotype missingness and error in the data. The lack of transmission bias ruled out an immediately and strongly deleterious effect due to the rare derived alleles, consistent with the observation that mice homozygous for the deletion of ultraconserved elements showed no overt phenotype. Our study also illustrated the importance of carefully modeling potential technical confounders when analyzing genotype data of rare variants.

PubMed Central

Carolina Digital Repository

Concept, Design and Implementation of a Cardiovascular Gene-Centric 50 K SNP Array for Large-Scale Genomic Association Studies

Author: Ajmal Saad
Anand Sonia S.
Bailey Swneke D.
Barrett Jeffrey C.
Bhangale Tushar
Boehnke Michael
Boerwinkle Eric
Cappola Thomas P.
Caulfield Mark
Chandrupatla Hareesh R.
Chiang Charleston W. K.
de Bakker Paul I Wen
DerOhannessian Stephanie
Drake Thomas
Edmondson Andrew C.
Engert James C.
Fabsitz Richard R.
Farlow Deborah N.
FitzGerald Garret A.
Fornage Myriam
Frackelton Edward
Gabriel Stacey B.
Gai Xiaowu
Galver Luana
Glessner Joseph T.
Grant Struan F. A.
Groop Leif
Guo Yiran
Hakonarson Hakon
Hall Alistair S.
Hansen Mark
Hattersley Andrew T.
Hirschhorn Joel Naom
Kathiresan Sekar
Keating Brendan J.
Kim Cecelia E.
Koenig Wolfgang
Li Mingyao
Lusis A. Jake
McCarthy Mark I.
Montpetit Alexandre
Munroe Patricia
Murray Sarah S.
Nickerson Deborah A.
Ouwehand Willem
Papanicolaou George J.
Patterson Nick
Price Alkes
Price Thomas S.
Rader Daniel J.
Reich David Emil
Reilly Muredach
Reitsma Pieter H.
Samani Nilesh J.
Schadt Eric
Shaikh Tamim
Taylor Kent
Tischfield Sam
Wang Susanna S.
Whitehead A. Stephen
Wilson James G.
Publication venue: Public Library of Science
Publication date: 01/01/2008

A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome-wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a “cosmopolitan” tagging approach to capture the genetic diversity across ∼2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway-based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array, with a significant portion of the generated data being released into the academic domain, facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.

Directory of Open Access Journals

Queen Mary Research Online

DigitalCommons@The Texas Medical Center

Hal-Diderot

Public Library of Science (PLOS)

Crossref

LSHTM Research Online

Harvard University - DASH

Springer - Publisher Connector

HAL-Inserm

PubMed Central

Oxford University Research Archive

King's Research Portal

HAL UVSQ

Leicester Research Archive

A principal component meta-analysis on multiple anthropometric traits identifies novel loci for body shape

Author: Abecasis Goncalo R.
Ahluwalia Tarunveer S.
Albrecht Eva
Bakker Stephan J. L.
Barlassina Cristina
Bartz Traci M.
Beilby John
Bellis Claire
Bergman Richard N.
Bergmann Sven
Blangero John
Blüher Matthias
Boehnke Michael
Boerwinkle Eric
Bonnycastle Lori L.
Boomsma Dorret I.
Borecki Ingrid B.
Bornstein Stefan R.
Bouchard Claude
Bragg-Gresham Jennifer L.
Bruinenberg Marcel
Cadby Gemma
Campbell Harry
Carola Zillikens
Chambers John C.
Chasman Daniel I.
Chen Yii-Der Ida
Chiang Charleston W. K.
Chines Peter S.
Chu Audrey Y.
Collins Francis S
Couto Alves Alexessander
Cucca Francesco
Cupples L Adrienne
Cusi Daniele
D'avila Francesca
De Geus Eco J.C.
Dedoussis George
Deloukas Panos
Dimitriou Maria
Döring Angela
Eklund Niina
Eriksson Joel
Eriksson Johan G.
Esko Toñu
Farmaki Aliki-Eleni
Farrall Martin
Feitosa Mary F.
Ferreira Teresa
Fischer Krista
Forouhi Nita G.
Fox Caroline
Frayling Timothy
Friedrich Nele
Gansevoort Ron T.
Gieger Christian
Gjesing Anette Prior
Glorioso Nicola
Goel Anuj
Gorski Mathias
Graff Mariaelisa
Grallert Harald
Grarup Niels
Grewal Jagvir
Groop Leif C.
Gräßler Jürgen
Hamsten Anders
Hansen Torben
Harder Marie Neergaard
Hartman Catharina A.
Hassinen Maija
Hastie Nicholas
Hattersley Andrew Tym
Havulinna Aki S.
Hayward Caroline
Heard-Costa Nancy L.
Heid Iris M.
Heliövaara Markku
Hicks Andrew A.
Hillege Hans
Hirschhorn Joel N.
Hofman Albert
Holmen Oddgeir
Homuth Georg
Hottenga Jouke-Jan
Hu Frank
Huffman Jennifer E.
Hui Jennie
Hunter David J.
Husemoen Lise Lotte
Hveem Kristian
Hysi Pirro G.
Isaacs Aaron
Ittermann Till
Jackson Anne U.
Jalilzadeh Shapour
James Alan L.
Jarvelin Marjo-Riitta
Jeff Janina M.
Jokinen Eero
Jousilahti Pekka
Ju Sung Yun
Jula Antti
Justice Anne E.
Jørgensen Torben
Kajantie Eero
Kanoni Stavroula
Kaplan Robert C.
Karaleftheri Maria
Keinanen-Kiukaanniemi Sirkka M.
Kinnunen Leena
Knekt Paul B.
Koistinen Heikki A.
Kolcic Ivana
Kooner Ishminder K.
Kooner Jaspal S.
Koskinen Seppo
Kovacs Peter
Kristiansson Kati
Kuh Diana
Kutalik Zoltán
Kuusisto Johanna
Kyriakou Theodosios
Kähönen Mika
Laakso Markku
Lahti Jari
Laitinen Tomi
Lakka Timo A.
Langenberg Claudia
Leach Irene Mateo
Lehtimäki Terho
Lewin Alexandra M.
Lichtner Peter
Lindgren Cecilia M.
Lindström Jaana
Linneberg Allan
Loos Ruth J. F.
Lorbeer Roberto
Lorentzon Mattias
Luan Jian'an
Luben Robert
Lyssenko Valeriya
Mahajan Anubha
Mangino Massimo
Manunta Paolo
Marie Justesen Johanne
Mcardle Wendy L.
Mccarthy Mark I.
Mcknight Barbara
Medina-Gomez Carolina
Metspalu Andres
Mihailov Evelin
Milani Lili
Mills Rebecca
Mohlke Karen L.
Monda Keri L.
Montasser May E.
Morris Andrew P.
Musk Arthur W.
Mägi Reedik
Männistö Satu
Müller Gabriele
Müller-Nurasyid Martina
Narisu Narisu
Njølstad Inger
Nolte Ilja M.
North Kari E.
O'connell Jeffrey R.
Ohlsson Claes
Oldehinkel Albertine J.
Ong Ken K.
Oostra Ben A.
Osmond Clive
Palmer Lyle J.
Palotie Aarno
Pankow James S.
Paternoster Lavinia
Pedersen Oluf
Penninx Brenda W.
Perola Markus
Peters Annette
Pichler Irene
Pilia Maria G.
Polašek Ozren
Pramstaller Peter P.
Prokopenko Inga
Psaty Bruce M.
Puolijoki Hannu
Pérusse Louis
Qi Lu
Raitakari Olli T
Rankinen Tuomo
Rao
Rauramaa Rainer
Rayner Nigel W.
Ribel-Madsen Rasmus
Rice Treva K.
Richards Marcus
Ridker Paul M.
Ried Janina S.
Rivadeneira Fernando
Rose Lynda M.
Rudan Igor
Ryan Kathy A.
Salomaa Veikko
Salvi Erika
Sanna Serena
Sarzynski Mark A.
Schlessinger David
Scholtens Salome
Schwarz Peter E. H.
Scott Robert A.
Sebert Sylvain
Shuldiner Alan R.
Smit Jan H.
Smith Megan T.
Snieder Harold
Southam Lorraine
Sparsø Thomas Hempel
Spector Timothy D.
Stančáková Alena
Stefansson Kari
Steinthorsdottir Valgerdur
Stirrups Kathleen
Stolk Ronald P.
Strachan David P.
Strauch Konstantin
Stringham Heather M.
Stumvoll Michael
Swertz Morris A.
Swift Amy J.
Sørensen Thorkild I. A.
Tachmazidou Ioanna
Tee Khaw Kay
Teumer Alexander
Thorleifsson Gudmar
Thorsteinsdottir Unnur
Tremblay Angelo
Tsafantakis Emmanouil
Tuomilehto Jaakko
Tönjes Anke
Uitterlinden André G.
Uusitupa Matti
Van Der Harst Pim
Van Der Most Peter J.
Van Dongen Jenny
Van Duijn Cornelia M.
Van Vliet-Ostaptchouk Jana V.
Vandenput Liesbeth
Vartiainen Erkki
Venturini Cristina
Verweij Niek
Viikari Jorma S.
Vitart Veronique
Vohl Marie-Claude
Vollenweider Peter
Vonk Judith M.
Völker Uwe
Waeber Gérard
Walker Ryan W.
Wang Sophie R.
Wareham Nicholas J.
Watkins Hugh
Widén Elisabeth
Wild Sarah H.
Willems Sara M.
Willemsen Gonneke
Wilsgaard Tom
Wilson James F.
Winkler Thomas W.
Wong Andrew
Wright Alan F.
Yerges-Armstrong Laura M.
Zeggini Eleftheria
Zhang Weihua
Zhao Jing Hua
Publication date: 01/01/2016

Large consortia have revealed hundreds of genetic loci associated with anthropometric traits, one trait at a time. We examined whether genetic variants affect body shape as a composite phenotype that is represented by a combination of anthropometric traits. We developed an approach that calculates averaged PCs (AvPCs) representing body shape derived from six anthropometric traits (body mass index, height, weight, waist and hip circumference, waist-to-hip ratio). The first four AvPCs explain >99% of the variability, are heritable, and associate with cardiometabolic outcomes. We performed genome-wide association analyses for each body shape composite phenotype across 65 studies and meta-analysed summary statistics. We identify six novel loci: LEMD2 and CD47 for AvPC1, RPS6KA5/C14orf159 and GANAB for AvPC3, and ARL15 and ANP32 for AvPC4. Our findings highlight the value of using multiple traits to define complex phenotypes for discovery, which are not captured by single-trait analyses, and may shed light on new pathways.

Carolina Digital Repository