103 research outputs found

Allele-specific copy-number discovery from whole-genome and whole-exome sequencing

Author: Crowley James J.
Sun Wei
Szatkiewicz Jin P.
Wang Wei
Wang Weibo
Publication date: 01/01/2014

Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to detect and characterize them accurately. Allele-specific reads from high-throughput sequencing data could be leveraged both to enhance CNV detection and to produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol are freely available at https://sourceforge.net/projects/asgenseng/

CiteSeerX

PubMed Central

Carolina Digital Repository

eScholarship - University of California

Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation

Author: Sullivan Patrick F.
Sun Wei
Szatkiewicz Jin P.
Wang Weibo
Wang Wei
Publication date: 01/01/2013

Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need to be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental bias in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth–based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth–based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.

CiteSeerX

Carolina Digital Repository

A New Method for Detecting Associations with Rare Copy-Number Variants

Author: Magnusson Patrik K. E.
Sullivan Patrick F.
Szatkiewicz Jin P.
Tzeng Jung-Ying
Publication date: 01/01/2015

Copy number variants (CNVs) play an important role in the etiology of many diseases such as cancers and psychiatric disorders. Because individual rare CNVs are uncommon or have modest marginal effects, collapsing them and evaluating their collective effect on disease risk is a key analytic approach. While a plethora of powerful collapsing methods are available for sequence variants (e.g., SNPs) in association analysis, these methods cannot be directly applied to rare CNVs because of CNV-specific challenges, namely the multi-faceted nature of CNV polymorphisms (e.g., CNVs vary in size, type, dosage, and details of gene disruption) and etiological heterogeneity (e.g., heterogeneous effects of duplications and deletions that occur within a locus or in different loci). Existing CNV collapsing analysis methods (a.k.a. the burden test) tend to perform suboptimally because they often ignore heterogeneity and evaluate only the marginal effects of a CNV feature. We introduce CCRET, a random effects test for collapsing rare CNVs when searching for disease associations. CCRET is applicable to variants measured on a multi-categorical scale, collectively models the effects of multiple CNV features, and is robust to etiological heterogeneity. Multiple confounders can be corrected for simultaneously. To evaluate the performance of CCRET, we conducted extensive simulations and analyzed large-scale schizophrenia datasets. We show that CCRET has powerful and robust performance under multiple types of etiological heterogeneity, and has performance comparable to or better than existing methods when there is no heterogeneity.

Carolina Digital Repository

CGDSNPdb: a database resource for error-checked and imputed mouse SNPs

Author: Churchill Gary A.
de Villena Fernando Pardo-Manuel
Ding Yueming
Graber Joel H.
Hutchins Lucie N.
Smith Randy Von
Szatkiewicz Jin P.
Yang Hyuna
Publication venue: Oxford University Press
Publication date: 01/01/2010

The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets: the ‘imputed genotype resource’, in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600 000 SNPs and over 900 000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login.

The Jackson Laboratory: The Mouseion at the JAXlibrary

PubMed Central

Carolina Digital Repository

The Recombinational Anatomy of a Mouse Chromosome

Author: Broman Karl W.
Graber Joel H.
Leahy Nicole
Ng Siemon H. S.
Paigen Kenneth
Parvanov Emil D.
Petkov Petko M.
Sawyer Kathryn
Szatkiewicz Jin P.
Publication venue: Public Library of Science
Publication date: 01/01/2008

Among mammals, genetic recombination occurs at highly delimited sites known as recombination hotspots. They are typically 1–2 kb long and vary as much as 1,000-fold or more in recombination activity. Although much is known about the molecular details of the recombination process itself, the factors determining the location and relative activity of hotspots are poorly understood. To further our understanding, we have collected and mapped the locations of 5,472 crossover events along mouse Chromosome 1 arising in 6,028 meioses of male and female reciprocal F1 hybrids of C57BL/6J and CAST/EiJ mice. Crossovers were mapped to a minimum resolution of 225 kb, and those in the telomere-proximal 24.7 Mb were further mapped to resolve individual hotspots. Recombination rates were evolutionarily conserved on a regional scale, but not at the local level. There was a clear negative-exponential relationship between the relative activity and abundance of hotspot activity classes, such that a small number of the most active hotspots account for the majority of recombination. Females had 1.2× higher overall recombination than males did, although the sex ratio showed considerable regional variation. Locally, entirely sex-specific hotspots were rare. The initiation of recombination at the most active hotspot was regulated independently on the two parental chromatids, and analysis of reciprocal crosses indicated that parental imprinting has subtle effects on recombination rates. It appears that the regulation of mammalian recombination is a complex, dynamic process involving multiple factors reflecting species, sex, individual variation within species, and the properties of individual hotspots.
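
As a rough illustration of the scale of these counts, the short Python sketch below converts the reported 5,472 crossovers in 6,028 meioses into an average crossover count per meiosis and an approximate sex-averaged genetic length for Chromosome 1. The conversion (one expected crossover per scored meiotic product equals 100 cM) is an assumption made here for illustration; the abstract itself does not state a map length, and the variable names are hypothetical.

```python
# Illustrative arithmetic only; the cM conversion is an assumption,
# not a figure or method taken from the abstract.
crossovers = 5472   # crossover events reported in the abstract
meioses = 6028      # meioses scored, as reported in the abstract

# Mean number of crossovers observed per meiosis on Chromosome 1.
crossovers_per_meiosis = crossovers / meioses

# Assumption: one expected crossover per scored meiotic product = 1 Morgan = 100 cM.
map_length_cm = 100 * crossovers_per_meiosis

print(f"{crossovers_per_meiosis:.3f} crossovers per meiosis")
print(f"approximate genetic length: {map_length_cm:.1f} cM")
```

The result is a pooled average over both sexes and both reciprocal crosses, so it says nothing about the regional or sex-specific variation the abstract describes.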

Public Library of Science (PLOS)

The Jackson Laboratory: The Mouseion at the JAXlibrary

Directory of Open Access Journals

PubMed Central

An imputed genotype resource for the laboratory mouse

Author: Beane Glen L.
Churchill Gary A.
Ding Yueming
Hutchins Lucie
Pardo-Manuel de Villena Fernando
Szatkiewicz Jin P.
Publication date: 01/01/2008

We have created a high-density SNP resource encompassing 7.87 million polymorphic loci across 49 inbred strains of the laboratory mouse by combining data available from public databases and training a hidden Markov model to impute missing genotypes in the combined data. The strong linkage disequilibrium found in dense sets of SNP markers in the laboratory mouse provides the basis for accurate imputation. Using genotypes from eight independent SNP resources, we empirically validated the quality of the imputed genotypes and demonstrated that they are highly reliable for most inbred strains. The imputed SNP resource will be useful for studies of natural variation and complex traits. It will facilitate association study designs by providing high density SNP genotypes for large numbers of mouse strains. We anticipate that this resource will continue to evolve as new genotype data become available for laboratory mouse strains. The data are available for bulk download or query at http://cgd.jax.org/

PubMed Central

Carolina Digital Repository

Characterization of single gene copy number variants in schizophrenia

Author: Ancalade NaEshia
Bergen Sarah
Crowley James J.
Fromer Menachem
Holmans Peter
Hultman Christina
Johnson Jessica S.
Kirov George
Nonneman Randal J.
O'Donovan Michael
Owen Michael
Purcell Shaun M.
Rees Elliott
Ruderfer Douglas M.
Sklar Pamela
Stahl Eli A.
Sullivan Patrick F.
Szatkiewicz Jin P.
Publication venue: Elsevier BV
Publication date: 15/04/2020

Background: Genetic studies of schizophrenia have implicated numerous risk loci including several copy number variants (CNVs) of large effect and hundreds of loci of small effect. In only a few cases has a specific gene been clearly identified. Rare CNVs affecting a single gene offer a potential avenue to discovering schizophrenia risk genes. Methods: CNVs were generated from exome-sequencing of 4,913 schizophrenia cases and 6,188 controls from Sweden. We integrated multiple CNV calling methods (XHMM and ExomeDepth) to expand our set of single-gene CNVs and leveraged two different approaches for validating these variants (qPCR and Nanostring). Results: We found a significant excess of all rare CNVs (deletions p=0.0004, duplications p=0.0006) and single-gene CNVs (deletions p=0.04, duplications p=0.03) in schizophrenia cases compared to controls. An expanded set of CNVs generated from integrating multiple approaches showed a significant burden of deletions in 11/21 gene-sets previously implicated in schizophrenia and across all genes in those sets (p=0.008), although no tests survived correction. We performed an extensive validation of all deletions in the significant set of voltage-gated calcium channels among CNVs called from both exome-sequencing and genotyping arrays. In total, 4 exonic, single-gene deletions were validated in cases and none in controls (p=0.039), all of which were identified by exome-sequencing. Conclusions: These results point to the potential contribution of single-gene CNVs to schizophrenia, indicate that the utility of exome-sequencing for CNV calling has yet to be maximized, and suggest that single-gene CNVs should be included in gene-focused studies using other classes of variation.

Online Research @ Cardiff

An inherited duplication at the gene p21 protein-activated Kinase 7 (PAK7) is a risk factor for psychosis

Author: Bellini Stefania
Blackwood Douglas
Buizer Jacobine
Coe Bradley
Cormican Paul
Corvin Aiden
Craddock Nick
Dinan Timothy G.
Donohoe Gary
Eichler Evan E.
Elves Rachel L.
Ennis Sean
Fahey Ciara
Freeman Colin
Giannoulatou Eleni
Gill Michael
Grozeva Detelina
Gurling Hugh
Hultman Christina
Johnstone Mandy
Kelleher Eric
Kendler Kenneth S.
Kenny Elaine M.
Kirov George
Maher Brion S.
McDonald Colm
Mcquillin Andrew
Molinos Ines
Morris Derek W.
Murphy Kieran C.
O'Callaghan Eadbhard
O'Donovan Michael
O'Dushlaine Colm T.
O'Neill Francis A.
Ophoff Roel
Pearson Richard D.
Perreault Louis Philippe Lemieux
Pirinen Matti
Purcell Shaun
Rees Elliott
Regan Regina
Riley Brien P.
Scolnick Ed
SGENE+ Consortium
Sklar Pamela
Spencer Chris C. A.
St Clair David
Stone Jennifer
Strange Amy
Sullivan Patrick
Szatkiewicz Jin
The International Schizophrenia Consortium (ISC)
The Wellcome Trust Case Control Consortium 2 (WTCCC2)
Thiselton Dawn L.
Tropea Daniela
Waddington John L.
Walsh Dermot
Walters James
Wormley Brandon
Publication venue: Oxford University Press (OUP)
Publication date: 28/01/2014

FUNDING: Funding for this study was provided by the Wellcome Trust Case Control Consortium 2 project (085475/B/08/Z and 085475/Z/08/Z), the Wellcome Trust (072894/Z/03/Z, 090532/Z/09/Z and 075491/Z/04/B), NIMH grants (MH 41953 and MH083094) and Science Foundation Ireland (08/IN.1/B1916). We acknowledge use of the Trinity Biobank sample from the Irish Blood Transfusion Service; the Trinity Centre for High Performance Computing; British 1958 Birth Cohort DNA collection funded by the Medical Research Council (G0000934) and the Wellcome Trust (068545/Z/02) and of the UK National Blood Service controls funded by the Wellcome Trust. Chris Spencer is supported by a Wellcome Trust Career Development Fellowship (097364/Z/11/Z). Funding to pay the Open Access publication charges for this article was provided by the Wellcome Trust. ACKNOWLEDGEMENTS: The authors sincerely thank all patients who contributed to this study and all staff who facilitated their involvement. We thank W. Bodmer and B. Winney for use of the People of the British Isles DNA collection, which was funded by the Wellcome Trust. We thank Akira Sawa and Koko Ishzuki for advice on the PAK7–DISC1 interaction experiment and Jan Korbel for discussions on mechanism of structural variation.

Aberdeen University Research

Crossref

Online Research @ Cardiff

PubMed Central

Oxford University Research Archive

Association of Candidate Genes with Phenotypic Traits Relevant to Anorexia Nervosa

This analysis is a follow-up to an earlier investigation of 182 genes selected as likely candidate genetic variations conferring susceptibility to anorexia nervosa (AN). As those initial case-control results revealed no statistically significant differences in single nucleotide polymorphisms, herein we investigate alternative phenotypes associated with AN. Using regression analyses in 1762 females, we examined: (1) lowest illness-related attained body mass index; (2) age at menarche; (3) drive for thinness; (4) body dissatisfaction; (5) trait anxiety; (6) concern over mistakes; and (7) the anticipatory worry and pessimism vs. uninhibited optimism subscale of the harm avoidance scale. After controlling for multiple comparisons, no statistically significant results emerged. Although results must be viewed in the context of limitations of statistical power, the approach illustrates a means of potentially identifying genetic variants conferring susceptibility to AN because less complex phenotypes associated with AN are more proximal to the genotype and may be influenced by fewer genes.

Carolina Digital Repository

A genome-wide association study of anorexia nervosa suggests a risk locus implicated in dysregulated leptin signaling

Author: Abrams Debra
Adan Roger A. H.
Alfredsson Lars
Ando Tetsuya
Andreassen Ole A.
Aschauer Harald
Baker Jessica H.
Barrett Jeff C.
Bencko Vladimir
Bergen Andrew W.
Berrettini Wade
Berrettini Wade H.
Bhoj Elizabeth J.
Boni Claudette
Boraska Perica Vesna
Bradfield Jonathan P.
Brandt Harry
Breen Gerome
Bruson Alice
Bulik Cynthia M.
Burghardt Roland
Bühren Katharina
Carlberg Laura
Chang Xiao
Chiavacci Rosetta M.
Cichon Sven
Clementi Maurizio
Cohen-Woods Sarah
Cone Roger
Connolly John J.
Crawford Steve
Crow Scott
Danner Unna N.
Davis Oliver S. P.
de Zwaan Martina
Dedoussis George
Degortes Daniela
DeSocio Janiece E.
Dick Danielle M.
Dikeos Dimitris
Dina Christian
Ding Bo
Dmitrzak-Weglarz Monika
Docampo Elisa
Eating Disorders Working Group of the Psychiatric Genomics Consortium
Egberts Karin
Ehrlich Stefan
Escaramís Geòrgia
Esko Tõnu
Espeseth Thomas
Estivill Xavier
Farmer Anne
Favaro Angela
Fernández-Aranda Fernando
Fichter Manfred M.
Finan Chris
Fischer Krista
Floyd James A. B.
Foretova Lenka
Forzan Monica
Franklin Christopher S.
Gaborieau Valerie
Gallinger Steven
Gambaro Giovanni
Giegling Ina
Gonidakis Fragiskos
Gorwood Philip
Gratacos Monica
Guo Yiran
Hakonarson Hakon
Halmi Katherine A.
Hauser Joanna
Hebebrand Johannes
Helder Sietske
Hendriks Judith
Herms Stefan
Herpertz-Dahlmann Beate
Herzog Wolfgang
Hilliard Christopher E.
Hinney Anke
Hou Cuiping
Huckins Laura M.
Hudson James I.
Huemer Julia
Imgart Hartmut
Inoko Hidetoshi
Janout Vladimir
Jiménez-Murcia Susana
Johnson Craig
Julia Antonio
Kalsi Gursharan
Kaplan Allan S.
Kaprio Jaakko
Karhunen Leila
Karwautz Andreas
Kas Martien J. H.
Kaye Walter
Keel Pamela K.
Kennedy James L.
Keski-Rahkonen Anna
Kiezebrink Kirsty
Kim Cecilia E.
Klareskog Lars
Klump Kelly L.
Knudsen Gun Peggy S.
Koeleman Bobby P. C.
Koubek Doris
La Via Maria C.
Le Hellard Stephanie
Lemma Maria
Levitan Robert D.
Li Bingshan
Li Dong
Li Jin
Li Yun R.
Lichtenstein Paul
Lilenfeld Lisa
Lissowska Jolanta
Liu Yichuan
Lundervold Astri
Magistretti Pierre
Maj Mario
Marsal Sara
Martaskova Debora
Mattingsdal Morten
McGuffin Peter
Mentch Frank D.
Merl Elisabeth
Metspalu Andres
Meulenbelt Ingrid
Mitchell James
Monteleone Palmiero
Männik Katrin
Navratilova Marie
Ntalla Ioanna
O'Toole Julie K.
Ophoff Roel A.
Padyukov Leonid
Palotie Aarno
Pantel Jacques
Papezova Hana
Pinto Dalila
Price Foundation Collaborative Group
Qiu Haijun
Rabionet Raquel
Raevuori-Helkamaa Anu
Rajewski Andrzej
Ramoz Nicolas
Rayner N. William
Reichborn-Kjennerud Ted
Reinvang Ivar
Ripatti Samuli
Roberts Marion
Robinson Nora
Rotondo Alessandro
Rujescu Dan
Rybakowski Filip
Santonastaso Paolo
Scherag André
Scherer Stephen W.
Schmidt Ulrike
Schork Nicholas J.
Schosser Alexandra
Schreiber Stefan
Slachtova Lenka
Sladek Rob
Slagboom P. Eline
Sleiman Patrick A.
Slof-Op't Landt Margarita C. T.
Slopien Agnieszka
Snyder James
Soranzo Nicole
Southam Lorraine
Steen Vidar M.
Strengman Eric
Strober Michael
Sullivan Patrick F.
Szatkiewicz Jin P.
Szeszenia-Dabrowska Neonila
Tachmazidou Ioanna
Tenconi Elena
Thomas Kelly A.
Thornton Laura M.
Tian Lifeng
Tortorella Alfonso
Tozzi Federica
Treasure Janet
Tsitsika Artemis
Tziouvas Konstantinos
van Elburg Annemarie A.
Van Furth Eric F.
Versini Audrey
Wagner Gudrun
Wang Fengxiang
Wei Zhi
Wichmann H.-Erich
Widén Elisabeth
Woodside D. Blake
Yilmaz Zeynep
Zeggini Eleftheria
Zerwas Stephanie
Zipfel Stephan
Publication date: 01/01/2017

J. Kaprio, A. Palotie, A. Raevuori-Helkamaa and S. Ripatti are members of the Eating Disorders Working Group of the Psychiatric Genomics Consortium. Erratum in: Sci Rep. 2017 Aug 21;7(1):8379, doi: 10.1038/s41598-017-06409-3. We conducted a genome-wide association study (GWAS) of anorexia nervosa (AN) using a stringently defined phenotype. Analysis of phenotypic variability led to the identification of a specific genetic risk factor that approached genome-wide significance (rs929626 in EBF1 (Early B-Cell Factor 1); P = 2.04 × 10^-7; OR = 0.7; 95% confidence interval (CI) = 0.61-0.8) with independent replication (P = 0.04), suggesting a variant-mediated dysregulation of leptin signaling may play a role in AN. Multiple SNPs in LD with the variant support the nominal association. This demonstrates that although the clinical and etiologic heterogeneity of AN is universally recognized, further careful sub-typing of cases may provide more precise genomic signals. In this study, through a refinement of the phenotype spectrum of AN, we present a replicable GWAS signal that is nominally associated with AN, highlighting a potentially important candidate locus for further investigation.
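
The reported odds ratio, confidence interval and P value can be loosely cross-checked against one another. The Python sketch below is a minimal illustration that assumes a Wald (normal) approximation on the log-odds scale; the abstract does not say which test the authors used, and the inputs are the rounded values quoted above, so the result should only be expected to agree with the reported P value in order of magnitude (around 10^-7). All variable names are hypothetical.

```python
import math

# Values as rounded in the abstract; the Wald/normal approximation is an
# assumption made for illustration, not the authors' stated method.
odds_ratio = 0.7
ci_low, ci_high = 0.61, 0.80

# A 95% CI on the log-odds scale spans about 2 * 1.96 standard errors.
z_95 = 1.959964
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_95)

# Wald z statistic and two-sided P value under the standard normal distribution.
z = math.log(odds_ratio) / se_log_or
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"SE(log OR) = {se_log_or:.4f}, z = {z:.2f}, P = {p_two_sided:.2e}")
```

Because the inputs are rounded to one or two significant figures, small differences from the published P value are expected and do not indicate an error in either number.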