MAPPING GENES FOR QUANTITATIVE TRAITS USING SELECTED SAMPLES OF SIBLING PAIRS

Author: Szatkiewicz Jin Peng
Publication date: 22/07/2004

One of the most important research areas in human genetics is the effort to map genes associated with complex diseases such as cancer, heart disease, and diabetes. The public health relevance of this work is that gene mapping will bring an understanding of genetic risk and protective factors, and a description of the interaction between environment and genetic variation. In the last ten years there has been a dramatic increase in the number of studies seeking to map genes for quantitative traits. This has prompted an explosion of new work on statistical methods for human quantitative trait locus (QTL) mapping. However, little of that work has dealt with selected samples, which are more common than population samples in human studies. This dissertation focuses on sibling pairs and considers the most common types of selected sampling. I surveyed most QTL mapping methods in the literature to evaluate which are appropriate for selected samples, and also developed new statistics for selected samples. Using simulation and analytical approaches, I identified the most powerful statistics for each type of sampling considered. I then compared various sampling designs using the best statistic for each and gave guidelines for choosing appropriate and powerful designs under different scenarios

D-Scholarship@Pitt

Genome-wide association analysis identifies common variants influencing infant brain volumes

Author: Ahn
Crowley
Gilmore
Hibar
Jha
Knickmeyer
Li
Styner
Sullivan
Szatkiewicz
Thompson
Xia
Zhang
Zhu
Zou
Publication date: 01/01/2017

Genome-wide association studies (GWAS) of adolescents and adults are transforming our understanding of how genetic variants impact brain structure and psychiatric risk, but cannot address the reality that psychiatric disorders are unfolding developmental processes with origins in fetal life. To investigate how genetic variation impacts prenatal brain development, we conducted a GWAS of global brain tissue volumes in 561 infants. An intronic single-nucleotide polymorphism (SNP) in IGFBP7 (rs114518130) achieved genome-wide significance for gray matter volume (P = 4.15 × 10⁻¹⁰). An intronic SNP in WWOX (rs10514437) neared genome-wide significance for white matter volume (P = 1.56 × 10⁻⁸). Additional loci with small P-values included psychiatric GWAS associations and transcription factors expressed in developing brain. Genetic predisposition scores for schizophrenia and ASD, and the number of genes impacted by rare copy number variants (CNV burden) did not predict global brain tissue volumes. Integration of these results with large-scale neuroimaging GWAS in adolescents (PNC) and adults (ENIGMA2) suggests minimal overlap between common variants impacting brain volumes at different ages. Ultimately, by identifying genes contributing to adverse developmental phenotypes, it may be possible to adjust adverse trajectories, preventing or ameliorating psychiatric and developmental disorders

Carolina Digital Repository

A randomized approach to speed up the analysis of large-scale read-count data in the application of CNV detection

Author: Sun Wei
Szatkiewicz Jin
Wang Wei
Wang WeiBo
Publication venue: BioMed Central
Publication date: 01/03/2018

Background: The application of high-throughput sequencing in a broad range of quantitative genomic assays (e.g., DNA-seq, ChIP-seq) has created a high demand for the analysis of large-scale read-count data. Typically, the genome is divided into tiling windows and windowed read-count data are generated for the entire genome, from which genomic signals are detected (e.g., copy number changes in DNA-seq, enrichment peaks in ChIP-seq). For accurate analysis of read-count data, many state-of-the-art statistical methods use generalized linear models (GLM) coupled with the negative-binomial (NB) distribution, leveraging its ability for simultaneous bias correction and signal detection. However, although statistically powerful, the GLM+NB method has quadratic computational complexity and therefore suffers from slow running time when applied to large-scale windowed read-count data. In this study, we aimed to substantially speed up the GLM+NB method by using a randomized algorithm, and we demonstrate the utility of our approach in detecting copy number variants (CNVs) using a real example. Results: We propose an efficient estimator, the randomized GLM+NB coefficients estimator (RGE), for speeding up the GLM+NB method. RGE samples the read-count data and solves the estimation problem on a smaller scale. We first theoretically validated the consistency and the variance properties of RGE. We then applied RGE to GENSENG, a GLM+NB based method for detecting CNVs. We named the resulting method "R-GENSENG". Based on extensive evaluation using both simulated and empirical data, we concluded that R-GENSENG is ten times faster than the original GENSENG while maintaining GENSENG's accuracy in CNV detection. Conclusions: Our results suggest that the RGE strategy developed here could be applied to other GLM+NB based read-count analyses, e.g., ChIP-seq data analysis, to substantially improve their computational efficiency while preserving the analytic power
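
The subsampling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' RGE or R-GENSENG code. It assumes a plain Poisson log-linear model as a stand-in for GLM+NB, a single hypothetical GC-content covariate, uniform random sampling of windows, and simulated counts. All names (`poisson_draw`, `fit_poisson_glm`, `gc`, `counts`) are illustrative.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def fit_poisson_glm(xs, ys, iters=25):
    """Newton-Raphson fit of log(mu) = b0 + b1 * x (Poisson stand-in for NB)."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            mu = math.exp(b0 + b1 * x)
            g0 += y - mu            # score for the intercept
            g1 += (y - mu) * x      # score for the slope
            h00 += mu               # Fisher information entries
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

rng = random.Random(0)
n_windows = 8000

# Simulated windows: hypothetical GC covariate and counts with true (b0, b1) = (3.0, 1.0).
gc = [rng.uniform(0.3, 0.7) for _ in range(n_windows)]
counts = [poisson_draw(rng, math.exp(3.0 + 1.0 * g)) for g in gc]

# Fit on every window, then on a 10% uniform random subsample of windows.
full = fit_poisson_glm(gc, counts)
idx = rng.sample(range(n_windows), 800)
sub = fit_poisson_glm([gc[i] for i in idx], [counts[i] for i in idx])

print("full-data coefficients:", full)
print("subsample coefficients:", sub)
```

In this toy setting the subsample fit touches one tenth of the windows per iteration yet lands close to the full-data coefficients, which is the trade-off the abstract describes. The sampling scheme and model used by the actual estimator may differ.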

Directory of Open Access Journals

Carolina Digital Repository

eScholarship - University of California

Mouse Phenome Database

Author: Bogue
C. J. Bult
Cervino
Kikkawa
M. A. Bogue
Mural
S. C. Grubb
Szatkiewicz
T. P. Maddatu
Wade
Publication venue: Oxford University Press
Publication date: 01/01/2009

The Mouse Phenome Database (MPD; http://www.jax.org/phenome) is an open source, web-based repository of phenotypic and genotypic data on commonly used and genetically diverse inbred strains of mice and their derivatives. MPD is also a facility for query, analysis and in silico hypothesis testing. Currently MPD contains about 1400 phenotypic measurements contributed by research teams worldwide, including phenotypes relevant to human health such as cancer susceptibility, aging, obesity, susceptibility to infectious diseases, atherosclerosis, blood disorders and neurosensory disorders. Electronic access to centralized strain data enables investigators to select optimal strains for many systems-based research applications, including physiological studies, drug and toxicology testing, modeling disease processes and complex trait analysis. The ability to select strains for specific research applications by accessing existing phenotype data can bypass the need to (re)characterize strains, precluding major investments of time and resources. This functionality, in turn, accelerates research and leverages existing community resources. Since our last NAR reporting in 2007, MPD has added more community-contributed data covering more phenotypic domains and implemented several new tools and features, including a new interactive Tool Demo available through the MPD homepage (quick link: http://phenome.jax.org/phenome/trytools)

CiteSeerX

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

PubMed Central

CGDSNPdb: a database resource for error-checked and imputed mouse SNPs

Author: Churchill Gary A.
de Villena Fernando Pardo-Manuel
Ding Yueming
Graber Joel H.
Hutchins Lucie N.
Smith Randy Von
Szatkiewicz Jin P.
Yang Hyuna
Publication venue: Oxford University Press
Publication date: 01/01/2010

The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the ‘imputed genotype resource’ in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600 000 SNPs and over 900 000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login

The Jackson Laboratory: The Mouseion at the JAXlibrary

PubMed Central

Carolina Digital Repository

Allele-specific copy-number discovery from whole-genome and whole-exome sequencing

Author: Crowley James J.
Sun Wei
Szatkiewicz Jin P.
Wang Wei
Wang Weibo
Publication date: 01/01/2014

Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol is freely available at https://sourceforge.net/projects/asgenseng/

CiteSeerX

PubMed Central

Carolina Digital Repository

eScholarship - University of California

The Recombinational Anatomy of a Mouse Chromosome

Author: Broman Karl W.
Graber Joel H.
Leahy Nicole
Ng Siemon H. S.
Paigen Kenneth
Parvanov Emil D.
Petkov Petko M.
Sawyer Kathryn
Szatkiewicz Jin P.
Publication venue: Public Library of Science
Publication date: 01/01/2008

Among mammals, genetic recombination occurs at highly delimited sites known as recombination hotspots. They are typically 1–2 kb long and vary by 1,000-fold or more in recombination activity. Although much is known about the molecular details of the recombination process itself, the factors determining the location and relative activity of hotspots are poorly understood. To further our understanding, we have collected and mapped the locations of 5,472 crossover events along mouse Chromosome 1 arising in 6,028 meioses of male and female reciprocal F1 hybrids of C57BL/6J and CAST/EiJ mice. Crossovers were mapped to a minimum resolution of 225 kb, and those in the telomere-proximal 24.7 Mb were further mapped to resolve individual hotspots. Recombination rates were evolutionarily conserved on a regional scale, but not at the local level. There was a clear negative-exponential relationship between the relative activity and abundance of hotspot activity classes, such that a small number of the most active hotspots account for the majority of recombination. Females had 1.2× higher overall recombination than males did, although the sex ratio showed considerable regional variation. Locally, entirely sex-specific hotspots were rare. The initiation of recombination at the most active hotspot was regulated independently on the two parental chromatids, and analysis of reciprocal crosses indicated that parental imprinting has subtle effects on recombination rates. It appears that the regulation of mammalian recombination is a complex, dynamic process involving multiple factors reflecting species, sex, individual variation within species, and the properties of individual hotspots
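
As a rough worked example, the crossover and meiosis counts in this abstract imply an average genetic map length for Chromosome 1. The sketch below assumes that each scored meiosis contributes a single gamete, so that the mean number of crossovers per meiosis equals the map length in Morgans. That assumption and the variable names are illustrative and are not taken from the paper.

```python
crossovers = 5472   # crossover events reported in the abstract
meioses = 6028      # meioses scored, per the abstract

# Assumption: one gamete observed per meiosis, so the mean crossover
# count per meiosis is the sex-averaged map length in Morgans.
crossovers_per_meiosis = crossovers / meioses
map_length_cm = 100.0 * crossovers_per_meiosis  # 1 Morgan = 100 cM

print(f"{crossovers_per_meiosis:.3f} crossovers per meiosis, about {map_length_cm:.1f} cM")
```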

Public Library of Science (PLOS)

The Jackson Laboratory: The Mouseion at the JAXlibrary

Directory of Open Access Journals

PubMed Central

It's not the plan it's the relationship: a pilot study of outcomes of person centred planning

Author: Campain Robert
Loughan Annie
Reis Peter
Szatkiewicz Sonia
Wilson Erin
Publication venue: ASSID
Publication date: 01/01/2009

Deakin Research Online

Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation

Author: Sullivan Patrick F.
Sun Wei
Szatkiewicz Jin P.
Wang Weibo
Wang Wei
Publication date: 01/01/2013

Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g., read-depth, read-pair, split-read) need to be considered, and that sophisticated methods are needed for more accurate CNV detection. We observe that various sources of experimental bias in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth–based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth–based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available
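
The combination of a hidden Markov model with negative binomial emissions can be sketched as follows. This is a toy illustration and not the GENSENG implementation. It assumes three copy-number states, a fixed dispersion parameter `size`, a symmetric transition matrix, and a per-window expected diploid depth (`diploid_depth`) that stands in for whatever bias-adjusted expectation a regression on confounders would supply. All function and parameter names are hypothetical.

```python
import math

def nb_logpmf(y, mu, size):
    """Log-probability of count y under a negative binomial with mean mu
    and dispersion parameter size (variance = mu + mu**2 / size)."""
    return (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
            + size * math.log(size / (size + mu))
            + y * math.log(mu / (size + mu)))

def viterbi_copy_number(counts, diploid_depth, states=(1, 2, 3),
                        stay=0.9, size=100.0):
    """Most likely copy-number path; copy 0 is omitted to keep mu > 0."""
    n_states = len(states)
    log_stay = math.log(stay)
    log_move = math.log((1.0 - stay) / (n_states - 1))

    def emit(t, s):
        # Expected depth scales with copy number relative to the diploid level.
        mu = diploid_depth[t] * states[s] / 2.0
        return nb_logpmf(counts[t], mu, size)

    def trans(p, s):
        return log_stay if p == s else log_move

    score = [math.log(1.0 / n_states) + emit(0, s) for s in range(n_states)]
    back = []
    for t in range(1, len(counts)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda p: score[p] + trans(p, s))
            new_score.append(score[best_prev] + trans(best_prev, s) + emit(t, s))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [states[s] for s in path]

# Toy read counts per window with an apparent single-copy deletion in the middle.
counts = [41, 39, 42, 38, 19, 21, 20, 22, 18, 40, 38, 43]
diploid_depth = [40.0] * len(counts)   # hypothetical bias-adjusted expectation
calls = viterbi_copy_number(counts, diploid_depth)
print(calls)
```

Passing a window-specific `diploid_depth` is one simple way to let bias correction and segmentation interact in a single model. How the published method couples the two is not specified in the abstract.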

CiteSeerX

Carolina Digital Repository

A New Method for Detecting Associations with Rare Copy-Number Variants

Author: Magnusson Patrik K. E.
Sullivan Patrick F.
Szatkiewicz Jin P.
Tzeng Jung-Ying
Publication date: 01/01/2015

Copy number variants (CNVs) play an important role in the etiology of many diseases such as cancers and psychiatric disorders. Because individual CNVs are rare or have modest marginal effect sizes, collapsing rare CNVs and evaluating them jointly is a key approach to assessing their collective effect on disease risk. While a plethora of powerful collapsing methods are available for sequence variants (e.g., SNPs) in association analysis, these methods cannot be directly applied to rare CNVs because of CNV-specific challenges, i.e., the multi-faceted nature of CNV polymorphisms (e.g., CNVs vary in size, type, dosage, and details of gene disruption), and etiological heterogeneity (e.g., heterogeneous effects of duplications and deletions that occur within a locus or in different loci). Existing CNV collapsing analysis methods (a.k.a. the burden test) tend to have suboptimal performance because they often ignore heterogeneity and evaluate only the marginal effects of a CNV feature. We introduce CCRET, a random effects test for collapsing rare CNVs when searching for disease associations. CCRET is applicable to variants measured on a multi-categorical scale, collectively models the effects of multiple CNV features, and is robust to etiological heterogeneity. Multiple confounders can be simultaneously corrected. To evaluate the performance of CCRET, we conducted extensive simulations and analyzed large-scale schizophrenia datasets. We show that CCRET has powerful and robust performance under multiple types of etiological heterogeneity, and has performance comparable to or better than existing methods when there is no heterogeneity

Carolina Digital Repository