Block-based Bayesian epistasis association mapping with application to WTCCC type 1 diabetes data

Author: Liu Jun
Zhang Jing
Zhang Yu
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 26/08/2014

Interactions among multiple genes across the genome may contribute to the risks of many complex human diseases. Whole-genome single nucleotide polymorphism (SNP) data collected for many thousands of SNP markers from thousands of individuals under the case–control design promise to shed light on our understanding of such interactions. However, nearby SNPs are highly correlated due to linkage disequilibrium (LD) and the number of possible interactions is too large for exhaustive evaluation. We propose a novel Bayesian method for simultaneously partitioning SNPs into LD-blocks and selecting SNPs within blocks that are associated with the disease, either individually or interactively with other SNPs. When applied to homogeneous population data, the method gives posterior probabilities for LD-block boundaries, which not only result in accurate block partitions of SNPs, but also provide measures of partition uncertainty. When applied to case–control data for association mapping, the method implicitly filters out SNP associations created merely by LD with disease loci within the same blocks. A simulation study showed that this approach is more powerful in detecting multi-locus associations than other methods we tested, including one of ours. When applied to the WTCCC type 1 diabetes data, the method identified many previously known T1D associated genes, including PTPN22, CTLA4, MHC, and IL2RA. The method also revealed some interesting two-way associations that are undetected by single SNP methods. Most of the significant associations are located within the MHC region. Our analysis showed that the MHC SNPs form long-distance joint associations over several known recombination hotspots. By controlling the haplotypes of the MHC class II region, we identified additional associations in both MHC class I (HLA-A, HLA-B) and class III regions (BAT1). We also observed significant interactions between the genes PRSS16 and ZNF184 in the extended MHC region and the MHC class II genes.
The proposed method can be broadly applied to the classification problem with correlated discrete covariates.

Harvard University - DASH

Epistatic Module Detection for Case-Control Studies: A Bayesian Model with a Gibbs Sampling Strategy

Author: A DeWan
AI Su
AJ Lotery
AL Dixon
BHF Weber
C-T Tsai
CE Pearson
D Botstein
DE Weeks
DW Schultz
EM Stone
G Gambano
G Jun
GM Clinton
H Ishwaran
H Jason
HC Fung
HJ Cordell
J Hoh
J Marchini
J Millstein
J Schick
J Simon-Sanchez
J Tuo
K Ronald
L Fu
L Kruglyak
L Tiret
LHY Marmorstein
LR Cardon
M Ritchie
M Weigell-Weber
M Wollenhaupt
MP Martin
MR Nelson
N Chatterjee
N Risch
NE Morton
Nicholas J. Schork
NJ Risch
NV Lee
PJ Green
R Culverhouse
R Neal
RA Draviam
RJ Klein
Rui Jiang
S Grau
S Iyengar
S Jain
SM Williams
T Niu
V Laura
VM Dufault
Wanwan Tang
WY Tsang
Xuebing Wu
Y Zhang
Yanda Li
YM Cho
ZL Yang
Publication venue: Public Library of Science
Publication date: 01/05/2009

The detection of epistatic interactive effects of multiple genetic variants on the susceptibility of human complex diseases is a great challenge in genome-wide association studies (GWAS). Although methods have been proposed to identify such interactions, the lack of an explicit definition of epistatic effects, together with computational difficulties, makes the development of new methods indispensable. In this paper, we introduce epistatic modules to describe epistatic interactive effects of multiple loci on diseases. On the basis of this notion, we put forward a Bayesian marker partition model to explain observed case-control data, and we develop a Gibbs sampling strategy to facilitate the detection of epistatic modules. Comparisons of the proposed approach with three existing methods on seven simulated disease models demonstrate the superior performance of our approach. When applied to a genome-wide case-control data set for Age-related Macular Degeneration (AMD), the proposed approach successfully identifies two known susceptible loci and suggests that a combination of two other loci, one in the gene SGCD and the other in SCAPER, is associated with the disease. Further functional analysis supports the speculation that the interaction of these two genetic variants may be responsible for the susceptibility of AMD. When applied to a genome-wide case-control data set for Parkinson's disease, the proposed method identifies seven suspicious loci that may contribute independently to the disease.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Integrated phenotypes: understanding trait covariation in plants and animals

Author: Armbruster Scott
Bolstad Geir H.
Hansen Thomas F.
Pelabon Christophe
Publication venue: 'The Royal Society'
Publication date: 01/08/2014

Portsmouth University Research Portal (Pure)

Reliable confidence intervals in quantitative genetics: narrow-sense heritability

Author: Davison Anthony
Fabbro Thomas
Steinger Thomas
Publication venue
Publication date: 18/06/2018

Many quantitative genetic statistics are functions of variance components, for which a large number of replicates is needed for precise estimates and reliable measures of uncertainty, on which sound interpretation depends. Moreover, in large experiments the deaths of some individuals can occur, so methods for analysing such data need to be robust to missing values. We show how confidence intervals for narrow-sense heritability can be calculated in a nested full-sib/half-sib breeding design (males crossed with several females) in the presence of missing values. Simulations indicate that the method provides accurate results, and that estimator uncertainty is lowest for sampling designs with many males relative to the number of females per male, and with more females per male than progenies per female. Missing data generally had little influence on estimator accuracy, thus suggesting that the overall number of observations should be increased even if this results in unbalanced data. We also suggest the use of parametrically simulated data for prior investigation of the accuracy of planned experiments. Together with the proposed confidence intervals, an informed decision on the optimal sampling design is possible, which allows efficient allocation of resources.
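
For orientation, the textbook relation between variance components and narrow-sense heritability in a nested half-sib/full-sib design can be written as below. This is a standard sketch, not necessarily the estimator used in the paper. The symbols are assumptions introduced here: \sigma^2_s is the between-male (sire) component, \sigma^2_d the between-female-within-male (dam) component, \sigma^2_e the within-family residual, \sigma^2_A the additive genetic variance, and \sigma^2_P the phenotypic variance. The relation assumes random mating and no epistasis.

```latex
\sigma^2_P = \sigma^2_s + \sigma^2_d + \sigma^2_e, \qquad
\sigma^2_s = \tfrac{1}{4}\,\sigma^2_A, \qquad
h^2 = \frac{\sigma^2_A}{\sigma^2_P}
    = \frac{4\,\sigma^2_s}{\sigma^2_s + \sigma^2_d + \sigma^2_e}
```

Because h^2 is a ratio of estimated components, its sampling distribution is generally not symmetric, which is one reason interval construction is less straightforward than for a single variance component.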

RERO DOC Digital Library

Temporal and genomic analysis of additive genetic variance in breeding programmes

Author: De Castro Lara Leticia
De Paula Oliveira Thiago
Gaynor Chris
Gorjanc Gregor
Pocrnic Ivan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 15/12/2021

Genetic variance is a central parameter in quantitative genetics and breeding. Assessing changes in genetic variance over time as well as across the genome is therefore of high interest. Here, we extend a previously proposed framework for temporal analysis of genetic variance using the pedigree-based model, to a new framework for temporal and genomic analysis of genetic variance using marker-based models. To this end, we describe the theory of partitioning genetic variance into genic variance and within-chromosome and between-chromosome linkage-disequilibrium, and how to estimate these variance components from a marker-based model fitted to observed phenotype and marker data. The new framework involves three steps: (i) fitting a marker-based model to data, (ii) sampling realisations of marker effects from the fitted model and for each sample calculating realisations of genetic values and (iii) calculating the variance of sampled genetic values by time and genome partitions. Analysing time partitions indicates breeding programme sustainability, while analysing genome partitions indicates contributions from chromosomes and chromosome pairs and linkage-disequilibrium. We demonstrate the framework with a simulated breeding programme involving a complex trait. Results show good concordance between simulated and estimated variances, provided that the fitted model captures the genetic complexity of a trait. We observe a reduction of genetic variance due to selection and drift changing allele frequencies, and due to selection inducing negative linkage-disequilibrium.
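
Steps (ii) and (iii) of the abstract above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, genotypes are assumed to be coded 0/1/2, the marker-effect samples are assumed to come from some already fitted model (step (i) is not shown), and the linkage-disequilibrium term is taken as total variance minus genic variance without the within- and between-chromosome split.

```python
from statistics import mean, pvariance


def genetic_values(genotypes, effects):
    """Genetic value of each individual: sum of genotype code times marker effect."""
    return [sum(x * a for x, a in zip(row, effects)) for row in genotypes]


def partition_variance(genotypes, effects):
    """Return (total, genic, ld) variance for one sample of marker effects.

    total: variance of genetic values across individuals
    genic: sum over markers of effect^2 * variance of that marker's genotypes
    ld:    total - genic, i.e. the covariance terms between markers
    """
    total = pvariance(genetic_values(genotypes, effects))
    columns = list(zip(*genotypes))  # one tuple of genotype codes per marker
    genic = sum(a * a * pvariance(col) for a, col in zip(effects, columns))
    return total, genic, total - genic


def summarise(genotypes, effect_samples):
    """Average each variance component over the sampled marker effects."""
    parts = [partition_variance(genotypes, e) for e in effect_samples]
    return tuple(mean(p[i] for p in parts) for i in range(3))
```

To obtain variances by time partition, the same functions could be applied to subsets of individuals grouped by, for example, year of birth. A negative `ld` value corresponds to the negative linkage-disequilibrium contribution the abstract mentions.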

PubMed Central

Edinburgh Research Explorer

Repository of the University of Ljubljana

Phantom epistasis between unlinked loci

Author: Esko Tonu
Franke Lude
Gibson Greg
Goddard Michael E.
Hemani Gibran
Henders Anjali K.
Martin Nicholas G.
McRae Allan F.
Metspalu Andres
Montgomery Grant W.
Powell Joseph E.
Shakhbazov Konstantin
Visscher Peter M.
Wang Huanwei
Westra Harm-Jan
Yang Jian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2021

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Contrasting multi-site genotypic distributions among discordant quantitative phenotypes: the APOA1/C3/A4/A5 gene cluster and cardiovascular disease risk factors

Author: Boerwinkle Eric
Clark Andrew G.
Hixson James E.
Payseur Bret A.
Sing Charles F.
Publication venue: 'Wiley'
Publication date: 01/09/2006

Most tests of association between DNA sequence variation and quantitative phenotypes in samples of randomly chosen individuals rely on specification of genotypic strata followed by comparison of phenotypes across these strata. This strategy often succeeds when phenotypic differences are caused by one or two single nucleotide polymorphisms (SNPs) among the surveyed markers. However, when multiple-SNP haplotypes account for observed phenotypic variation, identification of the best partitioning requires examination of an inordinate number of SNP combinations. An alternative approach is to rank individuals by their phenotypic measures and ask whether attributes of the genotypic variation show a non-random distribution along this phenotypic ranking. One simple version of this strategy selects the top and bottom tails of the distribution, and then tests whether genotypes from these two samples are drawn from a single population. This framework does not require the recovery of phased haplotypes and allows contrasts between large numbers of sites at once. We use a method based on this approach to identify associations between plasma triglyceride level, a risk factor for cardiovascular disease, and multi-site genotypes located in the APOA1/C3/A4/A5 cluster of apolipoprotein genes in unrelated individuals (1,071 African-American females, 780 African-American males, 1,036 European-American females, and 930 European-American males) sampled from four US cities as part of the Coronary Artery Risk Development in Young Adults (CARDIA) study. Method performance is investigated using simulations that model genealogical variation and different genetic architectures. Results indicate that this multi-site test can identify genotype-phenotype associations with reasonable power, including those generated by some simple epistatic models. Genet. Epidemiol. 2006. © 2006 Wiley-Liss, Inc. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/55790/1/20163_ftp.pd
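
The tail-contrast idea in the abstract above (select the top and bottom phenotypic tails, then test whether their genotypes come from a single population) can be sketched as a simple permutation test. The abstract does not specify the test statistic, so everything below is an assumption made for illustration: the function names are hypothetical, genotypes are coded 0/1/2 per site, the statistic is the sum over sites of the absolute difference in mean genotype between the tails, and significance is assessed by shuffling tail membership.

```python
import random


def tail_indices(phenotypes, fraction):
    """Indices of the individuals in the bottom and top phenotypic tails."""
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5] so the tails do not overlap")
    order = sorted(range(len(phenotypes)), key=phenotypes.__getitem__)
    k = max(1, int(len(order) * fraction))
    return order[:k], order[-k:]


def site_difference(genotypes, group_a, group_b):
    """Sum over sites of |mean genotype in group_a - mean genotype in group_b|."""
    total = 0.0
    for j in range(len(genotypes[0])):
        mean_a = sum(genotypes[i][j] for i in group_a) / len(group_a)
        mean_b = sum(genotypes[i][j] for i in group_b) / len(group_b)
        total += abs(mean_a - mean_b)
    return total


def tail_contrast_p_value(phenotypes, genotypes, fraction=0.1, n_perm=1000, seed=0):
    """Return (observed statistic, permutation p-value) for the two tails."""
    rng = random.Random(seed)
    low, high = tail_indices(phenotypes, fraction)
    observed = site_difference(genotypes, low, high)
    pooled = low + high
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign tail membership at random
        perm_low, perm_high = pooled[:len(low)], pooled[len(low):]
        if site_difference(genotypes, perm_low, perm_high) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)
```

The statistic uses unphased genotype codes only, which mirrors the abstract's point that phased haplotypes are not required, and it pools evidence across all sites at once.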

Crossref

Deep Blue Documents at the University of Michigan

A hierarchical Bayesian model for inference of copy number variants and their association to gene expression

Author: Cassese Alberto
Falciani Francesco
Guindani Michele
Tadesse Mahlet G.
Vannucci Marina
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2014

A number of statistical models have been successfully developed for the analysis of high-throughput data from a single source, but few methods are available for integrating data from different sources. Here we focus on integrating gene expression levels with comparative genomic hybridization (CGH) array measurements collected on the same subjects. We specify a measurement error model that relates the gene expression levels to latent copy number states which, in turn, are related to the observed surrogate CGH measurements via a hidden Markov model. We employ selection priors that exploit the dependencies across adjacent copy number states and investigate MCMC stochastic search techniques for posterior inference. Our approach results in a unified modeling framework for simultaneously inferring copy number variants (CNV) and identifying their significant associations with mRNA transcripts abundance. We show performance on simulated data and illustrate an application to data from a genomic study on human cancer cell lines. Comment: Published at http://dx.doi.org/10.1214/13-AOAS705 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Maastricht University Research Portal

Florence Research

PubMed Central

eScholarship - University of California

DSpace at Rice University