13 research outputs found

    biMM : efficient estimation of genetic variances and covariances for cohorts with high-dimensional phenotype measurements

    Genetic research utilizes a decomposition of trait variances and covariances into genetic and environmental parts. Our software package biMM is a computationally efficient implementation of a bivariate linear mixed model for settings where hundreds of traits have been measured on partially overlapping sets of individuals. Peer reviewed.

    Next generation analytic tools for large scale genetic epidemiology studies of complex diseases

    Over the past several years, genome‐wide association studies (GWAS) have succeeded in identifying hundreds of genetic markers associated with common diseases. However, most of these markers confer relatively small increments of risk and explain only a small proportion of familial clustering. To identify obstacles to future progress in genetic epidemiology research and provide recommendations to NIH for overcoming these barriers, the National Cancer Institute sponsored a workshop entitled “Next Generation Analytic Tools for Large‐Scale Genetic Epidemiology Studies of Complex Diseases” on September 15–16, 2010. The goal of the workshop was to facilitate discussions on (1) statistical strategies and methods to efficiently identify genetic and environmental factors contributing to the risk of complex disease; and (2) how to develop, apply, and evaluate these strategies for the design, analysis, and interpretation of large‐scale complex disease association studies in order to guide NIH in setting the future agenda in this area of research. The workshop was organized as a series of short presentations covering scientific (gene‐gene and gene‐environment interaction, complex phenotypes, and rare variants and next generation sequencing) and methodological (simulation modeling and computational resources and data management) topic areas. Specific needs to advance the field were identified during each session and are summarized. Genet. Epidemiol. 36:22–35, 2012. © 2011 Wiley Periodicals, Inc. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/93578/1/gepi20652.pd

    Assessing the genetic overlap between BMI and cognitive function

    Obesity and low cognitive function are associated with multiple adverse health outcomes across the life course. They have a small phenotypic correlation (r=-0.11; high body mass index (BMI)-low cognitive function), but whether they have a shared genetic aetiology is unknown. We investigated the phenotypic and genetic correlations between the traits using data from 6815 unrelated, genotyped members of Generation Scotland, an ethnically homogeneous cohort from five sites across Scotland. Genetic correlations were estimated using the following: same-sample bivariate genome-wide complex trait analysis (GCTA)-GREML; independent samples bivariate GCTA-GREML using Generation Scotland for cognitive data and four other samples (n=20 806) for BMI; and bivariate LDSC analysis using the largest genome-wide association study (GWAS) summary data on cognitive function (n=48 462) and BMI (n=339 224) to date. The GWAS summary data were also used to create polygenic scores for the two traits, with within- and cross-trait prediction taking place in the independent Generation Scotland cohort. A large genetic correlation of -0.51 (s.e. 0.15) was observed using the same-sample GCTA-GREML approach compared with -0.10 (s.e. 0.08) from the independent-samples GCTA-GREML approach and -0.22 (s.e. 0.03) from the bivariate LDSC analysis. A genetic profile score using cognition-specific genetic variants accounts for 0.08% (P=0.020) of the variance in BMI and a genetic profile score using BMI-specific variants accounts for 0.42% (P=1.9 × 10⁻⁷) of the variance in cognitive function. Seven common genetic variants are significantly associated with both traits at

    REHH 2.0: a reimplementation of the R package REHH to detect positive selection from haplotype structure

    Identifying genomic regions with unusually high local haplotype homozygosity represents a powerful strategy to characterize candidate genes responding to natural or artificial positive selection. To that end, statistics measuring the extent of haplotype homozygosity within (e.g. EHH, iHS) and between (Rsb or XP-EHH) populations have been proposed in the literature. The REHH package for R was previously developed to facilitate genome-wide scans of selection, based on the analysis of long-range haplotypes. However, its performance was not sufficient to cope with the growing size of available data sets. Here, we propose a major upgrade of the REHH package, which includes an improved processing of the input files, a faster algorithm to enumerate haplotypes, as well as multithreading. As illustrated with the analysis of large human haplotype data sets, these improvements decrease the computation time by more than one order of magnitude. This new version of REHH will thus allow performing iHS-, Rsb- or XP-EHH-based scans on large data sets. The package REHH 2.0 is available from the CRAN repository (http://cran.r-project.org/web/packages/rehh/index.html) together with help files and a detailed manual.

    Imputation of low-frequency variants using the HapMap3 benefits from large, diverse reference sets

    Imputation allows the inference of unobserved genotypes in low-density data sets, and is often used to test for disease association at variants that are poorly captured by standard genotyping chips (such as low-frequency variants). Although much effort has gone into developing the best imputation algorithms, less is known about the effects of reference set choice on imputation accuracy. We assess the improvements afforded by increases in reference size and diversity, specifically comparing the HapMap2 data set, which has been used to date for imputation, and the new HapMap3 data set, which contains more samples from a more diverse range of populations. We find that, for imputation into Western European samples, the HapMap3 reference provides more accurate imputation with better-calibrated quality scores than HapMap2, and that increasing the number of HapMap3 populations included in the reference set grants further improvements. Improvements are most pronounced for low-frequency variants (frequency <5%), with the largest and most diverse reference sets bringing the accuracy of imputation of low-frequency variants close to that of common ones. For low-frequency variants, reference set diversity can improve the accuracy of imputation, independent of reference sample size. HapMap3 reference sets provide significant increases in imputation accuracy relative to HapMap2, and are of particular use if highly accurate imputation of low-frequency variants is required. Our results suggest that, although the sample sizes from the 1000 Genomes Pilot Project will not allow reliable imputation of low-frequency variants, the larger sample sizes of the main project will allow

    Overview of the human genome

    The human genome is composed of deoxyribonucleic acid (DNA) organized into 23 pairs of chromosomes in the nucleus of human cells, as well as the small DNA found inside individual mitochondria. Complete sequencing of the 3 billion base pairs that make up the human genome has made available a deluge of information that has enhanced our understanding of evolution, physiology, causality of disease, and association between heredity and environment in humans. This chapter discusses discoveries in genetics that spawned the field of human genomics. It further highlights the role of the human genome in disease susceptibility, as well as its prospects for the future of healthcare.