15 research outputs found

PolyGEE: a generalized estimating equation approach to the efficient and robust estimation of polygenic effects in large-scale association studies

Author: Fier Heide Loehlein
Hecker Julian
Lange Christoph
Prokopenko Dmitry
Publication venue: 'Oxford University Press (OUP)'
Publication date: 25/07/2018

SUMMARY: To quantify polygenic effects, i.e. undetected genetic effects, in large-scale association studies, we propose a generalized estimating equation (GEE) based estimation framework. We develop a marginal model for single-variant association test statistics of complex diseases that generalizes existing approaches such as LD Score regression and that is applicable to population-based designs, to family-based designs, or to arbitrary combinations of both. We extend the standard GEE approach so that the parameters of the proposed marginal model can be estimated based on working-correlation/linkage-disequilibrium (LD) matrices from external reference panels. Our method achieves substantial efficiency gains over standard approaches while remaining robust against misspecification of the LD structure, i.e. the LD structure of the reference panel can differ substantially from the true LD structure in the study population. In simulation studies and in applications to population-based and family-based studies, we illustrate the features of the proposed GEE framework. Our results suggest that our approach can be up to 100% more efficient than existing methodology.
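For orientation, the summary names LD Score regression as one of the approaches the marginal model generalizes. The sketch below illustrates only that baseline idea, not PolyGEE itself: it regresses single-variant chi-square statistics on LD scores and rescales the slope into a heritability-type estimate. It assumes the standard LD Score regression relationship (expected chi-square = 1 + confounding term + N * h2 * LD score / M) and uses plain unweighted least squares, which is a simplification. All function and parameter names are hypothetical.

```python
def ols_fit(x, y):
    """Unweighted simple linear regression; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ld_score_regression(ld_scores, chi2, n_samples, n_variants):
    """Estimate (h2, intercept) from per-variant LD scores and chi-square statistics.

    Assumption: E[chi2_j] = intercept + (n_samples * h2 / n_variants) * ld_score_j,
    so h2 is recovered by rescaling the fitted slope.
    """
    slope, intercept = ols_fit(ld_scores, chi2)
    h2 = slope * n_variants / n_samples
    return h2, intercept


# Toy, noise-free data generated with h2 = 0.5, N = 1000, M = 100 (hypothetical values).
ld_scores = [1.0, 2.0, 3.0, 4.0]
chi2 = [1.0 + 1000 * 0.5 / 100 * l for l in ld_scores]
h2_hat, intercept_hat = ld_score_regression(ld_scores, chi2, n_samples=1000, n_variants=100)
```

On real summary statistics the fit would need weighting and a correlation-aware variance estimate, which is the kind of issue a GEE-based framework is aimed at.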

Harvard University - DASH

Reporting Correct p

Author: Anna Maaser
Christoph Lange
Dmitry Prokopenko
Heide Loehlein Fier
Julian Hecker
Publication venue: 'Cambridge University Press (CUP)'

Crossref

Using Network Methodology to Infer Population Substructure

Author: Hecker Julian
Lange Christoph
Loehlein Fier Heide
Nöthen Markus M.
Prokopenko Dmitry
Schmid Matthias
Silverman Edwin
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 13/07/2015

One of the main caveats of association studies is possible bias due to population stratification. Existing methods rely on model-based approaches like structure and ADMIXTURE or on principal component analysis like EIGENSTRAT. Here we provide a novel visualization technique and describe the problem of population substructure from a graph-theoretical point of view. We group the sequenced individuals into triads, which depict the relational structure, on the basis of a predefined pairwise similarity measure. We then merge the triads into a network and apply community detection algorithms to identify homogeneous subgroups or communities, which can then be incorporated as covariates into logistic regression. We apply our method to populations from different continents in the 1000 Genomes Project and evaluate the type 1 error based on the empirical p-values. The application to 1000 Genomes data suggests that the network approach provides a very fine resolution of the underlying ancestral population structure. In addition, we show in simulations that, in the presence of discrete population structures, our approach maintains the type 1 error more precisely than existing approaches.
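The pipeline described in this abstract (similarity measure, triads, merged network, communities) can be sketched roughly as follows. The abstract does not state how triads are formed or which community detection algorithm is used, so this sketch makes two loud assumptions: each individual forms a triad with its two most similar individuals, and connected components stand in for a real community detection algorithm. All names are hypothetical.

```python
def build_triads(sim):
    """sim[i][j] is a symmetric similarity between individuals i and j.

    Assumption: each individual forms one triad with its two most similar peers.
    """
    triads = []
    for i in sorted(sim):
        others = sorted((o for o in sim if o != i), key=lambda o: (-sim[i][o], o))
        if len(others) >= 2:
            triads.append((i, others[0], others[1]))
    return triads


def merge_triads(triads):
    """Merge triads into an undirected graph (adjacency sets).

    Assumption: each triad contributes all three of its edges.
    """
    graph = {}
    for i, j, k in triads:
        for a, b in ((i, j), (i, k), (j, k)):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def connected_components(graph):
    """Stand-in for community detection: groups of mutually reachable nodes."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


# Toy data: two hypothetical groups with high within-group similarity.
group = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
sim = {i: {j: (0.9 if group[i] == group[j] else 0.1) for j in group if j != i} for i in group}
communities = connected_components(merge_triads(build_triads(sim)))
```

The resulting community labels could then be encoded as indicator covariates in a logistic regression, as the abstract suggests. With real data, a modularity-based algorithm would likely be a better choice than connected components.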

Harvard University - DASH

Directory of Open Access Journals

FigShare

Utilizing the Jaccard index to reveal population stratification in sequencing data: a simulation study and an application to the 1000 Genomes Project

Author: Dina Christian
Fier Heide Loehlein
Hecker Julian
Lange Christoph
Nöthen Markus M.
Pagano Marcello
Prokopenko Dmitry
Silverman Edwin K.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/05/2016

MOTIVATION: Population stratification is one of the major sources of confounding in genetic association studies, potentially causing false-positive and false-negative results. Here, we present a novel approach for the identification of population substructure in high-density genotyping data/next generation sequencing data. The approach exploits the co-appearances of rare genetic variants in individuals. The method can be applied to all available genetic loci and is computationally fast. Using sequencing data from the 1000 Genomes Project, the features of the approach are illustrated and compared to existing methodology (i.e. EIGENSTRAT). We examine the effects of different cutoffs for the minor allele frequency on the performance of the approach. We find that our approach works particularly well for genetic loci with very small minor allele frequencies. The results suggest that the inclusion of rare-variant data/sequencing data in our approach provides a much higher resolution picture of population substructure than can be obtained with existing methodology. Furthermore, in simulation studies, we find scenarios where our method was able to control the type 1 error more precisely and showed higher power. AVAILABILITY AND IMPLEMENTATION: CONTACT: [email protected] SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
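As a rough illustration of the idea in this abstract, the sketch below computes a Jaccard index between individuals from the rare variants they carry: the number of shared rare variants divided by the number of rare variants carried by either individual. The data layout (allele counts of 0, 1 or 2 per variant), the handling of the minor allele frequency cutoff, and all function names are assumptions made for this example, not the authors' implementation.

```python
from itertools import combinations


def rare_variant_carriers(genotypes, max_maf):
    """Map each individual to the set of rare variants it carries.

    genotypes: dict of individual id -> list of allele counts (0, 1 or 2) per variant.
    A variant counts as rare if 0 < MAF <= max_maf (assumed cutoff rule).
    """
    ids = sorted(genotypes)
    n_variants = len(genotypes[ids[0]])
    rare = []
    for v in range(n_variants):
        freq = sum(genotypes[i][v] for i in ids) / (2 * len(ids))
        maf = min(freq, 1.0 - freq)
        if 0.0 < maf <= max_maf:
            rare.append(v)
    return {i: {v for v in rare if genotypes[i][v] > 0} for i in ids}


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b|, defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(carriers):
    """Jaccard index for every unordered pair of individuals."""
    return {
        (i, j): jaccard(carriers[i], carriers[j])
        for i, j in combinations(sorted(carriers), 2)
    }


# Toy data with hypothetical sample ids; the last variant is monomorphic and is dropped.
genotypes = {"s1": [1, 0, 1, 2], "s2": [1, 0, 0, 2], "s3": [0, 1, 0, 2]}
carriers = rare_variant_carriers(genotypes, max_maf=0.4)
similarities = pairwise_jaccard(carriers)
```

Varying `max_maf` in such a sketch mirrors the abstract's point that the choice of minor allele frequency cutoff affects how well substructure is resolved.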

Utilizing the Jaccard index to reveal population stratification in sequencing data: a simulation study and an application to the 1000 Genomes Project

Author: Baye
Christian Dina
Christoph Lange
Dmitry Prokopenko
Edwin K. Silverman
Heide Loehlein Fier
Jaccard
Julian Hecker
Lee
Marcello Pagano
Markus M. Nöthen
Publication venue: 'Oxford University Press (OUP)'

Crossref

5 European subpopulations.

Author: Christoph Lange (213285)
Dmitry Prokopenko (758790)
Edwin Silverman (758792)
Heide Loehlein Fier (758793)
Julian Hecker (758791)
Markus M. Nöthen (160934)
Matthias Schmid (51150)

The polygons around the nodes represent the detected communities. The node colors represent the actual labels.

FigShare

Description of datasets, used in the analysis.

Author: Christoph Lange (213285)
Dmitry Prokopenko (758790)
Edwin Silverman (758792)
Heide Loehlein Fier (758793)
Julian Hecker (758791)
Markus M. Nöthen (160934)
Matthias Schmid (51150)

Description of datasets, used in the analysis.

FigShare

3 American subpopulations.

Author: Christoph Lange (213285)
Dmitry Prokopenko (758790)
Edwin Silverman (758792)
Heide Loehlein Fier (758793)
Julian Hecker (758791)
Markus M. Nöthen (160934)
Matthias Schmid (51150)

The polygons around the nodes represent the detected communities. The node colors represent the actual labels.

FigShare

3 African subpopulations.

Author: Christoph Lange (213285)
Dmitry Prokopenko (758790)
Edwin Silverman (758792)
Heide Loehlein Fier (758793)
Julian Hecker (758791)
Markus M. Nöthen (160934)
Matthias Schmid (51150)

The polygons around the nodes represent the detected communities. The node colors represent the actual labels.

FigShare

Contingency table for American subpopulations, rows correspond to detected communities, columns to actual subpopulations.

Author: Christoph Lange (213285)
Dmitry Prokopenko (758790)
Edwin Silverman (758792)
Heide Loehlein Fier (758793)
Julian Hecker (758791)
Markus M. Nöthen (160934)
Matthias Schmid (51150)

PUR: Puerto Rican, CLM: Colombian, MXL: Mexican.

FigShare