Rapid and robust association mapping of expression quantitative trait loci

Author: Aulchenko Yurii S
de Koning Dirk-Jan
Haley Chris S
Lam Alex C
Schouten Michael
Publication date: 01/01/2007

We applied a simple and efficient two-step method to analyze a family-based association study of gene expression quantitative trait loci (eQTL) in a mixed model framework. The two-step method produces results very similar to those of the full mixed model while being significantly faster. Using the Genetic Analysis Workshop 15 (GAW15) Problem 1 data, we demonstrated the value of data filtering for reducing the number of tests and controlling the number of false positives. Specifically, we showed that removing non-expressed genes by filtering on expression variability reduced the number of tests by nearly 50%. Furthermore, we demonstrated that filtering on genotype counts substantially reduced spurious detection. Finally, we restricted our analysis to markers and transcripts that were closely located. We found five times more signals in close proximity (cis-) to transcripts than in our genome-wide analysis. Our results suggest that careful pre-filtering and partitioning of data are crucial for controlling false positives and allowing detection of genuine effects in genetic analysis of gene expression.
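
The variability filter described in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `filter_by_variability`, the variance threshold, and the toy expression values are all assumptions made for illustration, and the abstract does not state which variability measure or cut-off was used.

```python
from statistics import pvariance


def filter_by_variability(expression, min_variance):
    """Keep transcripts whose expression variance across samples is at
    least min_variance (hypothetical threshold, not from the paper)."""
    return {
        transcript: values
        for transcript, values in expression.items()
        if pvariance(values) >= min_variance
    }


# Toy data (made up): transcript name -> expression values across samples.
expression = {
    "tx_flat": [5.0, 5.0, 5.1, 5.0],      # almost no variation
    "tx_variable": [4.0, 6.5, 5.2, 7.9],  # clearly variable
}

kept = filter_by_variability(expression, min_variance=0.5)
print(sorted(kept))
```

Dropping low-variance transcripts before testing reduces the number of marker-transcript tests, which is the effect the abstract reports.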

Edinburgh Research Explorer

The Empirical Power of Rare Variant Association Methods: Results from Sanger Sequencing in 1,998 Individuals

Author: Aulchenko Yurii S.
Dastani Zari
Greenwood Celia M. T.
Ladouceur Martin
Richards J. Brent
Publication venue: Public Library of Science
Publication date: 01/02/2012

The role of rare genetic variation in the etiology of complex disease remains unclear. However, the development of next-generation sequencing technologies offers the experimental opportunity to address this question. Several novel statistical methodologies have been recently proposed to assess the contribution of rare variation to complex disease etiology. Nevertheless, no empirical estimates comparing their relative power are available. We therefore assessed the parameters that influence their statistical power in 1,998 individuals Sanger-sequenced at seven genes by modeling different distributions of effect, proportions of causal variants, and direction of the associations (deleterious, protective, or both) in simulated continuous trait and case/control phenotypes. Our results demonstrate that the power of recently proposed statistical methods depends strongly on the underlying hypotheses concerning the relationship of phenotypes with each of these three factors. No method demonstrates consistently acceptable power despite this large sample size, and the performance of each method depends upon the underlying assumption of the relationship between rare variants and complex traits. Sensitivity analyses are therefore recommended to compare the stability of the results arising from different methods, and promising results should be replicated using the same method in an independent sample. These findings provide guidance in the analysis and interpretation of the role of rare base-pair variation in the etiology of complex traits and diseases.

Public Library of Science (PLOS)

Edinburgh Research Explorer

eScholarship@McGill

FigShare

PheLiGe: an interactive database of billions of human genotype-phenotype associations

Author: Aulchenko Yurii S
Gorev Denis D
Joshi Peter K
Karssen Lennart C
Pakhomov Eugene D
Shashkova Tatiana I
Publication venue: Oxford University Press (OUP)
Publication date: 27/11/2020

Rapid and robust association mapping of expression quantitative trait loci

Author: Alex C Lam
Chris S Haley
Dirk-Jan de Koning
E Boerwinkle
GR Abecasis
JD Storey
M Morley
Michael Schouten
T Pastinen
YS Aulchenko
Yurii S Aulchenko
Publication venue: Springer Science and Business Media LLC

IST Austria: PubRep (Institute of Science and Technology)

The limits of normal approximation for adult height

Author: Aulchenko Yurii S.
Axenovich Tatiana I.
Bazykin Georgii A.
Kondrashov Fyodor
Kuznetsov Ivan A.
Shashkova Tatiana I.
Slavskii Sergei A.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2021

Adult height inspired the first biometrical and quantitative genetic studies and is a test-case trait for understanding heritability. Studies of height led to the formulation of the classical polygenic model, which has a profound influence on the way we view and analyse complex traits. An essential part of the classical model is the assumption of additivity of effects and normality of the distribution of the residuals. However, it may be expected that the normal approximation will become insufficient in bigger studies. Here, we demonstrate that when the height of hundreds of thousands of individuals is analysed, the model complexity needs to be increased to include non-additive interactions between sex, environment and genes. Alternatively, the use of a log-normal approximation allowed us to still use the additive effects model. These findings are important for future genetic and methodologic studies that make use of adult height as an exemplar trait.

PredictABEL: an R package for the assessment of risk prediction models

Author: A. Cecile J. W. Janssens
AC Janssens
Cornelia M. van Duijn
DW Hosmer
EW Steyerberg
JA Hanley
JM Seddon
K McGeechan
MA Hlatky
MJ Khoury
MJ Pencina
NJ Nagelkerke
NR Cook
Suman Kundu
YS Aulchenko
Yurii S. Aulchenko
Publication venue: Springer Netherlands
Publication date: 01/01/2011

The rapid identification of genetic markers for multifactorial diseases from genome-wide association studies is fuelling interest in investigating the predictive ability and health care utility of genetic risk models. Various measures are available for the assessment of risk prediction models, each addressing a different aspect of performance and utility. We developed PredictABEL, a package in R that provides the descriptive tables, measures and figures used in the analysis of risk prediction studies, such as measures of model fit, predictive ability and clinical utility, as well as risk distributions, the calibration plot and the receiver operating characteristic plot. Tables and figures are saved as separate files in a user-specified format, including publication-quality EPS and TIFF formats. All figures are available in a ready-made layout, but they can be customized to the preferences of the user. The package has been developed for the analysis of genetic risk prediction studies, but can also be used for studies that only include non-genetic risk factors. PredictABEL is freely available at the websites of GenABEL (http://www.genabel.org) and CRAN (http://cran.r-project.org/).
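
One of the predictive-ability measures mentioned in this abstract, the area under the receiver operating characteristic curve (AUC), can be sketched as follows. This is a generic pairwise computation written in Python for illustration; it is not PredictABEL's R interface, and the function name `auc` and the toy risks and outcomes are assumptions.

```python
def auc(risks, outcomes):
    """Area under the ROC curve: the probability that a randomly chosen
    case has a higher predicted risk than a randomly chosen non-case,
    with ties counted as one half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                wins += 1.0
            elif case_risk == control_risk:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data (made up): predicted risks and observed outcomes (1 = case).
risks = [0.1, 0.4, 0.35, 0.8]
outcomes = [0, 0, 1, 1]
print(auc(risks, outcomes))  # 0.75
```

The pairwise loop is quadratic in sample size; it is chosen here for readability rather than speed.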

Springer - Publisher Connector

EUR Research Repository

ParallABEL: an R library for generalized parallelization of genome-wide association studies

Author: F Dudbridge
G Vera
H Mishima
J Hill
K Misawa
L Ma
LA Hindorff
NM Laird
Pichaya Tandayya
R Ihaka
RM Plenge
Surakameth Mahasirimongkol
TA Pearson
Unitsa Sangket
Wasun Chantratita
YS Aulchenko
Yurii S Aulchenko
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and taking hours to months, will benefit from parallel computation. Acquiring the programming skills needed to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files is arduous. Results: Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of SNPs/traits. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of individuals. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), which includes 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was linearly reduced from approximately eight hours to one hour when ParallABEL employed eight processors. Conclusions: Executing genome-wide association analysis using the ParallABEL library on a computer cluster is an effective way to boost performance and simplify the parallelization of GWA studies. ParallABEL is a user-friendly parallelization of GenABEL.
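
The partition, distribute and merge pattern described for the first group (per-SNP statistics) can be sketched as below. This is only an illustration of the pattern under stated assumptions: ParallABEL itself is an R library built on Rmpi, whereas this sketch uses a Python thread pool as a stand-in, and the names `split_into_chunks`, `allele_frequencies` and `parallel_per_snp` as well as the toy genotypes are hypothetical. Threads in Python will not reproduce the reported speed-up for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous parts whose sizes differ by
    at most one."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def allele_frequencies(snp_chunk):
    """Per-SNP statistic: frequency of the counted allele, computed from
    genotypes coded 0/1/2 (copies of the counted allele)."""
    return [(name, sum(g) / (2 * len(g))) for name, g in snp_chunk]


def parallel_per_snp(snps, n_workers):
    """Partition SNPs, compute the statistic per chunk, merge in order."""
    chunks = split_into_chunks(snps, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(allele_frequencies, chunks))
    # pool.map preserves chunk order, so a flat concatenation restores
    # the original SNP order.
    return [row for part in partial_results for row in part]


# Toy data (made up): SNP name -> genotypes of four individuals.
snps = [
    ("snp1", [0, 1, 2, 1]),
    ("snp2", [0, 0, 1, 0]),
    ("snp3", [2, 2, 2, 2]),
]
result = parallel_per_snp(snps, n_workers=2)
print(result)
```

Because each SNP is processed independently, chunks need no communication with each other, which is what makes near-linear speed-up plausible for this group of analyses.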

Springer - Publisher Connector

Public Library of Science (PLOS)

Association between Type 2 Diabetes Loci and Measures of Fatness

Author: AL Gloyn
Ben A. Oostra
Cornelia M. van Duijn
D Altshuler
E Zeggini
JE Cecil
LJ Scott
LM Pardo
M Ghoussaini
M. Carola Zillikens
Michael Nicholas Weedon
Peter Henneman
Pieter J. Snijders
PJ Campbell
R Sladek
S Cauchi
S Wild
SA Bacanu
SF Grant
Slavica Pecioska
SS Deeb
TM Frayling
YS Aulchenko
Yurii S. Aulchenko
Publication venue: Public Library of Science
Publication date: 01/01/2010

Background: Type 2 diabetes (T2D) is a metabolic disorder characterized by disturbances of carbohydrate, fat and protein metabolism and insulin resistance. The majority of T2D patients are obese and obesity by itself may be a cause of insulin resistance. Our aim was to evaluate whether the recently identified T2D risk alleles are associated with human measures of fatness as characterized with Dual Energy X-ray Absorptiometry (DEXA). Methodology/Principal Findings: Genotypes and phenotypes of approximately 3,000 participants from the cross-sectional ERF study were analyzed. Nine single nucleotide polymorphisms (SNPs) in CDKN2AB, CDKAL1, FTO, HHEX, IGF2BP2, KCNJ11, PPARG, SLC30A8 and TCF7L2 were genotyped. We used linear regression to study the association of individual SNPs and the combined allelic risk score with body mass index (BMI), fat mass index (FMI), fat percentage (FAT), waist circumference (WC) and waist to hip ratio (WHR). Significant association was observed between rs8050136 (FTO) and BMI (p = 0.003), FMI (p = 0.007) and WC (p = 0.03); fat percentage was borderline significant (p = 0.053). No other SNPs, alone or combined in a risk score, demonstrated significant association with the measures of fatness. Conclusions/Significance: Of the recently identified T2D risk variants, only the risk variant of the FTO gene (rs8050136) showed statistically significant association with BMI, FMI, and WC.
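
The regression of a fatness measure on a combined allelic risk score can be sketched as follows. The abstract does not say how the score was constructed or which covariates were used, so this sketch assumes an unweighted count of risk alleles and a simple one-predictor least-squares fit; the function names and the toy numbers are hypothetical and do not come from the study.

```python
def allelic_risk_score(risk_allele_counts):
    """Unweighted risk score: total number of risk alleles carried
    across the genotyped SNPs (assumed scoring scheme)."""
    return sum(risk_allele_counts)


def simple_linear_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data (made up): risk-allele counts at three SNPs per participant.
genotypes = [
    [0, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
    [2, 1, 1],
]
bmi = [24.0, 24.5, 25.0, 25.5]

scores = [allelic_risk_score(g) for g in genotypes]
slope, intercept = simple_linear_regression(scores, bmi)
print(scores, slope, intercept)
```

In this toy example the slope is the change in BMI per additional risk allele; a real analysis would also need standard errors, p-values and adjustment for covariates.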

CiteSeerX

EUR Research Repository