The Role of Family-Based Designs in Genome-Wide Association Studies
Genome-Wide Association Studies (GWAS) offer an exciting and promising new
research avenue for finding genes for complex diseases. Traditional
case-control and cohort studies offer many advantages for such designs.
Family-based association designs have long been attractive for their robustness
properties, but robustness can mean a loss of power. In this paper we discuss
some of the special features of family designs and their relevance in the era
of GWAS.
Comment: Published at http://dx.doi.org/10.1214/08-STS280 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Quantifying the Fraction of Missing Information for Hypothesis Testing in Statistical and Genetic Studies
Many practical studies rely on hypothesis testing procedures applied to data
sets with missing information. An important part of the analysis is to
determine the impact of the missing data on the performance of the test, and
this can be done by properly quantifying the relative (to complete data) amount
of available information. The problem is directly motivated by applications to
studies, such as linkage analyses and haplotype-based association projects,
designed to identify genetic contributions to complex diseases. In the genetic
studies the relative information measures are needed for the experimental
design, technology comparison, interpretation of the data, and for
understanding the behavior of some of the inference tools. The central
difficulties in constructing such information measures arise from the multiple,
and sometimes conflicting, aims in practice. For large samples, we show that a
satisfactory, likelihood-based general solution exists by using appropriate
forms of the relative Kullback-Leibler information, and that the proposed
measures are computationally inexpensive given the maximized likelihoods with
the observed data. Two measures are introduced, under the null and alternative
hypothesis respectively. We exemplify the measures on data coming from mapping
studies on the inflammatory bowel disease and diabetes. For small-sample
problems, which appear rather frequently in practice and sometimes in disguised
forms (e.g., measuring individual contributions to a large study), the robust
Bayesian approach holds great promise, though the choice of a general-purpose
"default prior" is a very challenging problem.
Comment: Published at http://dx.doi.org/10.1214/07-STS244 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Population Structure and Cryptic Relatedness in Genetic Association Studies
We review the problem of confounding in genetic association studies, which
arises principally because of population structure and cryptic relatedness.
Many treatments of the problem consider only a simple "island" model of
population structure. We take a broader approach, which views population
structure and cryptic relatedness as different aspects of a single confounder:
the unobserved pedigree defining the (often distant) relationships among the
study subjects. Kinship is therefore a central concept, and we review methods
of defining and estimating kinship coefficients, both pedigree-based and
marker-based. In this unified framework we review solutions to the problem of
population structure, including family-based study designs, genomic control,
structured association, regression control, principal components adjustment and
linear mixed models. The last solution makes the most explicit use of the
kinships among the study subjects, and has an established role in the analysis
of animal and plant breeding studies. Recent computational developments mean
that analyses of human genetic association data are beginning to benefit from
its powerful tests for association, which protect against population structure
and cryptic kinship, as well as intermediate levels of confounding by the
pedigree.
Comment: Published at http://dx.doi.org/10.1214/09-STS307 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
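Of the solutions listed in this abstract, genomic control is the simplest to illustrate. The sketch below uses the common textbook formulation, which may differ from the review's own presentation: the inflation factor is estimated as the median of the observed 1-df chi-square statistics divided by the theoretical median, and every statistic is divided by that factor. The function name `genomic_control` and the example statistics are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda, adjusted statistics) for a list of 1-df chi-square tests.

    Assumption: most markers are null, so the median statistic reflects
    inflation from confounding rather than true association.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Common convention: never inflate the statistics when lambda < 1.
    lam_used = max(lam, 1.0)
    adjusted = [s / lam_used for s in chi2_stats]
    return lam, adjusted


# Made-up association statistics for five markers.
stats = [0.1, 0.5, 0.91, 2.0, 12.0]
lam, adjusted = genomic_control(stats)
print(f"lambda = {lam:.3f}")
print("adjusted:", [round(s, 3) for s in adjusted])
```

A single scaling factor treats all markers alike, which is why the abstract contrasts it with methods such as linear mixed models that use the kinships among subjects explicitly.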
Unsupervised empirical Bayesian multiple testing with external covariates
In an empirical Bayesian setting, we provide a new multiple testing method,
useful when an additional covariate is available, that influences the
probability of each null hypothesis being true. We measure the posterior
significance of each test conditionally on the covariate and the data, leading
to greater power. Using covariate-based prior information in an unsupervised
fashion, we produce a list of significant hypotheses which differs in length
and order from the list obtained by methods not taking covariate-information
into account. Covariate-modulated posterior probabilities of each null
hypothesis are estimated using a fast approximate algorithm. The new method is
applied to expression quantitative trait loci (eQTL) data.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS158 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
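The idea of a covariate-modulated posterior probability can be illustrated with a generic two-groups model. This is a minimal sketch under stated assumptions, not the paper's algorithm: the null density is taken as N(0, 1), the alternative as N(3, 1), and the prior null probability as a logistic function of the covariate with made-up coefficients. All function names and parameter values are hypothetical.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def prior_null_prob(x, intercept=2.0, slope=-1.5):
    """Hypothetical prior P(null | covariate x): a logistic curve in x."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def posterior_null_prob(z, x, alt_mean=3.0):
    """Posterior P(null | z, x) in a two-groups model with assumed densities."""
    pi0 = prior_null_prob(x)
    null_part = pi0 * normal_pdf(z)                        # null: N(0, 1)
    alt_part = (1.0 - pi0) * normal_pdf(z, mean=alt_mean)  # alternative: N(alt_mean, 1)
    return null_part / (null_part + alt_part)


# Same test statistic, two different covariate values.
p_low = posterior_null_prob(2.5, x=0.0)   # covariate suggests the null is likely
p_high = posterior_null_prob(2.5, x=2.0)  # covariate suggests the null is unlikely
print(f"P(null | z=2.5, x=0) = {p_low:.3f}")
print(f"P(null | z=2.5, x=2) = {p_high:.3f}")
```

Because the prior depends on the covariate, two hypotheses with identical test statistics can receive different posterior probabilities, which is how the ranked list can change in both length and order.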
Tests for High Dimensional Generalized Linear Models
We consider testing regression coefficients in high dimensional generalized
linear models. An investigation of the test of Goeman et al. (2011) is
conducted, which reveals that if the inverse of the link function is unbounded,
the high dimensionality in the covariates can impose adverse impacts on the
power of the test. We propose a test formulation that avoids this adverse
impact of high dimensionality. When the inverse of the link function is
bounded, as in logistic or probit regression, the proposed test performs as
well as the test of Goeman et al. (2011). The proposed tests provide p-values for testing
significance for gene-sets as demonstrated in a case study on an acute
lymphoblastic leukemia dataset.
Comment: The research paper was stolen by someone last November and illegally
submitted to arXiv by a person named gong zi jiang nan. We have asked arXiv
to withdraw the unfinished paper [arXiv:1311.4043] and it was removed last
December. We have collected enough evidence to identify the person and
Peking University has begun to investigate the plagiarize
Meta-Analysis of Gene Level Tests for Rare Variant Association
The vast majority of connections between complex disease and common genetic variants were identified through meta-analysis, a powerful approach that enables large sample sizes while protecting against common artifacts due to population structure, repeated small sample analyses, and/or limitations with sharing individual level data. As the focus of genetic association studies shifts to rare variants, genes and other functional units are becoming the unit of analysis. Here, we propose and evaluate new approaches for performing meta-analysis of rare variant association tests, including burden tests, weighted burden tests, variable threshold tests and tests that allow variants with opposite effects to be grouped together. We show that our approach retains useful features of single variant meta-analytic approaches and demonstrate its utility in a study of blood lipid levels in ∼18,500 individuals genotyped with exome arrays.
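One way a gene-level burden test can be meta-analysed from summary statistics is sketched below. It assumes that each study shares a vector of single-variant score statistics U and their covariance matrix V, and that the weighted burden statistic is formed by summing w'U and w'Vw across studies. This is a generic illustration rather than the authors' exact procedure; the function name `burden_meta` and all numbers are made up.

```python
import math


def burden_meta(studies, weights):
    """Combine per-study score statistics into one weighted burden test.

    studies: list of (U, V) pairs, where U is a list of single-variant score
             statistics and V is their covariance matrix (list of lists).
    weights: one weight per variant (all 1.0 gives a simple burden test).
    Returns (chi-square statistic with 1 df, p-value).
    """
    m = len(weights)
    u_total = 0.0
    v_total = 0.0
    for U, V in studies:
        # Burden score for this study: w' U
        u_total += sum(w * u for w, u in zip(weights, U))
        # Its variance: w' V w
        v_total += sum(weights[i] * V[i][j] * weights[j]
                       for i in range(m) for j in range(m))
    chi2 = u_total ** 2 / v_total
    # Upper tail of a 1-df chi-square: P(Z^2 > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Two hypothetical studies, each reporting two rare variants in one gene.
study_a = ([1.0, 2.0], [[2.0, 0.5], [0.5, 1.0]])
study_b = ([0.5, 1.5], [[1.0, 0.0], [0.0, 2.0]])
chi2, p_value = burden_meta([study_a, study_b], weights=[1.0, 1.0])
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Sharing only U and V keeps individual-level data within each study, which matches the data-sharing constraint mentioned in the abstract.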