Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes.
We aggregated coding variant data for 81,412 type 2 diabetes cases and 370,832 controls of diverse ancestry, identifying 40 coding variant association signals (P < 2.2 × 10⁻⁷); of these, 16 map outside known risk-associated loci. We make two important observations. First, only five of these signals are driven by low-frequency variants: even for these, effect sizes are modest (odds ratio ≤1.29). Second, when we used large-scale genome-wide association data to fine-map the associated variants in their regional context, accounting for the global enrichment of complex trait associations in coding sequence, compelling evidence for coding variant causality was obtained for only 16 signals. At 13 others, the associated coding variants clearly represent 'false leads' with potential to generate erroneous mechanistic inference. Coding variant associations offer a direct route to biological insight for complex diseases and identification of validated therapeutic targets; however, appropriate mechanistic inference requires careful specification of their causal contribution to disease predisposition.
Deep-coverage whole genome sequences and blood lipids among 16,324 individuals.
Large-scale deep-coverage whole-genome sequencing (WGS) is now feasible and offers potential advantages for locus discovery. We perform WGS in 16,324 participants from four ancestries at mean depth >29X and analyze genotypes with four quantitative traits: plasma total cholesterol, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol, and triglycerides. Common variant association yields known loci except for a few variants previously poorly imputed. Rare coding variant association yields known Mendelian dyslipidemia genes but rare non-coding variant association detects no signals. A high 2M-SNP LDL-C polygenic score (top 5th percentile) confers similar effect size to a monogenic mutation (~30 mg/dl higher for each); however, among those with severe hypercholesterolemia, 23% have a high polygenic score and only 2% carry a monogenic mutation. At these sample sizes and for these phenotypes, the incremental value of WGS for discovery is limited but WGS permits simultaneous assessment of monogenic and polygenic models to severe hypercholesterolemia.
MPRAnalyze: statistical framework for massively parallel reporter assays.
Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by a lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods.
Rare Copy Number Variants in NRXN1 and CNTN6 Increase Risk for Tourette Syndrome
Tourette syndrome (TS) is a model neuropsychiatric disorder thought to arise from abnormal development and/or maintenance of cortico-striato-thalamo-cortical circuits. TS is highly heritable, but its underlying genetic causes are still elusive, and no genome-wide significant loci have been discovered to date. We analyzed a European ancestry sample of 2,434 TS cases and 4,093 ancestry-matched controls for rare (< 1% frequency) copy-number variants (CNVs) using SNP microarray data. We observed an enrichment of global CNV burden that was prominent for large (> 1 Mb), singleton events (OR = 2.28, 95% CI [1.39–3.79], p = 1.2 × 10⁻³) and known, pathogenic CNVs (OR = 3.03 [1.85–5.07], p = 1.5 × 10⁻⁵). We also identified two individual, genome-wide significant loci, each conferring a substantial increase in TS risk (NRXN1 deletions, OR = 20.3, 95% CI [2.6–156.2]; CNTN6 duplications, OR = 10.1, 95% CI [2.3–45.4]). Approximately 1% of TS cases carry one of these CNVs, indicating that rare structural variation contributes significantly to the genetic architecture of TS.
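As a reading aid for the odds ratios and confidence intervals quoted above, here is a minimal Python sketch of how such figures can be derived from carrier counts. It assumes the standard Wald interval on the log odds ratio; the abstract does not say which estimator the authors used, so this may not match their method. The function name and the counts in the usage note are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: carrier cases      b: non-carrier cases
    c: carrier controls   d: non-carrier controls
    All four counts must be non-zero for the Wald interval to be defined.
    z=1.96 corresponds to an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, `odds_ratio_wald_ci(10, 90, 5, 95)` gives an odds ratio of about 2.11 for these illustrative counts. With very small carrier counts, as is typical for rare CNVs, the Wald approximation is rough and exact methods are often preferred.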
Alzheimer disease genetic risk factor APOE e4, and cognitive abilities in 111,739 UK Biobank participants
Background: the apolipoprotein E (APOE) e4 locus is a genetic risk factor for dementia. Carriers of the e4 allele may be more
vulnerable to conditions that are independent risk factors for cognitive decline, such as cardiometabolic diseases.
Objective: we tested whether any association with APOE e4 status on cognitive ability was larger in older ages or in those
with cardiometabolic diseases.
Subjects: UK Biobank includes over 500,000 middle- and older aged adults who have undergone detailed medical and cognitive
phenotypic assessment. Around 150,000 currently have genetic data. We examined 111,739 participants with complete
genetic and cognitive data.
Methods: baseline cognitive data relating to information processing speed, memory and reasoning were used. We tested for
interactions with age and with the presence versus absence of type 2 diabetes (T2D), coronary artery disease (CAD) and hypertension.
Results: in several instances, APOE e4 dosage interacted with older age and disease presence to affect cognitive scores. When
adjusted for potentially confounding variables, there was no APOE e4 effect on the outcome variables.
Conclusions: future research in large independent cohorts should continue to investigate this important question, which has
potential implications for aetiology related to dementia and cognitive impairment.
Population Structure and Cryptic Relatedness in Genetic Association Studies
We review the problem of confounding in genetic association studies, which
arises principally because of population structure and cryptic relatedness.
Many treatments of the problem consider only a simple "island" model of
population structure. We take a broader approach, which views population
structure and cryptic relatedness as different aspects of a single confounder:
the unobserved pedigree defining the (often distant) relationships among the
study subjects. Kinship is therefore a central concept, and we review methods
of defining and estimating kinship coefficients, both pedigree-based and
marker-based. In this unified framework we review solutions to the problem of
population structure, including family-based study designs, genomic control,
structured association, regression control, principal components adjustment and
linear mixed models. The last solution makes the most explicit use of the
kinships among the study subjects, and has an established role in the analysis
of animal and plant breeding studies. Recent computational developments mean
that analyses of human genetic association data are beginning to benefit from
its powerful tests for association, which protect against population structure
and cryptic kinship, as well as intermediate levels of confounding by the
pedigree.
Comment: Published in Statistical Science (http://www.imstat.org/sts/) at
http://dx.doi.org/10.1214/09-STS307 by the Institute of Mathematical
Statistics (http://www.imstat.org
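Of the corrections listed in this abstract, genomic control is the simplest to illustrate. The Python sketch below assumes 1-degree-of-freedom chi-square association statistics and the common median-based estimate of the inflation factor. The function names are hypothetical, and this is an illustration rather than code from the review.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(chi2_stats):
    """Inflation factor: observed median statistic over the null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi2_stats):
    """Deflate each statistic by lambda.

    Lambda is floored at 1 (a common convention) so that statistics are
    never inflated when the observed median falls below the null median.
    """
    lam = max(1.0, genomic_control_lambda(chi2_stats))
    return [s / lam for s in chi2_stats]
```

A lambda close to 1 suggests little inflation from population structure or cryptic relatedness. Genomic control applies one uniform correction to all markers, whereas the linear mixed models discussed in the abstract use the kinships among study subjects explicitly.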