Purifying Selection in Deeply Conserved Human Enhancers Is More Consistent than in Coding Sequences
© 2014 De Silva et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Power calculations using exact data simulation: A useful tool for genetic study designs.
Statistical power calculations constitute an essential first step in the planning of scientific studies. If sufficient summary statistics are available, power calculations are in principle straightforward and computationally light. In designs that comprise distinct groups (e.g., MZ and DZ twins), sufficient statistics can be calculated within each group and analyzed in a multi-group model. However, when the number of possible groups is prohibitively large (say, in the hundreds), power calculations on the basis of the summary statistics become impractical. In that case, researchers may resort to Monte Carlo based power studies, which involve the simulation of hundreds or thousands of replicate samples for each specified set of population parameters. Here we present exact data simulation as a third method of power calculation. Exact data simulation involves a transformation of raw data so that the data fit the hypothesized model exactly. As in power calculation with summary statistics, exact data simulation is computationally light, while the number of groups in the analysis has little bearing on the practicality of the method. The method is applied to three genetic designs for illustrative purposes.
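The transformation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a multivariate normal model, and the function name `exact_simulate` and its parameters are hypothetical. Random draws are centred, whitened with a Cholesky factor of their own sample covariance, and then recoloured so that the sample mean and sample covariance equal the hypothesized values exactly (this requires more observations than variables).

```python
import numpy as np

def exact_simulate(n, mean, cov, seed=0):
    """Return an (n, p) sample whose sample mean and covariance match exactly.

    Hypothetical helper for illustration; assumes n > p and a positive
    definite target covariance matrix.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    p = mean.size

    # Step 1: draw raw data and centre it so the sample mean is exactly zero.
    z = rng.standard_normal((n, p))
    z -= z.mean(axis=0)

    # Step 2: whiten, so the sample covariance becomes the identity matrix.
    s = np.cov(z, rowvar=False).reshape(p, p)
    l_sample = np.linalg.cholesky(s)
    white = np.linalg.solve(l_sample, z.T).T

    # Step 3: recolour with the target covariance and shift to the target mean.
    l_target = np.linalg.cholesky(cov)
    return white @ l_target.T + mean

target_cov = np.array([[1.0, 0.5], [0.5, 1.0]])
x = exact_simulate(50, [0.0, 0.0], target_cov)
print(np.cov(x, rowvar=False))  # equals target_cov up to rounding error
```

Because the data reproduce the hypothesized moments exactly, a single data set per design can stand in for the many replicates that a Monte Carlo power study would need.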
Meta-analysis of four new genome scans for lipid parameters and analysis of positional candidates in positive linkage regions
Lipid levels in plasma strongly influence the risk for coronary heart disease. To localise and subsequently identify genes affecting lipid levels, we performed four genome-wide linkage scans followed by combined linkage/association analysis. Genome-scans were performed in 701 dizygotic twin pairs from four samples with data on plasma levels of HDL- and LDL-cholesterol and their major protein constituents, apolipoprotein AI (ApoAI) and Apolipoprotein B (ApoB). To maximise power, the genome scans were analysed simultaneously using a well-established meta-analysis method that was newly applied to linkage analysis. Overall LOD scores were estimated using the means of the sample-specific quantitative trait locus (QTL) effects inversely weighted by the standard errors obtained using an inverse regression method. Possible heterogeneity was accounted for with a random effects model. Suggestive linkage for HDL-C was observed on 8p23.1 and 12q21.2 and for ApoAI on 1q21.3. For LDL-C and ApoB, linkage regions frequently coincided (2p24.1, 2q32.1, 19p13.2 and 19q13.31). Six of the putative QTLs replicated previous findings. After fine mapping, three maximum LOD scores mapped within 1cM of major candidate genes, namely APOB (LOD =2.1), LDLR (LOD =1.9) and APOE (LOD =1.7). APOB haplotypes explained 27% of the QTL effect observed for LDL-C on 2p24.1 and reduced the LOD-score by 0.82. Accounting for the effect of the LDLR and APOE haplotypes did not change the LOD score close to the LDLR gene but abolished the linkage signal at the APOE gene. In conclusion, application of a new meta-analysis approach maximised the power to detect QTLs for lipid levels and improved the precision of their location estimate. © 2005 Nature Publishing Group. All rights reserved
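The pooling step described in this abstract resembles a standard random-effects meta-analysis. The sketch below is an assumption-laden illustration rather than the study's procedure: it uses inverse-variance weights and the DerSimonian-Laird estimator of between-sample variance, neither of which the abstract names, and the function `random_effects_meta` is hypothetical. It does not convert the pooled effect into a LOD score.

```python
import numpy as np

def random_effects_meta(effects, std_errors):
    """Pool per-sample effects; returns (pooled effect, its SE, tau^2).

    Hypothetical helper: inverse-variance weights with a DerSimonian-Laird
    between-sample variance (an assumed estimator, not taken from the source).
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(std_errors, dtype=float) ** 2

    # Fixed-effect estimate: weight each sample by the inverse of its variance.
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = np.sum(w * (y - fixed) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (y.size - 1)) / c) if c > 0 else 0.0

    # Random-effects estimate: add tau^2 to every sample's variance.
    w_re = 1.0 / (v + tau2)
    pooled = np.sum(w_re * y) / np.sum(w_re)
    return pooled, np.sqrt(1.0 / np.sum(w_re)), tau2

# Four made-up QTL effect estimates with their standard errors.
pooled, se, tau2 = random_effects_meta([0.30, 0.10, 0.25, 0.40],
                                       [0.10, 0.12, 0.15, 0.20])
print(pooled, se, tau2)
```

When the samples agree closely, the estimated between-sample variance is zero and the result reduces to the fixed-effect weighted mean.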
Parameter Estimation and Quantitative Parametric Linkage Analysis with GENEHUNTER-QMOD
Objective: We present a parametric method for linkage analysis of quantitative phenotypes. The method provides a test for linkage as well as an estimate of different phenotype parameters. We have implemented our new method in the program GENEHUNTER-QMOD and evaluated its properties by performing simulations. Methods: The phenotype is modeled as a normally distributed variable, with a separate distribution for each genotype. Parameter estimates are obtained by maximizing the LOD score over the normal distribution parameters with a gradient-based optimization method called PGRAD. Results: The PGRAD method has lower power to detect linkage than the variance components analysis (VCA) in the case of a normal distribution and small pedigrees. However, it outperforms the VCA and Haseman-Elston regression for extended pedigrees, nonrandomly ascertained data and non-normally distributed phenotypes. Here, the higher power is even accompanied by conservativeness, while the VCA has an inflated type I error. Parameter estimation tends to underestimate residual variances but performs better for expectation values of the phenotype distributions. Conclusion: With GENEHUNTER-QMOD, a powerful new tool is provided to explicitly model quantitative phenotypes in the context of linkage analysis. It is freely available at http://www.helmholtz-muenchen.de/genepi/downloads. Copyright © 2012 S. Karger AG, Basel.
QTLs for height: results of a full genome scan in Dutch sibling pairs.
Height is a highly heritable, complex trait. At present, the genes responsible for the variation in height have not yet been identified. This paper summarizes the results of previous linkage studies and presents results of an additional linkage analysis. Using data from the Netherlands Twin Register, a sib-pair-based linkage analysis for adult height was conducted. For 513 sib-pairs from 174 families, complete genome scans and adult height were available. The strongest evidence for linkage was found for a region on chromosome 6, near markers D6S1053 and D6S1031 (LOD = 2.32). This replicated previous findings in other data sets. LOD scores ranging from 1.53 to 2.04 were found for regions on chromosomes 1, 5, 8, 10, and 18. The region on chromosome 18 (LOD = 1.83) also corresponded with the results of previous studies. Several chromosomal regions are now implicated in the variance in height, but further study is needed to draw definite conclusions with regard to the significance of these regions for adult height.
A Common Variant Associated with Dyslexia Reduces Expression of the KIAA0319 Gene
Numerous genetic association studies have implicated the KIAA0319 gene on human chromosome 6p22 in dyslexia susceptibility. The causative variant(s) remains unknown but may modulate gene expression, given that (1) a dyslexia-associated haplotype has been implicated in the reduced expression of KIAA0319, and (2) the strongest association has been found for the region spanning exon 1 of KIAA0319. Here, we test the hypothesis that variant(s) responsible for reduced KIAA0319 expression resides on the risk haplotype close to the gene's transcription start site. We identified seven single-nucleotide polymorphisms on the risk haplotype immediately upstream of KIAA0319 and determined that three of these are strongly associated with multiple reading-related traits. Using luciferase-expressing constructs containing the KIAA0319 upstream region, we characterized the minimal promoter and additional putative transcriptional regulator regions. This revealed that the minor allele of rs9461045, which shows the strongest association with dyslexia in our sample (max p-value = 0.0001), confers reduced luciferase expression in both neuronal and non-neuronal cell lines. Additionally, we found that the presence of this rs9461045 dyslexia-associated allele creates a nuclear protein-binding site, likely for the transcriptional silencer OCT-1. Knocking down OCT-1 expression in the neuronal cell line SHSY5Y using an siRNA restores KIAA0319 expression from the risk haplotype to nearly that seen from the non-risk haplotype. Our study thus pinpoints a common variant as altering the function of a dyslexia candidate gene and provides an illustrative example of the strategic approach needed to dissect the molecular basis of complex genetic traits
Genome-wide linkage analysis of 972 bipolar pedigrees using single-nucleotide polymorphisms.
Because of the high costs associated with ascertainment of families, most linkage studies of Bipolar I disorder (BPI) have used relatively small samples. Moreover, the genetic information content reported in most studies has been less than 0.6. Although microsatellite markers spaced every 10 cM typically extract most of the genetic information content for larger multiplex families, they can be less informative for smaller pedigrees, especially for affected sib pair kindreds. For these reasons we collaborated to pool family resources and carried out higher density genotyping. Approximately 1100 pedigrees of European ancestry were initially selected for study and were genotyped by the Center for Inherited Disease Research using the Illumina Linkage Panel 12 set of 6090 single-nucleotide polymorphisms. Of the ~1100 families, 972 were informative for further analyses, and mean information content was 0.86 after pruning for linkage disequilibrium. The 972 kindreds include 2284 cases of BPI disorder, 498 individuals with bipolar II disorder (BPII) and 702 subjects with recurrent major depression. Three affection status models (ASMs) were considered: ASM1 (cases of BPI and of schizoaffective disorder, bipolar type (SABP), only), ASM2 (ASM1 cases plus BPII) and ASM3 (ASM2 cases plus recurrent major depression). Both parametric and non-parametric linkage methods were carried out. The strongest findings occurred at 6q21 (non-parametric pairs logarithm of odds (LOD) 3.4 for rs1046943 at 119 cM) and 9q21 (non-parametric pairs LOD 3.4 for rs722642 at 78 cM) using only BPI and SABP cases. Both results met genome-wide significance criteria, although neither was significant after correction for multiple analyses. We also inspected parametric scores for the larger multiplex families to identify possible rare susceptibility loci. In this analysis, we observed 59 parametric LODs of 2 or greater, many of which are likely to be close to maximum possible scores. Although some linkage findings may be false positives, the results could help prioritize the search for rare variants using whole exome or genome sequencing.
Linkage disequilibrium in young genetically isolated Dutch population
The design and feasibility of genetic studies of complex diseases are critically dependent on the extent and distribution of linkage disequilibrium (LD) across the genome and between different populations. We have examined genomewide and region-specific LD in a young genetically isolated population identified in the Netherlands by genotyping approximately 800 Short Tandem Repeat markers distributed genomewide across 58 individuals. Several regions were an
Genetic prediction of complex traits: integrating infinitesimal and marked genetic effects
Genetic prediction for complex traits is usually based on models including individual (infinitesimal) or marker effects. Here, we concentrate on models including both the individual and the marker effects. In particular, we develop a "Mendelian segregation" model combining infinitesimal effects for base individuals and realized Mendelian sampling in descendants described by the available DNA data. The model is illustrated with an example and the analyses of a public simulated data file. Further, the potential contribution of such models is assessed by simulation. Accuracy, measured as the correlation between true (simulated) and predicted genetic values, was similar for all models compared under different genetic backgrounds. As expected, the segregation model is worthwhile when markers capture a low fraction of total genetic variance. (Author's abstract)
Genome-wide study of association and interaction with maternal cytomegalovirus infection suggests new schizophrenia loci.
Genetic and environmental components as well as their interaction contribute to the risk of schizophrenia, making it highly relevant to include environmental factors in genetic studies of schizophrenia. This study comprises genome-wide association (GWA) and follow-up analyses of all individuals born in Denmark since 1981 and diagnosed with schizophrenia as well as controls from the same birth cohort. Furthermore, we present the first genome-wide interaction survey of single nucleotide polymorphisms (SNPs) and maternal cytomegalovirus (CMV) infection. The GWA analysis included 888 cases and 882 controls, and the follow-up investigation of the top GWA results was performed in independent Danish (1396 cases and 1803 controls) and German-Dutch (1169 cases, 3714 controls) samples. The SNPs most strongly associated in the single-marker analysis of the combined Danish samples were rs4757144 in ARNTL (P = 3.78 × 10⁻⁶) and rs8057927 in CDH13 (P = 1.39 × 10⁻⁵). Both genes have previously been linked to schizophrenia or other psychiatric disorders. The strongest associated SNP in the combined analysis, including Danish and German-Dutch samples, was rs12922317 in RUNDC2A (P = 9.04 × 10⁻⁷). A region-based analysis summarizing independent signals in segments of 100 kb identified a new region-based genome-wide significant locus overlapping the gene ZEB1 (P = 7.0 × 10⁻⁷). This signal was replicated in the follow-up analysis (P = 2.3 × 10⁻²). Significant interaction with maternal CMV infection was found for rs7902091 (P(SNP × CMV) = 7.3 × 10⁻⁷) in CTNNA3, a gene not previously implicated in schizophrenia, stressing the importance of including environmental factors in genetic studies.
