    Gene expression in large pedigrees: analytic approaches.

    Background: We currently have the ability to quantify transcript abundance of messenger RNA (mRNA), genome-wide, using microarray technologies. Analyzing genotype, phenotype and expression data from 20 pedigrees, the members of our Genetic Analysis Workshop (GAW) 19 gene expression group published 9 papers, tackling some timely and important problems and questions. To study the complexity and interrelationships of genetics and gene expression, we used established statistical tools, developed newer statistical tools, and developed and applied extensions to these tools.
    Methods: To study gene expression correlations in the pedigree members (without incorporating genotype or trait data into the analysis), 2 papers used principal components analysis, weighted gene coexpression network analysis, meta-analyses, gene enrichment analyses, and linear mixed models. To explore the relationship between genetics and gene expression, 2 papers studied expression quantitative trait locus allelic heterogeneity through conditional association analyses, and epistasis through interaction analyses. A third paper assessed the feasibility of applying allele-specific binding to filter potential regulatory single-nucleotide polymorphisms (SNPs). Analytic approaches included linear mixed models based on measured genotypes in pedigrees, permutation tests, and covariance kernels. To incorporate both genotype and phenotype data with gene expression, 4 groups employed linear mixed models, nonparametric weighted U statistics, structural equation modeling, Bayesian unified frameworks, and multiple regression.
    Results and discussion: Regarding the analysis of pedigree data, we found that gene expression is familial, indicating that at least 1 factor for pedigree membership or multiple factors for the degree of relationship should be included in analyses, and we developed a method to adjust for familiality prior to conducting weighted co-expression gene network analysis. For SNP association and conditional analyses, we found FaST-LMM (Factored Spectrally Transformed Linear Mixed Model) and SOLAR-MGA (Sequential Oligogenic Linkage Analysis Routines - Major Gene Analysis) have similar type 1 and type 2 errors and can be used almost interchangeably. To improve the power and precision of association tests, prior knowledge of DNase-I hypersensitivity sites or other relevant biological annotations can be incorporated into the analyses. On a biological level, eQTL (expression quantitative trait loci) are genetically complex, exhibiting both allelic heterogeneity and epistasis. Including both genotype and phenotype data together with measurements of gene expression was found to be generally advantageous in terms of generating improved levels of significance and in providing more interpretable biological models.
    Conclusions: Pedigrees can be used to conduct analyses of and enhance gene expression studies

    Analysis of North American Rheumatoid Arthritis Consortium data using a penalized logistic regression approach

    We applied a penalized regression approach to single-nucleotide polymorphisms in regions on chromosomes 1, 6, and 9 of the North American Rheumatoid Arthritis Consortium data. Results were compared with a standard single-locus association test. Overall, the penalized regression approach did not appear to offer any advantage with respect to either detection or localization of disease-associated polymorphisms, compared with the single-locus approach

    Exploring causality via identification of SNPs or haplotypes responsible for a linkage signal

    In a small chromosomal region, a number of polymorphisms may be both linked to and associated with a disease. Distinguishing the potential causal sites from those indirectly associated due to linkage disequilibrium (LD) with a causal site is an important problem. This problem may be approached by determining which of the associations can explain the observed linkage signal. Recently, several methods have been proposed to aid in the identification of disease associated polymorphisms that may explain an observed linkage signal, using genotype data from affected sib pairs (ASPs) [Li et al. [2005] Am. J. Hum. Genet. 76:934–949; Sun et al. [2002] Am. J. Hum. Genet. 70:399–411]. These methods can be used to test the null hypothesis that a candidate single nucleotide polymorphism (SNP) is the sole causal variant in the region, or is in complete LD with the sole causal variant in the region. We extend variations of these methods to test for complete LD between a disease locus and haplotypes composed of two or more tightly linked candidate SNPs. We study properties of the proposed methods by simulation and apply them to type 1 diabetes data for ASPs and their parents at candidate SNP and microsatellite marker loci in the Insulin (INS) gene region. Genet. Epidemiol. 31:727–740, 2007. © 2007 Wiley-Liss, Inc

    Common polymorphism in H19 associated with birthweight and cord blood IGF-II levels in humans.

    BACKGROUND: Common genetic variation at genes that are imprinted and exclusively maternally expressed could explain the apparent maternal-specific inheritance of low birthweight reported in large family pedigrees. We identified ten single nucleotide polymorphisms (SNPs) in H19, and we genotyped three of these SNPs in families from the contemporary ALSPAC UK birth cohort (1,696 children, 822 mothers and 661 fathers) in order to explore associations with size at birth and cord blood IGF-II levels. RESULTS: Both offspring's and mother's H19 2992C>T SNP genotypes showed associations with offspring birthweight (P = 0.03 to P = 0.003) and mother's genotype was also associated with cord blood IGF-II levels (P = 0.0003 to P = 0.0001). The offspring genotype association with birthweight was independent of mother's genotype (P = 0.01 to P = 0.007). However, mother's untransmitted H19 2992T allele was also associated with larger birthweight (P = 0.04) and higher cord blood IGF-II levels (P = 0.002), suggesting a direct effect of mother's genotype on placental IGF-II expression and fetal growth. The association between mother's untransmitted allele and cord blood IGF-II levels was more apparent in offspring of first pregnancies than subsequent pregnancies (P-interaction = 0.03). Study of the independent Cambridge birth cohort with available DNA in mothers (N = 646) provided additional support for mother's H19 2992 genotype associations with birthweight (P = 0.04) and with mother's glucose levels (P = 0.01) in first pregnancies. 
CONCLUSION: The common H19 2992T allele, in the mother or offspring or both, may confer reduced fetal growth restraint, as indicated by associations with larger offspring birth size, higher cord blood IGF-II levels, and lower compensatory early postnatal catch-up weight gain, that are more evident among mother's smaller first-born infants.
RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are

    Linkage and association analysis of GAW15 simulated data: fine-mapping of chromosome 6 region

    We performed linkage and family-based association analysis across chromosomes 1–22 in Replicates 1–5 of the Genetic Analysis Workshop 15 simulated data. Linkage analysis was performed using the Kong and Cox allele-sharing test as implemented in the program Merlin. Association analysis was performed using the transmission/disequilibrium test (TDT). A region on chromosome 6 was consistently highlighted as showing significant linkage to and association with the disease trait. We focused in on this region and performed fine-mapping using stepwise regression approaches using the case/control and family-based data. In this region, we also applied several new methods, implemented in the computer programs LAMP and Graphminer, respectively, that have recently been proposed for association analysis with family and/or case/control data. All methods confirmed the highly significant associations previously observed. Differentiating between potentially causal single nucleotide polymorphisms (SNPs) and other non-causal loci (associated with disease merely due to linkage disequilibrium) proved to be problematic. However, in most replicates we did identify two SNPs (either SNPs 3437 and 3439 from the dense SNP set, or SNPs 153 and 3437 from the combined non-dense/dense SNP set) that together explain most of the observed disease association in the DR/C locus region, and an additional SNP (3931 or 3933) that accounts for the association 5 cM away at locus D
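For readers unfamiliar with the transmission/disequilibrium test mentioned in this abstract, the following is a minimal sketch of the standard McNemar-type TDT statistic for one biallelic marker. It is a generic illustration, not the authors' analysis pipeline; the function name `tdt` and the counts used are hypothetical.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Return the TDT chi-square statistic and its 1-df p-value.

    `transmitted` counts heterozygous parents who passed the allele of
    interest to an affected child; `untransmitted` counts those who
    passed the other allele instead.
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # A 1-df chi-square is a squared standard normal, so its upper tail
    # probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p_value = tdt(60, 40)
print(f"TDT chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Only heterozygous parents contribute, which is why the test stays valid as a test of association in the presence of linkage even when the sample is stratified.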

    Transcriptional regulation of PNPLA3 and its impact on susceptibility to nonalcoholic fatty liver Disease (NAFLD) in humans

    The increased expression of PNPLA3 148M leads to hepatosteatosis in mice. This study aims to investigate the genetic control of hepatic PNPLA3 transcription and to explore its impact on NAFLD risk in humans. Through a locus-wide expression quantitative trait loci (eQTL) mapping in two human liver sample sets, a PNPLA3 intronic SNP, rs139051 A>G, was identified as a significant eQTL (p = 6.6×10⁻⁸) influencing PNPLA3 transcription, with the A allele significantly associated with increased PNPLA3 mRNA. An electrophoresis mobility shift assay further demonstrated that the A allele has greater affinity for nuclear proteins than the G allele. The impact of this eQTL on NAFLD risk was further tested in three independent populations. We found that rs139051 did not independently affect the NAFLD risk, whilst rs738409 did not significantly modulate PNPLA3 transcription but was associated with NAFLD risk. The A-G haplotype associated with higher transcription of the disease-risk rs738409 G allele conferred similar risk for NAFLD compared to the G-G haplotype that possesses a lower transcription level. Our study suggests that the pathogenic role of PNPLA3 148M in NAFLD is independent of the gene transcription in humans, which may be attributed to the high endogenous transcription level of PNPLA3 gene in human livers

    Linkage analysis of GAW14 simulated data: comparison of multimarker, multipoint, and conditional approaches

    The purposes of this study were 1) to examine the performance of a new multimarker regression approach for model-free linkage analysis in comparison to a conventional multipoint approach, and 2) to determine whether a conditioning strategy would improve the performance of the conventional multipoint method when applied to data from two interacting loci. Linkage analysis of the Kofendrerd Personality Disorder phenotype to chromosomes 1 and 3 was performed in three populations for all 100 replicates of the Genetic Analysis Workshop 14 simulated data. Three approaches were used: a conventional multipoint analysis using the Zlr statistic as calculated in the program ALLEGRO; a conditioning approach in which the per-family contribution on one chromosome was weighted according to evidence for linkage on the other chromosome; and a novel multimarker regression approach. The multipoint and multimarker approaches were generally successful in localizing known susceptibility loci on chromosomes 1 and 3, and were found to give broadly similar results. No advantage was found with the per-family conditioning approach. The effect on power and type I error of different choices of weighting scheme (to account for different numbers of affected siblings) in the multimarker approach was examined

    Bayesian network analysis incorporating genetic anchors complements conventional Mendelian randomization approaches for exploratory analysis of causal relationships in complex data

    Mendelian randomization (MR) implemented through instrumental variables analysis is an increasingly popular causal inference tool used in genetic epidemiology. But it can have limitations for evaluating simultaneous causal relationships in complex data sets that include, for example, multiple genetic predictors and multiple potential risk factors associated with the same genetic variant. Here we use real and simulated data to investigate Bayesian network analysis (BN) with the incorporation of directed arcs, representing genetic anchors, as an alternative approach. A Bayesian network describes the conditional dependencies/independencies of variables using a graphical model (a directed acyclic graph) with an accompanying joint probability. In real data, we found BN could be used to infer simultaneous causal relationships that confirmed the individual causal relationships suggested by bi-directional MR, while allowing for the existence of potential horizontal pleiotropy (that would violate MR assumptions). In simulated data, BN with two directional anchors (mimicking genetic instruments) had greater power for a fixed type 1 error than bi-directional MR, while BN with a single directional anchor performed better than or as well as bi-directional MR. Both BN and MR could be adversely affected by violations of their underlying assumptions (such as genetic confounding due to unmeasured horizontal pleiotropy). BN with no directional anchor generated inference that was no better than by chance, emphasizing the importance of directional anchors in BN (as in MR). Under highly pleiotropic simulated scenarios, BN outperformed both MR (and its recent extensions) and two recently-proposed alternative approaches: a multi-SNP mediation intersection-union test (SMUT) and a latent causal variable (LCV) test. 
We conclude that BN incorporating genetic anchors is a useful complementary method to conventional MR for exploring causal relationships in complex data sets such as those generated from modern "omics" technologies
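As background to the instrumental-variables form of Mendelian randomization referred to in this abstract, here is a minimal sketch of the single-instrument Wald ratio estimator with a first-order delta-method standard error. It is a textbook illustration, not the Bayesian network method the abstract evaluates; the function name `wald_ratio` and all numeric inputs are hypothetical.

```python
import math


def wald_ratio(beta_gx: float, beta_gy: float, se_gy: float) -> tuple[float, float, float]:
    """Estimate the causal effect of exposure X on outcome Y from one variant G.

    beta_gx: effect of the variant on the exposure
    beta_gy: effect of the variant on the outcome
    se_gy:   standard error of beta_gy
    Returns (estimate, standard error, two-sided normal p-value).
    """
    if beta_gx == 0:
        raise ValueError("the variant must be associated with the exposure")
    estimate = beta_gy / beta_gx
    # First-order approximation: ignores uncertainty in beta_gx.
    se = se_gy / abs(beta_gx)
    z = estimate / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, se, p_value


# Hypothetical summary statistics, for illustration only.
estimate, se, p_value = wald_ratio(beta_gx=0.5, beta_gy=0.1, se_gy=0.02)
print(f"causal estimate = {estimate:.3f} (SE {se:.3f}), p = {p_value:.2e}")
```

The estimate is only interpretable causally if the variant affects the outcome solely through the exposure, which is the assumption that horizontal pleiotropy violates.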