136 research outputs found

    Application of bivariate mixed counting process models to genetic analysis of rheumatoid arthritis severity

    We sought to i) identify putative genetic determinants of the severity of rheumatoid arthritis in the NARAC (North American Rheumatoid Arthritis Consortium) data, ii) assess whether known candidate genes for disease status are also associated with disease severity in those affected, and iii) determine whether heterogeneity among the severity phenotypes can be explained by genetic and/or host factors. These questions are addressed by developing bivariate mixed-counting process models for numbers of tender and swollen joints to evaluate genetic association of candidate polymorphisms, such as DRB1, and selected single-nucleotide polymorphisms in known candidate genes/regions for rheumatoid arthritis, including PTPN22, and those in the regions identified by a genome-wide linkage scan of disease severity using the dense Illumina single-nucleotide polymorphism panel. The counting process framework provides a flexible approach to account for the duration of rheumatoid arthritis, an attractive feature when modeling severity of a disease. Moreover, we found a gain in efficiency when using a bivariate compared to a univariate counting process model

    Comparison of Haseman-Elston regression analyses using single, summary, and longitudinal measures of systolic blood pressure

    To compare different strategies for linkage analyses of longitudinal quantitative trait measures, we applied the "revisited" Haseman-Elston (RHE) regression model (the cross product of centered sib-pair trait values is regressed on expected identical-by-descent allele sharing) to cross-sectional, summary, and repeated measurements of systolic blood pressure (SBP) values in replicate 34, randomly selected from the Genetic Analysis Workshop 13 simulated data. RHE linkage scans were performed without knowledge of the generating model using the following phenotypes derived from untreated SBP measurements: the first, the last, the mean, the ratio of the change between the first and last over time, and the estimated linear regression slope coefficient. Estimates of allele sharing in sibling pairs were obtained from the complete genotype data of Cohorts 1 and 2, but linkage analyses were restricted to the five visits of Cohort 2 siblings. Evidence for linkage was suggestive (p < 0.001) at markers neighboring SBP genes Gb35, Gs10, and Gs12, but weaker signals (p < 0.01) were obtained at markers mapping close to Gb34 and Gs11. Linkage to baseline genes Gb34 and Gb35 was best detected using the first SBP measurement, whereas linkage to slope genes Gs10-12 was best detected using the last or mean SBP value. At markers on chromosomes 13 and 21 displaying strongest linkage signals, marginal RHE-type models including repeated SBP measures were fit to test for overall and time-dependent genetic effects. These analyses assumed independent sib pairs and employed generalized estimating equations (GEE) with a first-order autoregressive working correlation structure to adjust for serial correlation present among repeated observations from the same sibling pair
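The "revisited" Haseman-Elston regression described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `rhe_regression`, the pooled-mean centering, and the toy data are assumptions, and a real analysis would also need standard errors and a one-sided test of the slope.

```python
import numpy as np

def rhe_regression(y1, y2, pi_hat):
    """Regress centered sib-pair cross products on estimated IBD sharing.

    y1, y2  : trait values for the first and second sibling of each pair
    pi_hat  : estimated proportion of alleles shared identical by descent
    Returns (intercept, slope); a positive slope is the linkage signal.
    """
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    pi_hat = np.asarray(pi_hat, dtype=float)
    # Assumption: center both siblings on the pooled sample mean.
    mean = np.concatenate([y1, y2]).mean()
    cross = (y1 - mean) * (y2 - mean)
    # Ordinary least squares of the cross product on IBD sharing.
    design = np.column_stack([np.ones_like(pi_hat), pi_hat])
    (intercept, slope), *_ = np.linalg.lstsq(design, cross, rcond=None)
    return float(intercept), float(slope)

# Toy data: pairs sharing more alleles have more similar trait values.
intercept, slope = rhe_regression(
    y1=[1, -1, 1, -1], y2=[1, -1, -1, 1], pi_hat=[1, 1, 0, 0]
)
```

The same function could be applied to any of the derived phenotypes the abstract lists (first, last, mean, or slope of SBP) by passing the corresponding per-sibling values.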

    A Note on the Efficiencies of Sampling Strategies in Two-Stage Bayesian Regional Fine Mapping of a Quantitative Trait

    In focused studies designed to follow up associations detected in a genome-wide association study (GWAS), investigators can proceed to fine-map a genomic region by targeted sequencing or dense genotyping of all variants in the region, aiming to identify a functional sequence variant. For the analysis of a quantitative trait, we consider a Bayesian approach to fine-mapping study design that incorporates stratification according to a promising GWAS tag SNP in the same region. Improved cost-efficiency can be achieved when the fine-mapping phase incorporates a two-stage design, with identification of a smaller set of more promising variants in a subsample taken in stage 1, followed by their evaluation in an independent stage 2 subsample. To avoid the potential negative impact of genetic model misspecification on inference, we incorporate genetic model selection based on posterior probabilities for each competing model. Our simulation study shows that, compared to simple random sampling that ignores genetic information from GWAS, tag-SNP-based stratified sample allocation methods reduce the number of variants continuing to stage 2 and are more likely to promote the functional sequence variant into confirmation studies.

    Using an age-at-onset phenotype with interval censoring to compare methods of segregation and linkage analysis in a candidate region for elevated systolic blood pressure

    BACKGROUND: Genetic studies of complex disorders such as hypertension often utilize families selected for this outcome, usually with information obtained at a single time point. Since age-at-onset for diagnosed hypertension can vary substantially between individuals, a phenotype based on long-term follow up in unselected families can yield valuable insights into this disorder for the general population. METHODS: Genetic analyses were conducted using 2884 individuals from the largest 330 families of the Framingham Heart Study. A longitudinal phenotype was constructed using the age at an examination when systolic blood pressure (SBP) first exceeds 139 mm Hg. An interval for age-at-onset was created, since the exact time of onset was unknown. Time-fixed (sex, study cohort) and time-varying (body mass index, daily cigarette and alcohol consumption) explanatory variables were included. RESULTS: Segregation analysis for a major gene effect demonstrated that the major gene effect parameter was sensitive to the choice for age-at-onset. Linkage analyses for age-at-onset were conducted using 1537 individuals in 52 families. Evidence for putative genes identified on chromosome 17 in a previous linkage study using a quantitative SBP phenotype for these data was not confirmed. CONCLUSIONS: Interval censoring for age-at-onset should not be ignored. Further research is needed to explain the inconsistent segregation results between the different age-at-onset models (regressive threshold and proportional hazards) as well as the inconsistent linkage results between the longitudinal phenotypes (age-at-onset and quantitative)

    Genome-wide association analyses of North American Rheumatoid Arthritis Consortium and Framingham Heart Study data utilizing genome-wide linkage results

    The power of genome-wide association studies can be improved by incorporating information from previous study findings, for example, results of genome-wide linkage analyses. Weighted false-discovery rate (FDR) control can incorporate genome-wide linkage scan results into the analysis of genome-wide association data by assigning single-nucleotide polymorphism (SNP) specific weights. Stratified FDR control can also be applied by stratifying the SNPs into high and low linkage strata. We applied these two FDR control methods to the data of the North American Rheumatoid Arthritis Consortium (NARAC) study and the Framingham Heart Study (FHS), combining both association and linkage analysis results. For the NARAC study, we used linkage results from a previous genome scan of rheumatoid arthritis (RA) phenotype. For the FHS, we obtained genome-wide linkage scores from the same 550 k SNP data used for the association analyses of three lipid phenotypes (HDL, LDL, TG). We confirmed some genes previously reported for association with RA and lipid phenotypes. Stratified and weighted FDR methods appear to give improved ranks to some of the replicated SNPs for the RA data, suggesting linkage scan results could provide useful information to improve genome-wide association studies.
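The two FDR strategies named above can be illustrated with a short sketch. It assumes a Benjamini-Hochberg step-up rule as the underlying FDR procedure, with weights rescaled to average 1 and applied as p/w; the abstract does not say how the weights or strata were derived from linkage scores, so the function names, weights, and strata below are hypothetical.

```python
import numpy as np

def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()   # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject

def weighted_bh_reject(pvals, weights, alpha=0.05):
    """Weighted FDR sketch: rescale weights to mean 1, then run BH on p / w."""
    w = np.asarray(weights, dtype=float)
    w = w / w.mean()
    return bh_reject(np.asarray(pvals, dtype=float) / w, alpha)

def stratified_bh_reject(pvals, strata, alpha=0.05):
    """Stratified FDR sketch: run BH separately within each stratum."""
    p = np.asarray(pvals, dtype=float)
    strata = np.asarray(strata)
    reject = np.zeros(p.size, dtype=bool)
    for s in np.unique(strata):
        idx = strata == s
        reject[idx] = bh_reject(p[idx], alpha)
    return reject

# Toy example: the third SNP lies under a (hypothetical) linkage peak.
pvals = [0.001, 0.02, 0.04, 0.5]
plain = bh_reject(pvals)
weighted = weighted_bh_reject(pvals, weights=[1, 1, 3, 1])
stratified = stratified_bh_reject(pvals, strata=["low", "low", "high", "low"])
```

In this toy case the unweighted rule rejects only the first two SNPs, while both the weighted and stratified versions also reject the third, which is the kind of rank improvement the abstract reports for some replicated SNPs.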

    Re-Ranking Sequencing Variants in the Post-GWAS Era for Accurate Causal Variant Identification

    Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an established region of association remains a challenge. Counter-intuitively, certain factors that increase power to detect an associated region can decrease power to localize the causal variant. First, combining GWAS with imputation or low coverage sequencing to achieve the large sample sizes required for high power can have the unintended effect of producing differential genotyping error among SNPs. This tends to bias the relative evidence for association toward better genotyped SNPs. Second, re-use of GWAS data for fine-mapping exploits previous findings to ensure genome-wide significance in GWAS-associated regions. However, using GWAS findings to inform fine-mapping analysis can bias evidence away from the causal SNP toward the tag SNP and SNPs in high LD with the tag. Together these factors can reduce power to localize the causal SNP by more than half. Other strategies commonly employed to increase power to detect association, namely increasing sample size and using higher density genotyping arrays, can, in certain common scenarios, actually exacerbate these effects and further decrease power to localize causal variants. We develop a re-ranking procedure that accounts for these adverse effects and substantially improves the accuracy of causal SNP identification, often doubling the probability that the causal SNP is top-ranked. Application to the NCI BPC3 aggressive prostate cancer GWAS with imputation meta-analysis identified a new top SNP at 2 of 3 associated loci and several additional possible causal SNPs at these loci that may have otherwise been overlooked. This method is simple to implement using R scripts provided on the author's website

    Region-based analysis in genome-wide association study of Framingham Heart Study blood lipid phenotypes

    Because of the high dimensionality of single-nucleotide polymorphism (SNP) data, region-based methods are an attractive approach to the identification of genetic variation associated with a certain phenotype. A common approach to defining regions is to identify the most significant SNPs from a single-SNP association analysis, and then use a gene database to obtain a list of genes proximal to the identified SNPs. Alternatively, regions may be defined statistically, via a scan statistic. After categorizing SNPs as significant or not (based on the single-SNP association p-values), a scan statistic is useful to identify regions that contain more significant SNPs than expected by chance. Important features of this method are that regions are defined statistically, so that there is no dependence on a gene database, and both gene and inter-gene regions can be detected. In the analysis of blood-lipid phenotypes from the Framingham Heart Study (FHS), we compared statistically defined regions with those formed from the top single SNP tests. Although we missed a number of single SNPs, we also identified many additional regions not found as SNP-database regions and avoided issues related to region definition. In addition, analyses of candidate genes for high-density lipoprotein, low-density lipoprotein, and triglyceride levels suggested that associations detected with region-based statistics are also found using the scan statistic approach.
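The scan-statistic idea above can be sketched as a sliding-window count over the ordered significant/non-significant indicators. This is an illustrative simplification, not the method used in the paper: it assumes a fixed window measured in number of SNPs rather than physical distance, and it calibrates the maximum count by permuting SNP positions, which ignores linkage disequilibrium. The names `window_counts` and `scan_statistic` are hypothetical.

```python
import numpy as np

def window_counts(significant, width):
    """Number of significant SNPs in every window of `width` consecutive SNPs."""
    x = np.asarray(significant, dtype=int)
    return np.convolve(x, np.ones(width, dtype=int), mode="valid")

def scan_statistic(significant, width, n_perm=1000, seed=0):
    """Maximum window count, its start index, and a permutation p-value."""
    x = np.asarray(significant, dtype=int)
    counts = window_counts(x, width)
    observed = int(counts.max())
    start = int(counts.argmax())
    rng = np.random.default_rng(seed)
    # Null distribution: shuffle which SNPs are flagged, keep their total fixed.
    null_max = np.array(
        [window_counts(rng.permutation(x), width).max() for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(null_max >= observed)) / (n_perm + 1)
    return observed, start, float(p_value)

# Toy example: three flagged SNPs clustered at positions 2 to 4.
flags = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
observed, start, p_value = scan_statistic(flags, width=3)
```

Because the window is defined only by SNP order, a cluster is reported whether or not it overlaps an annotated gene, which mirrors the abstract's point that no gene database is needed.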

    Resampling methods to reduce the selection bias in genetic effect estimation in genome-wide scans

    Using the simulated data of Problem 2 for Genetic Analysis Workshop 14 (GAW14), we investigated the ability of three bootstrap-based resampling estimators (a shrinkage, an out-of-sample, and a weighted estimator) to reduce the selection bias for genetic effect estimation in genome-wide linkage scans. For the given marker density in the preliminary genome scans (7 cM for microsatellite and 3 cM for SNP), we found that the two sets of markers produce comparable results in terms of power to detect linkage, localization accuracy, and magnitude of test statistic at the peak location. At the locations detected in the scan, application of the three bootstrap-based estimators substantially reduced the upward selection bias in genetic effect estimation for both true and false positives. The relative effectiveness of the estimators depended on the true genetic effect size and the inherent power to detect it. The shrinkage estimator is recommended when the power to detect the disease locus is low. Otherwise, the weighted estimator is recommended

    Recursive partitioning models for linkage in COGA data

    We have developed a recursive-partitioning (RP) algorithm for identifying phenotype and covariate groupings that interact with the evidence for linkage. This data-mining approach for detecting gene × environment interactions uses genotype and covariate data on affected relative pairs to find evidence for linkage heterogeneity across covariate-defined subgroups. We adapted a likelihood-ratio based test of linkage parameterized with relative risks to a recursive partitioning framework, including a cross-validation based deviance measurement for choosing optimal tree size and a bootstrap sampling procedure for choosing robust tree structure. ALDX2 category 5 individuals were considered affected, categories 1 and 3 unaffected, and all others unknown. We sampled non-overlapping affected relative pairs from each family; therefore, we used 144 affected pairs in the RP model. Twenty pair-level covariates were defined from smoking status, maximum drinks, ethnicity, sex, and age at onset. Using the all-pairs score in GENEHUNTER, the nonparametric linkage tests showed no regions with suggestive linkage evidence. However, using the RP model, several suggestive regions were found on chromosomes 2, 4, 6, 14, and 20, with detection of associated covariates such as sex and age at onset.