215 research outputs found

    Combining evidence for association from transmission disequilibrium and case-control studies using single-nucleotide polymorphisms

    The aim of the present analysis is to combine evidence for association from the two most commonly used designs in genetic association analysis, the case-control design and the transmission disequilibrium test (TDT) design. The cases here are affected offspring from nuclear families and are used in both the case-control and TDT designs. As a result, inference from these designs is not independent. We applied a simple logistic regression method for combining evidence for association from case-control and TDT designs to single-nucleotide polymorphism data purchased on a region on chromosome 3, replicate 1 of the Aipotu population. Combining the evidence from the case-control and TDT designs yielded a 5–10% reduction in the standard errors of the relative risk estimates. The authors did not know the results before the analyses were conducted.
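As background for the TDT design mentioned in this abstract, the classic TDT statistic can be sketched as follows. This is only the standard McNemar-type TDT, not the logistic regression method the authors use to combine the two designs. The function name and the transmission counts are hypothetical, chosen purely for illustration.

```python
import math

def tdt_statistic(transmitted, not_transmitted):
    """Classic TDT for one biallelic marker.

    transmitted:     times heterozygous parents passed the test allele
                     to an affected child
    not_transmitted: times heterozygous parents passed the other allele
    Returns (chi-square with 1 df, p-value).
    """
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - not_transmitted) ** 2 / total
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts, not taken from the study.
chi2, p_value = tdt_statistic(transmitted=60, not_transmitted=40)
```

Only heterozygous parents contribute, which is why the statistic depends on the two transmission counts alone.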

    Methods to test for association between a disease and a multi-allelic marker applied to a candidate region

    We report the analysis results of the Genetic Analysis Workshop 14 simulated microsatellite marker dataset, using replicate 50 from the Danacaa population. We applied several methods for association analysis of multi-allelic markers to case-control data to study the association between Kofendrerd Personality Disorder and multi-allelic markers in a candidate region previously identified by the linkage analysis. Evidence for association was found for marker D03S0127 (p < 0.01). The analyses were done without any prior knowledge of the answers.

    Discussion on the paper ‘Statistical contributions to bioinformatics: Design, modelling, structure learning and integration’ by Jeffrey S. Morris and Veerabhadran Baladandayuthapani

    Bioinformatics is an important research area for statisticians. This discussion adds some topics to those covered in the paper, namely statistical contributions to the detection of differentially expressed genes, to protein structure prediction, and to the analysis of highly correlated features in glycomics datasets.

    Does pathway analysis make it easier for common variants to tag rare ones?

    Analyzing sequencing data is difficult because of the low frequency of rare variants, which may result in low power to detect associations. We consider pathway analysis to detect multiple common and rare variants jointly and to investigate whether analysis at the pathway level provides an alternative strategy for identifying susceptibility genes. Available pathway analysis methods for data from genome-wide association studies might not be efficient because these methods are designed to detect common variants. Here, we investigate the performance of several existing pathway analysis methods for sequencing data. In particular, we consider the global test, which does not consider linkage disequilibrium between the variants in a gene. We improve the performance of the global test by assigning larger weights to rare variants, as proposed in the weighted-sum approach. Our conclusion is that straightforward application of pathway analysis is not satisfactory; hence, when common and rare variants are jointly analyzed, larger weights should be assigned to rare variants.
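The idea of giving rare variants larger weights can be illustrated with a minimal sketch. It assumes a simplified inverse-standard-deviation weight, 1/sqrt(q(1-q)) for minor allele frequency q, and a plain weighted sum of minor-allele counts per individual. The exact weights and the way they enter the global test in the paper may differ, and all names and numbers below are hypothetical.

```python
import math

def rare_variant_weights(minor_allele_freqs):
    """Weight each variant by 1 / sqrt(q * (1 - q)).

    The weight grows as the minor allele frequency q shrinks,
    so rare variants contribute more than common ones.
    """
    return [1.0 / math.sqrt(q * (1.0 - q)) for q in minor_allele_freqs]

def weighted_scores(genotypes, weights):
    """Per-individual weighted sum of minor-allele counts (0, 1 or 2)."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]

# Hypothetical data: one common variant (q = 0.30), one rare variant (q = 0.01).
mafs = [0.30, 0.01]
weights = rare_variant_weights(mafs)
genotypes = [
    [1, 0],  # individual carrying one copy of the common variant
    [0, 1],  # individual carrying one copy of the rare variant
]
scores = weighted_scores(genotypes, weights)
```

Under this weighting, the carrier of the rare variant receives the higher score even though both individuals carry a single minor allele.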

    Genomic prediction across populations, using pre-selected markers and differential weight models

    Genomic prediction (GP) in numerically small breeds is limited by the requirement for a large reference set. Across-breed prediction has not been very successful either. Our objective was to test alternative models for across-breed and multi-breed GP in a small Jersey population, utilizing prior information on marker causality. We used data on 596 Jersey bulls from New Zealand and 5503 Holstein bulls from the Netherlands, all of which had deregressed proofs for stature. Two sets of genotype data were used, one containing 357 potential causal markers identified from a multi-breed meta-GWAS on stature (top markers), while the other contained 48,912 markers on the custom 50k chip, excluding the top markers. We used models in which only one GRM (either top markers, 50k, or top plus 50k markers combined) was fitted, and models in which two GRMs (both the top and 50k) were fitted simultaneously, but with different variance components so that the GRMs were weighted differently. Moreover, we estimated the genetic correlation(s) between the breeds (for each GRM) using a multi-trait GP model, which implicitly weights the contribution of one breed’s information to another. Across breed, we observed low accuracies of GP when the 50k markers were fitted alone (0.06) or when the top markers were added to 50k (0.15). Higher accuracy was obtained when only the top markers were fitted (0.21), whereas the highest accuracy was obtained when fitting 50k and top markers simultaneously as two independent GRMs (0.25). Multi-breed prediction outperformed both within-breed and across-breed prediction, with accuracies ranging from 0.34 to 0.45 and the same trend as in across-breed prediction. Based on our results, the best approach for across-breed and multi-breed GP is to fit models that are able to isolate and differentially weight the most important markers for the trait.
    Keywords: across-breed genomic prediction, marker pre-selection, multi-trait model, sequence data

    Gene analysis for longitudinal family data using random-effects models

    We have extended our recently developed 2-step approach for gene-based analysis to the family design and to the analysis of rare variants. The goal of this approach is to study the joint effect of multiple single-nucleotide polymorphisms that belong to a gene. In the first step, the information in a gene is summarized by 2 variables, namely the empirical Bayes estimate capturing common variation and the number of rare variants. By using random effects for the common variants, our approach acknowledges the within-gene correlations. In the second step, the 2 summaries are included as covariates in linear mixed models. To test the null hypothesis of no association, a multivariate Wald test is applied. We analyzed the simulated data sets to assess the performance of the method. Then we applied the method to the real data set and identified a significant association between FRMD4B and diastolic blood pressure (p-value = 8.3 × 10⁻¹²).
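The testing step can be sketched as a 2-degree-of-freedom Wald test on the coefficients of the two gene summaries. This assumes the coefficient estimates and their 2 × 2 covariance matrix have already been obtained from a fitted mixed model. The function name and the numbers are hypothetical, and the abstract does not spell out the authors' exact test construction.

```python
import math

def wald_test_2df(beta, cov):
    """Joint Wald test that both coefficients are zero.

    beta: (b1, b2), estimated effects of the two gene summaries
    cov:  ((v11, v12), (v12, v22)), their estimated covariance matrix
    Returns (W, p-value), where W = beta' cov^{-1} beta is compared
    with a chi-square distribution with 2 df.
    """
    b1, b2 = beta
    (v11, v12), (_, v22) = cov
    det = v11 * v22 - v12 * v12
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form using the closed-form inverse of a 2 x 2 matrix.
    w = (b1 * b1 * v22 - 2.0 * b1 * b2 * v12 + b2 * b2 * v11) / det
    # Survival function of a chi-square with 2 df is exp(-x / 2).
    return w, math.exp(-w / 2.0)

# Hypothetical estimates, not taken from the study.
w_stat, p_val = wald_test_2df(beta=(0.5, 0.2), cov=((0.04, 0.0), (0.0, 0.01)))
```

Testing both coefficients jointly lets a gene be flagged when either its common variation or its rare-variant count is associated with the trait.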

    Integration of gene ontology pathways with North American Rheumatoid Arthritis Consortium genome-wide association data via linear modeling

    We describe an empirical Bayesian linear model for integration of functional gene annotation data with genome-wide association data. Using case-control study data from the North American Rheumatoid Arthritis Consortium and gene annotation data from the Gene Ontology, we illustrate how the method can be used to prioritize candidate genes for further investigation.