    Discussion on the paper ‘Statistical contributions to bioinformatics: Design, modelling, structure learning and integration’ by Jeffrey S. Morris and Veerabhadran Baladandayuthapani

    Bioinformatics is an important research area for statisticians. This discussion adds several topics to the paper, namely statistical contributions to detecting differentially expressed genes, to protein structure prediction, and to the analysis of highly correlated features in glycomics datasets.

    Gene analysis for longitudinal family data using random-effects models

    We have extended our recently developed 2-step approach for gene-based analysis to the family design and to the analysis of rare variants. The goal of this approach is to study the joint effect of multiple single-nucleotide polymorphisms that belong to a gene. In the first step, the information in a gene is summarized by 2 variables, namely the empirical Bayes estimate capturing common variation and the number of rare variants. By using random effects for the common variants, our approach acknowledges the within-gene correlations. In the second step, the 2 summaries are included as covariates in linear mixed models. To test the null hypothesis of no association, a multivariate Wald test is applied. We analyzed the simulated data sets to assess the performance of the method. We then applied the method to the real data set and identified a significant association between FRMD4B and diastolic blood pressure (p-value = 8.3 × 10⁻¹²).
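
The second-step test described above can be illustrated with a minimal sketch. It assumes that the two coefficient estimates (for the empirical Bayes summary and the rare-variant count) and their 2 × 2 covariance matrix have already been obtained from a fitted linear mixed model; the function name and inputs are hypothetical and are not taken from the paper. Under the null hypothesis, the Wald statistic is compared with a chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exp(-W/2).

```python
import numpy as np


def wald_test_2df(beta, cov):
    """Multivariate Wald test for two coefficients (hypothetical helper).

    beta: length-2 vector of estimated coefficients.
    cov:  2 x 2 covariance matrix of those estimates.
    Returns the Wald statistic W = beta' cov^{-1} beta and its p-value.
    """
    beta = np.asarray(beta, dtype=float)
    cov = np.asarray(cov, dtype=float)
    if beta.shape != (2,) or cov.shape != (2, 2):
        raise ValueError("expected 2 coefficients and a 2 x 2 covariance matrix")
    # Solve cov * x = beta instead of inverting cov explicitly.
    w = float(beta @ np.linalg.solve(cov, beta))
    # A chi-square variable with 2 df has survival function exp(-w / 2).
    p_value = float(np.exp(-w / 2.0))
    return w, p_value


if __name__ == "__main__":
    # Made-up estimates purely for illustration, not results from the paper.
    w, p = wald_test_2df([0.8, 0.1], [[0.04, 0.0], [0.0, 0.09]])
    print(f"W = {w:.3f}, p = {p:.3g}")
```

The closed-form p-value holds only for exactly 2 tested coefficients; with a different number of summaries, a general chi-square tail function would be needed.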

    Genetic, household and spatial clustering of leprosy on an island in Indonesia: a population-based study

    Background: It is generally accepted that genetic factors play a role in susceptibility to both leprosy per se and leprosy type, but only a few studies have attempted to quantify this. Estimating the contribution of genetic factors to clustering of leprosy within families is difficult since family members often share the same environment. The first aim of this study was to test which correlation structure (genetic, household or spatial) gives the best explanation for the distribution of leprosy patients and seropositive persons; the second was to quantify the role of genetic factors in the occurrence of leprosy and seropositivity. Methods: The three correlation structures were proposed for population data (n = 560), collected on a geographically isolated island highly endemic for leprosy, to explain the distribution of leprosy per se, leprosy type and persons harbouring Mycobacterium leprae-specific antibodies. Heritability estimates and risk ratios for siblings were calculated to quantify the genetic effect. Leprosy was clinically diagnosed and specific anti-M. leprae antibodies were measured using ELISA. Results: For leprosy per se in the total population, the genetic correlation structure fitted best. In the population with relatively stable household status (persons under 21 years and above 39 years), all structures were significant. For multibacillary leprosy (MB), genetic factors seemed more important than for paucibacillary leprosy. Seropositivity could be explained best by the spatial model, but the genetic model was also significant. Heritability was 57% for leprosy per se and 31% for seropositivity. Conclusion: Genetic factors seem to play an important role in the clustering of patients with a more advanced form of leprosy, and they could explain more than half of the total phenotypic variance.

    Community deworming alleviates geohelminth-induced immune hyporesponsiveness

    In cross-sectional studies, chronic helminth infections have been associated with immunological hyporesponsiveness that can affect responses to unrelated antigens. To study the immunological effects of deworming, we conducted a cluster-randomized, double-blind, placebo-controlled trial in Indonesia and assigned 954 households to receive albendazole or placebo once every 3 mo for 2 y. Helminth-specific and nonspecific whole-blood cytokine responses were assessed in 1,059 subjects of all ages, whereas phenotyping of regulatory molecules was undertaken in 121 school-aged children. All measurements were performed before and at 9 and 21 mo after initiation of treatment. Anthelmintic treatment resulted in significant increases in proinflammatory cytokine responses to Plasmodium falciparum-infected red blood cells (PfRBCs) and mitogen, with the largest effect on TNF responses to PfRBCs at 9 mo: estimate [95% confidence interval], 0.37 [0.21–0.53], P value over time (Ptime) < 0.0001. Although the frequency of regulatory T cells did not change after treatment, there was a significant decline in the expression of the inhibitory molecule cytotoxic T lymphocyte-associated antigen 4 (CTLA-4) on CD4+ T cells of albendazole-treated individuals, –0.060 [–0.107 to –0.013] and –0.057 [–0.105 to –0.008] at 9 and 21 mo, respectively; Ptime = 0.017. This trial shows the capacity of helminths to up-regulate inhibitory molecules and to suppress proinflammatory immune responses in humans. This could help to explain the inferior immunological responses to vaccines and lower prevalence of inflammatory diseases in low- compared with high-income countries.

    Evaluation of O2PLS in Omics data integration

    Background: Rapid computational and technological developments have made large amounts of omics data available at different biological levels. It is becoming clear that simultaneous data analysis methods are needed for better interpretation and understanding of the underlying systems biology. Different methods have been proposed for this task, among them Partial Least Squares (PLS) related methods. To also deal with orthogonal variation, that is, systematic variation in one data set that is unrelated to the other, we consider the Two-way Orthogonal PLS (O2PLS): an integrative data analysis method which is capable of modeling systematic variation, while providing more parsimonious models aiding interpretation. Results: A simulation study to assess the performance of O2PLS showed positive results in both low and higher dimensions. More noise (50% of the data) only affected the systematic part estimates. A data analysis was conducted using data on metabolomics and transcriptomics from a large Finnish cohort (DILGOM). A previous sequential study, using the same data, showed significant correlations between the Lipo-Leukocyte (LL) module and lipoprotein metabolites. The O2PLS results were in agreement with these findings, identifying almost the same set of co-varying variables. Moreover, our integrative approach identified other associated genes and metabolites, while taking into account systematic variation in the data. Including orthogonal components enhanced overall fit, but the orthogonal variation was difficult to interpret. Conclusions: Simulations showed that the O2PLS estimates were close to the true parameters in both low and higher dimensions. In the presence of more noise (50%), the orthogonal part estimates could not distinguish well between joint and unique variation. The joint estimates were not systematically affected. Simultaneous analysis with O2PLS on metabolome and transcriptome data showed that the LL module, together with VLDL and HDL metabolites, was important for the relation between the metabolome and the transcriptome. This is in agreement with an earlier study. In addition, more gene expression variables and metabolites were identified as important for the joint covariation.

    Genomic prediction across populations, using pre-selected markers and differential weight models

    Genomic prediction (GP) in numerically small breeds is limited due to the requirement for a large reference set. Across-breed prediction has not been very successful either. Our objective was to test alternative models for across-breed and multi-breed GP in a small Jersey population, utilizing prior information on marker causality. We used data on 596 Jersey bulls from New Zealand and 5503 Holstein bulls from the Netherlands, all of which had deregressed proofs for stature. Two sets of genotype data were used, one containing 357 potential causal markers identified from a multi-breed meta-GWAS on stature (top markers), while the other contained 48,912 markers on the custom 50k chip, excluding the top markers. We used models in which only one genomic relationship matrix (GRM; either top markers, 50k, or top plus 50k markers combined) was fitted, and models in which two GRMs (both the top and 50k) were fitted simultaneously but with different variance components, so that the GRMs were weighted differently. Moreover, we estimated the genetic correlation(s) between the breeds (for each GRM) using a multi-trait GP model, which implicitly weights the contribution of one breed’s information to another. Across breed, we observed low accuracies of GP when the 50k markers were fitted alone (0.06) or when the top markers were added to 50k (0.15). Higher accuracy was obtained when only the top markers were fitted (0.21), whereas the highest accuracy was obtained when fitting 50k and top markers simultaneously as two independent GRMs (0.25). Multi-breed prediction outperformed both within- and across-breed prediction with accuracies ranging from 0.34 to 0.45, with the same trend as in across-breed prediction. Based on our results, the best approach for across- and multi-breed GP is to fit models that are able to isolate and differentially weight the most important markers for the trait. Keywords: Across breed genomic prediction, marker pre-selection, multi-trait model, sequence data
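
The idea of fitting two GRMs with separate variance components can be sketched as follows. The abstract does not say how the GRMs were computed, so this sketch assumes VanRaden's first method with allele frequencies estimated from the sample; the function names, the 0/1/2 genotype coding, and the variance values are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def vanraden_grm(genotypes):
    """Genomic relationship matrix, assuming VanRaden's first method.

    genotypes: (individuals x markers) array coded as 0, 1 or 2 copies
    of the counted allele. At least one marker must be polymorphic,
    otherwise the scaling factor is zero.
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0          # sample allele frequency per marker
    z = m - 2.0 * p                   # centre each marker column
    scale = 2.0 * np.sum(p * (1.0 - p))
    return z @ z.T / scale


def weighted_genetic_covariance(grm_top, grm_50k, var_top, var_50k):
    """Genetic covariance implied by two GRMs with separate variances.

    Each variance component acts as the weight of its own GRM.
    """
    return var_top * np.asarray(grm_top) + var_50k * np.asarray(grm_50k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy genotypes: 5 animals, 4 'top' markers and 20 '50k' markers.
    top = rng.integers(0, 3, size=(5, 4))
    chip = rng.integers(0, 3, size=(5, 20))
    k = weighted_genetic_covariance(
        vanraden_grm(top), vanraden_grm(chip), var_top=0.6, var_50k=0.4
    )
    print(k.shape)
```

In practice the two variance components would be estimated from the data (for example by REML) rather than fixed in advance; the fixed values here only show how the weighting enters the covariance.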

    Non-homologous end-joining pathway associated with occurrence of myocardial infarction: gene set analysis of genome-wide association study data

    Purpose: DNA repair deficiencies have been postulated to play a role in the development and progression of cardiovascular disease (CVD). The hypothesis is that DNA damage accumulating with age may induce cell death, which promotes formation of unstable plaques. Defects in DNA repair mechanisms may therefore increase the risk of CVD events. We examined whether the joint effect of common genetic variants in 5 DNA repair pathways may influence the risk of CVD events. Methods: The PLINK set-based test was used to examine the association of each DNA repair pathway with myocardial infarction (MI) in GWAS data of 866 subjects of the GENetic DEterminants of Restenosis (GENDER) study and 5,244 subjects of the PROspective Study of Pravastatin in the Elderly at Risk (PROSPER) study. We included the main DNA repair pathways (base excision repair, nucleotide excision repair, mismatch repair, homologous recombination and non-homologous end-joining (NHEJ)) in the analysis. Results: The NHEJ pathway was associated with the occurrence of MI in both GENDER (P = 0.0083) and PROSPER (P = 0.014). This association was mainly driven by genetic variation in the MRE11A gene (P = 0.0001 in GENDER and P = 0.002 in PROSPER). The homologous recombination pathway was associated with MI in GENDER only (P = 0.011); for the other pathways, no associations were observed. Conclusion: This is the first study analyzing the joint effect of common genetic variation in DNA repair pathways and the risk of CVD events, demonstrating an association between the NHEJ pathway and MI in 2 different cohorts.