644 research outputs found

    Bayesian spatial and temporal epidemiology of non-communicable diseases and mortality

    Spatial epidemiology combines spatial statistical modelling and disease epidemiology for studying geographic variation in mortality and morbidity. The effects of putative risk factors may be examined using ecological regression models. On the other hand, age-period-cohort models can be used to study the variation of mortality and morbidity through time. Bayesian hierarchical statistical models offer a flexible framework for these studies and enable the estimation of uncertainties in the results. The models are usually estimated using computer-intensive Markov chain Monte Carlo simulations. In this dissertation the first four publications present practical epidemiological studies on geographic variation in non-communicable diseases in Finland. In the last publication we study the long-term variation in all-cause mortality in several European countries. New statistical models are developed for these studies. This work provides new epidemiological information on the geographic variation of acute myocardial infarctions (AMI), ischaemic stroke and parkinsonism in Finland. An extended model for studying shared and disease-specific geographic variation is presented using data on AMI and ischaemic stroke incidence. Existing results on the inverse association of water hardness and AMI are refined. New models for interpolation of geochemical data with non-detected values are presented with case studies using real data. Finally, the Bayesian age-period-cohort model is extended with versatile interactions and better prediction ability. The model is then used to study long-term variation in mortality in Europe.

    Informative Bayesian Neural Network Priors for Weak Signals

    Funding Information: This work was supported by the Academy of Finland (Flagship programme: Finnish Center for Artificial Intelligence, FCAI, grants 319264, 292334, 286607, 294015, 336033, 315896, 341763), and EU Horizon 2020 (INTERVENE, grant no. 101016775). We also acknowledge the computational resources provided by the Aalto Science-IT Project from Computer Science IT.
    †Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Finland, [email protected] ‡Finnish Institute for Health and Welfare (THL), Finland §Institute for Molecular Medicine Finland, FIMM-HiLIFE, Helsinki, Finland ¶Department of Computer Science, University of Manchester, UK ‖Equal contribution.
    Publisher Copyright: © 2022 International Society for Bayesian Analysis
    Encoding domain knowledge into the prior over the high-dimensional weight space of a neural network is challenging but essential in applications with limited data and weak signals. Two types of domain knowledge are commonly available in scientific applications: 1. feature sparsity (fraction of features deemed relevant); 2. signal-to-noise ratio, quantified, for instance, as the proportion of variance explained. We show how to encode both types of domain knowledge into the widely used Gaussian scale mixture priors with Automatic Relevance Determination. Specifically, we propose a new joint prior over the local (i.e., feature-specific) scale parameters that encodes knowledge about feature sparsity, and a Stein gradient optimization to tune the hyperparameters in such a way that the distribution induced on the model’s proportion of variance explained matches the prior distribution. We show empirically that the new prior improves prediction accuracy compared to existing neural network priors on publicly available datasets and in a genetics application where signals are weak and sparse, often outperforming even computationally intensive cross-validation for hyperparameter tuning. Peer reviewed

    Distribution and Medical Impact of Loss-of-Function Variants in the Finnish Founder Population

    Exome sequencing studies in complex diseases are challenged by the allelic heterogeneity, large number and modest effect sizes of associated variants on disease risk and the presence of large numbers of neutral variants, even in phenotypically relevant genes. Isolated populations with recent bottlenecks offer advantages for studying rare variants in complex diseases as they have deleterious variants that are present at higher frequencies as well as a substantial reduction in rare neutral variation. To explore the potential of the Finnish founder population for studying low-frequency (0.5-5%) variants in complex diseases, we compared exome sequence data on 3,000 Finns to the same number of non-Finnish Europeans and discovered that, despite having fewer variable sites overall, the average Finn has more low-frequency loss-of-function variants and complete gene knockouts. We then used several well-characterized Finnish population cohorts to study the phenotypic effects of 83 enriched loss-of-function variants across 60 phenotypes in 36,262 Finns. Using a deep set of quantitative traits collected on these cohorts, we show 5 associations (p<5×10⁻⁸) including splice variants in LPA that lowered plasma lipoprotein(a) levels (P = 1.5×10⁻¹¹⁷). Through accessing the national medical records of these participants, we evaluate the LPA finding via Mendelian randomization and confirm that these splice variants confer protection from cardiovascular disease (OR = 0.84, P = 3×10⁻⁴), demonstrating for the first time the correlation between very low levels of LPA in humans with potential therapeutic implications for cardiovascular diseases. 
More generally, this study articulates substantial advantages for studying the role of rare variation in complex phenotypes in founder populations like the Finns and by combining a unique population genetic history with data from large population cohorts and centralized research access to National Health Registers. Public Library of Science open access

    Modelling spatial patterns in host-associated microbial communities

    Microbial communities exhibit spatial structure at different scales, due to constant interactions with their environment and dispersal limitation. While this spatial structure is often considered in studies focusing on free-living environmental communities, it has received less attention in the context of host-associated microbial communities or microbiota. The wider adoption of methods accounting for spatial variation in these communities will help to address open questions in basic microbial ecology as well as realize the full potential of microbiome-aided medicine. Here, we first overview known factors affecting the composition of microbiota across diverse host types and at different scales, with a focus on the human gut as one of the most actively studied microbiota. We outline a number of topical open questions in the field related to spatial variation and patterns. We then review the existing methodology for the spatial modelling of microbiota. We suggest that methodology from related fields, such as systems biology and macro-organismal ecology, could be adapted to obtain more accurate models of spatial structure. We further posit that methodological developments in the spatial modelling and analysis of microbiota could in turn broadly benefit theoretical and applied ecology and contribute to the development of novel industrial and clinical applications. Peer reviewed

    Gene-gene interaction detection with deep learning

    The extent to which genetic interactions affect observed phenotypes is generally unknown because current interaction detection approaches only consider simple interactions between top SNPs of genes. We introduce an open-source framework for increasing the power of interaction detection by considering all SNPs within a selected set of genes and complex interactions between them, beyond only the currently considered multiplicative relationships. In brief, the relation between SNPs and a phenotype is captured by a neural network, and the interactions are quantified by Shapley scores between hidden nodes, which are gene representations that optimally combine information from the corresponding SNPs. Additionally, we design a permutation procedure tailored for neural networks to assess the significance of interactions, which outperformed existing alternatives on simulated datasets with complex interactions, and in a cholesterol study on the UK Biobank it detected nine interactions which replicated on an independent FINRISK dataset. An open-source framework combines deep learning and permutations of gene interaction neural networks to detect complex gene-gene interactions and their significance in contributions to phenotypes. Peer reviewed

    Polygenic Risk Scores Predict Hypertension Onset and Cardiovascular Risk

    Although genetic risk scores have been used to predict hypertension, their utility in the clinical setting remains uncertain. Our study comprised N=218 792 FinnGen participants (mean age 58 years, 56% women) and N=22 624 well-phenotyped FINRISK participants (mean age 50 years, 53% women). We used public genome-wide association data to compute polygenic risk scores (PRSs) for systolic and diastolic blood pressure (BP). Using time-to-event analysis, we then assessed (1) the association of BP PRSs with hypertension and cardiovascular disease (CVD) in FinnGen and (2) the improvement in model discrimination when combining BP PRSs with the validated 4- and 10-year clinical risk scores for hypertension and CVD in FINRISK. In FinnGen, compared with having a 20 to 80 percentile range PRS, a PRS in the highest 2.5% conferred 2.3-fold (95% CI, 2.2-2.4) risk of hypertension and 10.6 years (95% CI, 9.9-11.4) earlier hypertension onset. In subgroup analyses, this risk was only 1.6-fold (95% CI, 1.5-1.7) for late-onset hypertension (age >= 55 years) but 2.8-fold (95% CI, 2.6-2.9) for early-onset hypertension (age … Peer reviewed

    Associations between circulating metabolites and arterial stiffness


    Matrix metalloproteinase-8 and tissue inhibitor of matrix metalloproteinase-1 predict incident cardiovascular disease events and all-cause mortality in a population-based cohort

    Background: Extracellular matrix degrading proteases and their regulators play an important role in atherogenesis and subsequent plaque rupture leading to acute cardiovascular manifestations. Design and methods: In this prospective cohort study, we investigated the prognostic value of circulating matrix metalloproteinase-8, tissue inhibitor of matrix metalloproteinase-1 concentrations, the ratio of matrix metalloproteinase-8/tissue inhibitor of matrix metalloproteinase-1 and, for comparison, myeloperoxidase and C-reactive protein concentrations for incident cardiovascular disease endpoints. The population-based FINRISK97 cohort comprised 7928 persons without cardiovascular disease at baseline. The baseline survey included a clinical examination and blood sampling. During a 13-year follow-up the endpoints were ascertained through national healthcare registers. The associations of measured biomarkers with the endpoints, including cardiovascular disease event, coronary artery disease, acute myocardial infarction, stroke and all-cause death, were analysed using Cox regression models. Discrimination and reclassification models were used to evaluate the clinical implications of the biomarkers. Results: Serum tissue inhibitor of matrix metalloproteinase-1 and C-reactive protein concentrations were significantly associated with increased risk for all studied endpoints. Additionally, matrix metalloproteinase-8 concentration was associated with the risk for a coronary artery disease event, myocardial infarction and death, and myeloperoxidase concentration with the risk for cardiovascular disease events, stroke and death. The only significant association for the matrix metalloproteinase-8/tissue inhibitor of matrix metalloproteinase-1 ratio was observed with the risk for myocardial infarction. Adding tissue inhibitor of matrix metalloproteinase-1 to the established risk profile improved risk discrimination of myocardial infarction (p=0.039) and death (p=0.001). Both matrix metalloproteinase-8 (5.2%, p<0.001) and tissue inhibitor of matrix metalloproteinase-1 (12.9%, p<0.001) provided significant clinical net reclassification improvement for death. Conclusions: Serum matrix metalloproteinase-8 and tissue inhibitor of matrix metalloproteinase-1 can be considered as biomarkers of incident cardiovascular disease events and death. Peer reviewed