580 research outputs found

    Polygenic Risk Score for Cardiovascular Diseases in Artificial Intelligence Paradigm

    Cardiovascular disease (CVD)-related mortality and morbidity heavily strain society. The relationship between external risk factors and our genetics has not been well established. It is widely acknowledged that environmental influence and individual behaviours play a significant role in CVD vulnerability, leading to the development of polygenic risk scores (PRS). We employed the PRISMA search method to locate pertinent research and literature to extensively review artificial intelligence (AI)-based PRS models for CVD risk prediction. Furthermore, we analyzed and compared conventional vs. AI-based solutions for PRS. We summarized the recent advances in our understanding of the use of AI-based PRS for risk prediction of CVD. Our study proposes three hypotheses: i) Multiple genetic variations and risk factors can be incorporated into AI-based PRS to improve the accuracy of CVD risk prediction. ii) AI-based PRS for CVD circumvents the drawbacks of conventional PRS calculators by incorporating a larger variety of genetic and non-genetic components, allowing for more precise and individualised risk estimations. iii) Using AI approaches, it is possible to significantly reduce the dimensionality of huge genomic datasets, resulting in more accurate and effective disease risk prediction models. Our study highlighted that the AI-PRS model outperformed traditional PRS calculators in predicting CVD risk. Furthermore, using AI-based methods to calculate PRS may increase the precision of risk predictions for CVD and have significant ramifications for individualized prevention and treatment plans.
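
For orientation, the conventional PRS that the abstract above contrasts with AI-based models is usually computed as a weighted sum of risk-allele dosages. The sketch below is illustrative only and is not taken from the reviewed paper: the function name, the variant identifiers and the weights are hypothetical, and real calculators add steps such as variant filtering and normalisation that are omitted here.

```python
from typing import Mapping


def polygenic_risk_score(dosages: Mapping[str, float],
                         weights: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant).

    Variants without a genotype for this individual are skipped.
    """
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)


# Hypothetical per-variant effect sizes (e.g. log odds ratios from a GWAS).
weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.125}
# Hypothetical risk-allele counts for one individual.
dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_risk_score(dosages, weights)
print(score)  # 0.5*2 + (-0.25)*1 + 0.125*0 = 0.75
```

An AI-based model of the kind the review discusses would replace this fixed linear sum with a learned function that can also take non-genetic inputs.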

    Genetics and genomics of aortic form and function

    The thoracic aorta is a dynamic organ which adapts and remodels throughout life. Thoracic aortic size, shape and function are important contributors to both cardiovascular health and disease and risk of aortic disease. A complex interaction of environmental, genetic and haemodynamic factors is mediated by cells of the aortic wall. This thesis presents aortic phenotyping, genotyping and genome-wide associations of aortic traits in a large healthy cohort of 1218 volunteers. This is the largest study to report normal parameters for healthy thoracic aortic size, shape and function derived from cardiovascular magnetic resonance imaging. Anthropometric and cardiovascular risk factors such as age, gender, body fat mass and lipid profile are identified as significant determinants of aortic phenotype. The work suggests that cardiovascular risk factors could impair normal adaptive aortic remodelling with age. Genome-wide association studies of aortic dimensions and function identify new common variants, genes and pathways which could be important in aortic biology and cardiovascular risk. These include genes involved in cardiovascular development (e.g. PCDH7 and SON associated with aortic root diameter), autonomic cardiovascular responses (e.g. GABA receptor genes associated with aortic root diameter), fibrosis (e.g. ACTC1, AGTR1 associated with ascending aortic distensibility, BAMBI and MYOD associated with descending aortic distensibility) and obesity (e.g. ARID5B and IRX3 associated with aortic pulse wave velocity and ascending aortic area respectively). Multiple regulatory pathways, including TGF-β and IGF signalling (IGF1R, IGF2R), are identified which are associated with aortic dimensions and function. Joint trait analysis of aortic root dimensions identifies a new genome-wide significant association with TENM4, a key driver of early mesodermal development, and suggestive association with PTN, which is functionally related and plays a key role in angiogenesis.
The primary analyses are complemented by exploratory assessment of rare genetic variation in bicuspid aortic valve (BAV) using panel sequencing in 177 patients. Rare variants might cause BAV or modify its phenotype, but the clinical utility of panel sequencing remains poor. A further complementary study investigates the interaction of haemodynamics with aortic cellular phenotype, using microarray assessment of aortic endothelial cell transcriptomic response to shear stress pattern. Several genes of interest in atherosclerosis and aortic disease are differentially expressed with shear stress pattern, such as FABP4, ANGPT2, FILIP1, KIT, DCHS1, TGFBR3 and LOX. This work yields new insights into aortic phenotype, identifies key loci which might determine aortic traits and explores the complex interdependence of genetics, haemodynamics and environmental variables in aortic biology.

    DeepWAS: Multivariate genotype-phenotype associations by directly integrating regulatory information using deep learning

    Genome-wide association studies (GWAS) identify genetic variants associated with quantitative traits or disease. However, GWAS never directly link variants to regulatory mechanisms, which, in turn, are typically inferred during post-hoc analyses. In parallel, a recent deep learning-based method allows for prediction of regulatory effects per variant on currently up to 1,000 cell type-specific chromatin features. We here describe “DeepWAS”, a new approach that directly integrates predictions of these regulatory effects of single variants into a multivariate GWAS setting. As a result, single variants associated with a trait or disease are, by design, coupled to their impact on a chromatin feature in a cell type. Up to 40,000 regulatory single-nucleotide polymorphisms (SNPs) were associated with multiple sclerosis (MS, 4,888 cases and 10,395 controls), major depressive disorder (MDD, 1,475 cases and 2,144 controls), and height (5,974 individuals) to each identify 43-61 regulatory SNPs, called deepSNPs, which are shown to reach at least nominal significance in large GWAS. MS- and height-specific deepSNPs resided in active chromatin and introns, whereas MDD-specific deepSNPs were located mostly in intragenic regions and repressive chromatin states. We found deepSNPs to be enriched in public or cohort-matched expression and methylation quantitative trait loci and demonstrate the potential of the DeepWAS method to directly generate testable functional hypotheses based on genotype data alone. DeepWAS is an innovative GWAS approach with the power to identify individual SNPs in non-coding regions with gene regulatory capacity with a joint contribution to disease risk. DeepWAS is available at https://github.com/cellmapslab/DeepWAS

    Molecular marker combinations preoperatively differentiate benign from malignant thyroid tumors

    Background. The initial presentation of thyroid carcinoma is through a nodule, and the current best way to evaluate it is fine-needle aspiration (FNA). However, many thyroid FNAs are not definitively benign or malignant, yielding an indeterminate or suspicious diagnosis in 10 to 25% of FNAs. The development of molecular initial diagnostic tests for evaluating a thyroid nodule is needed in order to define the optimal surgical approach for patients with an uncertain diagnosis pre- and intra-operatively. A large amount of information has been collected on the molecular tumorigenesis of thyroid cancer. A low expression of the KIT gene has been reported during the transformation of normal thyroid epithelium to papillary carcinoma, suggesting a possible role of the gene in the differentiation of thyroid tissue rather than in the proliferation. Moreover, several gene expression studies have shown differential gene expression signatures between malignant and benign thyroid tumors. The aim of the current study was to determine the diagnostic utility of a molecular assay based on the gene expression of a panel of molecular markers (KIT, SYNGR2, C21orf4, Hs.296031, DDI2, CDH1, LSM7, TC1, NATH) plus BRAF mutational status to distinguish benign from malignant thyroid neoplasms. Methods. The mRNA expression level of 9 genes (KIT, SYNGR2, C21orf4, Hs.296031, DDI2, CDH1, LSM7, TC1, NATH) was analyzed by quantitative Real-Time PCR (qPCR) in 93 FNA cytological samples. To evaluate the diagnostic utility of all the genes analyzed, we assessed the area under the curve (AUC) for each gene individually and in combination. BRAF exon 15 status was determined by capillary sequencing. A gene expression computational model (Neural Network Bayesian Classifier) was built and a multiple-variable analysis was then performed to analyze the correlation between the markers. Results.
While looking at KIT expression, we have found a highly preferential decrease rather than increase in KIT transcript levels in malignant thyroid lesions compared to the benign ones. To explore the diagnostic utility of KIT expression in thyroid nodules, its expression values were divided into four arbitrarily defined classes, with class I characterized by the complete silencing of the gene. Class I and IV represented the two most informative groups, with 100% of the samples found malignant or benign respectively. The molecular analysis was proven by ROC (receiver operating characteristic) analysis to be highly specific and sensitive, improving the cytological diagnostic accuracy by 15%. The AUC for each significant marker was further assessed and ranged between 0.625 and 0.900, thus all the significant markers, alone and in combination, can be used to distinguish between malignant and benign FNA samples. The classifier made up of KIT, CDH1, LSM7, C21orf4, DDI2, TC1, Hs.296031 and BRAF had a predictive power of 88.8%. It proved to be useful for risk stratification of the most critical cytological group of the indeterminate lesions for which there is the greatest need for accurate diagnostic markers. Conclusion. The genetic classification obtained with such a model is highly accurate and may provide a tool to overcome the difficulties in today’s pre-operative diagnosis of thyroid malignancies.
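
The per-marker AUC reported above can be read as the probability that a randomly chosen malignant sample receives a higher score than a randomly chosen benign one. The following is a minimal sketch of that pairwise definition, not the study's analysis code; the expression values are invented for illustration. Because the abstract reports that KIT expression decreases in malignant lesions, the example scores each sample by negated expression.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical relative expression values of a single marker.
malignant_expr = [1.0, 2.0, 3.5]
benign_expr = [5.0, 4.0, 3.0]

# Lower expression suggests malignancy, so negate to get "higher = more malignant".
marker_auc = auc([-e for e in malignant_expr], [-e for e in benign_expr])
print(round(marker_auc, 3))  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to a marker with no discriminative value and 1.0 to perfect separation, which puts the reported range of 0.625 to 0.900 in context.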

    Omic characterisation of placental development and phenotype

    Gene expression is influenced by precise epigenetic mechanisms. In the context of pregnancy proper placental development and pregnancy outcome are dependent upon these mechanisms. These are poorly understood in the placenta and historically have not been investigated. In many biomedical research fields epigenetic modifications such as DNA methylation have been proven to be an effective biomarker. However, this has yet to be shown in the reproduction research field. The overall aim of this thesis was to investigate new epigenetic mechanisms in placental development and to identify novel biomarkers for phenotype prediction. This thesis firstly focuses on sex-biased gene expression in multiple human tissues to identify targets of sexual dimorphism. Secondly, it investigates novel transcripts in the placenta and finally focuses on using DNA methylation as a biomarker. Firstly, the research has identified potential new gene targets and mechanisms which may explain sexual dimorphism in many phenotypic traits and diseases. These results suggest that sex-biased gene expression is dynamic and tissue specific. It also highlights the need to consider sex as a biological variable in biomedical research and to address the lack of female representation in many studies. Secondly, by performing a de novo transcript analysis on the placenta this thesis has identified new non-coding RNAs. These placental transcripts were also found to be specific to the placenta and were differentially expressed across gestation and in preeclampsia compared to uncomplicated pregnancies. This suggests these transcripts may be involved in placental development and may have roles in the pathogenesis of preeclampsia. Identifying novel placenta specific transcripts has uncovered new research opportunities involving the placenta. 
There are potentially hundreds of other unannotated transcripts in the placenta which may have roles in placental development and may be crucial to a successful pregnancy outcome. Thirdly, using DNA methylation as a biomarker has led to the development of two key prediction models. The first one used the level of methylation at 62 cytosine-phosphate-guanosine (CpG) sites to determine the gestational age of a placenta. This computational tool was also used to identify placental aging in placentas from women with early onset preeclampsia. This tool points to potential mechanisms underpinning placental aging which may have an impact on pregnancy complications. The second prediction tool has identified 84 methylated sites in the methylome of maternal circulating leukocytes which can distinguish five pregnancy outcomes. This tool has potential clinical application to identify women at risk of a pregnancy complication. This would enable clinicians to intervene and potentially prevent or reduce morbidity and mortality for mother and child. In summary, this thesis has focused on sex differences in gene expression and DNA methylation in placental development. It has also shown that DNA methylation has potential as an effective biomarker in the field of reproduction research. Thesis (Ph.D.) (Research by Publication) -- University of Adelaide, Adelaide Medical School, 201

    Scalable Feature Selection Applications for Genome-Wide Association Studies of Complex Diseases

    Personalized medicine will revolutionize our capabilities to combat disease. Working toward this goal, a fundamental task is the deciphering of genetic variants that are predictive of complex diseases. Modern studies, in the form of genome-wide association studies (GWAS), have afforded researchers the opportunity to reveal new genotype-phenotype relationships through the extensive scanning of genetic variants. These studies typically contain over half a million genetic features for thousands of individuals. Examining this with methods other than univariate statistics is a challenging task requiring advanced algorithms that are scalable to the genome-wide level. In the future, next-generation sequencing studies (NGS) will contain an even larger number of common and rare variants. Machine learning-based feature selection algorithms have been shown to have the ability to effectively create predictive models for various genotype-phenotype relationships. This work explores the problem of selecting genetic variant subsets that are the most predictive of complex disease phenotypes through various feature selection methodologies, including filter, wrapper and embedded algorithms. The examined machine learning algorithms were demonstrated to not only be effective at predicting the disease phenotypes, but also to do so efficiently through the use of computational shortcuts. While much of the work was able to be run on high-end desktops, some work was further extended so that it could be implemented on parallel computers, helping to assure that the methods will also scale to NGS data sets. Further, these studies analyzed the relationships between various feature selection methods and demonstrated the need for careful testing when selecting an algorithm.
It was shown that there is no universally optimal algorithm for variant selection in GWAS, but rather methodologies need to be selected based on the desired outcome, such as the number of features to be included in the prediction model. It was also demonstrated that without proper model validation, for example using nested cross-validation, the models can result in overly optimistic prediction accuracies and decreased generalization ability. It is through the implementation and application of machine learning methods that one can extract predictive genotype–phenotype relationships and biological insights from genetic data sets.
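
To make the nested cross-validation point concrete, here is a small standard-library sketch, not taken from the thesis. A simple filter (difference in class means) selects features and a nearest-centroid rule classifies. The number of selected features is tuned in an inner loop that only sees the outer training fold, so the outer test fold never influences feature selection or tuning. The data, the classifier and all function names are assumptions chosen for illustration.

```python
import random


def folds(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def take(seq, idx):
    return [seq[i] for i in idx]


def fit(X, y, n_features):
    """Filter step: keep the features with the largest class-mean difference."""
    cols = range(len(X[0]))
    mu = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X, y) if t == c]
        mu[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    keep = sorted(cols, key=lambda j: -abs(mu[1][j] - mu[0][j]))[:n_features]
    return keep, mu


def accuracy(model, X, y):
    """Nearest-centroid prediction restricted to the selected features."""
    keep, mu = model
    hits = 0
    for x, t in zip(X, y):
        dist = {c: sum((x[j] - mu[c][j]) ** 2 for j in keep) for c in (0, 1)}
        hits += min(dist, key=dist.get) == t
    return hits / len(X)


def nested_cv(X, y, grid, outer_k=3, inner_k=2):
    outer_scores = []
    for test_idx in folds(len(X), outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        Xtr, ytr = take(X, train_idx), take(y, train_idx)

        def inner_score(n_features):
            # Inner loop: tune on splits of the outer training data only.
            scores = []
            for val in folds(len(Xtr), inner_k):
                val_set = set(val)
                rest = [i for i in range(len(Xtr)) if i not in val_set]
                model = fit(take(Xtr, rest), take(ytr, rest), n_features)
                scores.append(accuracy(model, take(Xtr, val), take(ytr, val)))
            return sum(scores) / len(scores)

        best = max(grid, key=inner_score)
        model = fit(Xtr, ytr, best)  # refit on the whole outer training fold
        outer_scores.append(accuracy(model, take(X, test_idx), take(y, test_idx)))
    return outer_scores


# Hypothetical toy data: feature 0 separates the classes, features 1-4 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(24)]
X = [[10 * t + rng.random()] + [rng.random() for _ in range(4)] for t in y]
scores = nested_cv(X, y, grid=[1, 2, 5])
print(scores)
```

The toy data are deliberately easy, so the outer scores say nothing about real GWAS performance; the point is only the structure, in which selecting features on the full data set before splitting would leak information into the reported accuracy.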

    Biologically informed risk scoring in schizophrenia based on genome-wide omics data

    Extensive efforts in characterizing the biological architecture of schizophrenia have moved psychiatric research closer towards clinical application. As our understanding of psychiatric illness is slowly shifting towards a conceptualization as dimensional constructs that cut across traditional diagnostic boundaries, opportunities for personalized medicine applications that are afforded by the application of advanced data science methods on the increasingly available, large-scale and multimodal data repositories are starting to be more broadly recognized. A particularly intriguing phenomenon is the discrepancy between the high heritability of schizophrenia and the difficulty in identifying predictive genetic signatures, for which polygenic risk scores of common variants that explain approximately 18% of illness-associated variance remain the gold standard. A substantial body of research points towards two lines of investigation that may lead to a significant advance, resolve at least in part the ‘missing heritability’ phenomenon, and potentially provide the basis for more predictive, personalized clinical tools. First, it is paramount to better understand the impact of environmental factors on illness risk and elucidate the biology underlying their impact on altered brain function in schizophrenia. This thesis aims to close a major gap in our understanding of the multivariate, epigenetic landscape associated with schizophrenia, its interaction with polygenic risk and its association with DLPFC-HC connectivity, a well-established and robust neural intermediate phenotype of schizophrenia. As a basis for this, we have developed a novel biologically-informed machine learning framework by incorporating systems-level biological domain knowledge, i.e., gene ontological pathways, entitled ‘BioMM’ using genome-wide DNA methylation data obtained from whole blood samples. 
An epigenetic poly-methylation score termed ‘PMS’ was estimated at the individual level using BioMM, trained and validated using a total of 2230 whole-blood samples and 244 post-mortem brain samples. The pathways contributing most to this PMS were strongly associated with synaptic, neural and immune system-related functions. The identified PMS could be successfully validated in two independent cohorts, demonstrating the robust generalizability of the identified model. Furthermore, the PMS could significantly differentiate patients with schizophrenia from healthy controls when predicted in DLPFC post-mortem brain samples, suggesting that the epigenetic landscape of schizophrenia is to a certain extent shared between the central and peripheral tissues. Importantly, the peripheral PMS was associated with an intermediate neuroimaging phenotype (i.e., DLPFC-HC functional connectivity) in two independent imaging samples under the working memory paradigm. However, we did not find sufficient evidence for a combined genetic and epigenetic effect on brain function by integrating PRS derived from GWAS data, which suggested that DLPFC-HC coupling was predominantly impacted by environmental risk components, rather than polygenic risk of common variants. Furthermore, the epigenetic signature was not associated with GWAS-derived risk scores, implying that the observed epigenetic effect likely did not depend on the underlying genetics, and this was further substantiated by investigation of data from unaffected first-degree relatives of patients with SCZ, BD, MDD and autism. In summary, the characterization of PMS through the systems-level integration of multimodal data elucidates the multivariate impact of epigenetic effects on schizophrenia-relevant brain function and its interdependence with genetic illness risk.
Second, the limited predictive value of polygenic risk scores and the difficulty in identifying associations with heritable neural differences found in schizophrenia may be due to the possibility that the manifestation of the functional consequences of genetic risk is modulated by spatio-temporal as well as sex-specific effects. To address this, this thesis identifies sex-differences in the spatio-temporal expression trajectories during human development of genes that showed significant prefrontal co-expression with schizophrenia risk genes during the fetal phase and adolescence, consistent with a core developmental hypothesis of schizophrenia. More specifically, it was found that during these two time-periods, prefrontal expression was significantly more variable in males compared to females, a finding that could be validated in an independent data source and that was specific for schizophrenia compared to other psychiatric as well as somatic illnesses. Similar to the epigenetic differences described above, the genes underlying the risk-associated gene expression differences were significantly linked to synaptic function. Notably, individual genes with male-specific variability increases were distinct between the fetal phase and adolescence, potentially suggesting different risk associated mechanisms that converge on the shared synaptic involvement of these genes. These results provide substantial support to the hypothesis that the functional consequences of genetic risk show spatiotemporal specificity. Importantly, the temporal specificity was linked to the fetal phase and adolescence, time-periods that are thought to be of predominant importance for the brain-functional consequences of environmental risk exposure. Therefore, the presented results provide the basis for future studies exploring the polygenic risk architecture and its interaction with environmental effects in a multivariate and spatiotemporally stratified manner. 
In summary, the work presented in this thesis describes multivariate, multimodal approaches to characterize the (epi-)genetic basis of schizophrenia, explores its association with a well-established neural intermediate phenotype of the illness and investigates the spatio-temporal specificity of schizophrenia-relevant gene expression effects. This work expands our knowledge of the complex biology underlying schizophrenia and provides the basis for the future development of more predictive biological algorithms that may aid in advancing personalized medicine in psychiatry.