5 research outputs found

    Diverse Convergent Evidence in the Genetic Analysis of Complex Disease: Coordinating Omic, Informatic, and Experimental Evidence to Better Identify and Validate Risk Factors

    In omic research, such as genome-wide association studies, researchers seek to repeat their results in other datasets to reduce false positive findings and thus provide evidence for the existence of true associations. Unfortunately, this standard validation approach cannot completely eliminate false positive conclusions, and it can also mask many true associations that might otherwise advance our understanding of pathology. These issues raise the question: how can we increase the amount of knowledge gained from high-throughput genetic data? To address this challenge, we present an approach that complements standard statistical validation methods by drawing attention to both potential false negative and false positive conclusions, as well as providing broad information for directing future research. The Diverse Convergent Evidence approach (DiCE) we propose integrates information from multiple sources (omics, informatics, and laboratory experiments) to estimate the strength of the available corroborating evidence supporting a given association. This process is designed to yield an evidence metric that has utility when etiologic heterogeneity, variable risk factor frequencies, and a variety of observational data imperfections might lead to false conclusions. We provide proof-of-principle examples in which DiCE identified strong evidence for associations that have established biological importance, when standard validation methods alone did not provide support. If used as an adjunct to standard validation methods, this approach can leverage multiple distinct data types to improve genetic risk factor discovery and validation, promote effective science communication, and guide future research directions.

    Predicting lifespan-extending chemical compounds for C. elegans with machine learning and biologically interpretable features

    Recently, there has been a growing interest in the development of pharmacological interventions targeting ageing, as well as in the use of machine learning for analysing ageing-related data. In this work, we use machine learning methods to analyse data from DrugAge, a database of chemical compounds (including drugs) modulating lifespan in model organisms. To this end, we created four types of datasets for predicting whether or not a compound extends the lifespan of C. elegans (the most frequent model organism in DrugAge), using four different types of predictive biological features, based on: compound-protein interactions, interactions between compounds and proteins encoded by ageing-related genes, and two types of terms annotated for proteins targeted by the compounds, namely Gene Ontology (GO) terms and physiology terms from WormBase’s Phenotype Ontology. To analyse these datasets, we used a combination of feature selection methods in a data pre-processing phase and the well-established random forest algorithm for learning predictive models from the selected features. In addition, we interpreted the most important features in the two best models in light of the biology of ageing. One noteworthy feature was the GO term “Glutathione metabolic process”, which plays an important role in cellular redox homeostasis and detoxification. We also predicted the most promising novel compounds for extending lifespan from a list of previously unlabelled compounds. These include nitroprusside, which is used as an antihypertensive medication. Overall, our work opens avenues for future work in employing machine learning to predict novel life-extending compounds.

    Neuroimmune interactions related to development of affective behavioural disturbances in neuropathic pain states

    Nerve damage leads to the development of disabling neuropathic pain in susceptible individuals, where patients present with pain as well as co-morbid behavioural changes, such as anhedonia, decreased motivation and depression. The pathophysiology of neuropathic pain remains unknown; however, accumulating evidence suggests that neuroimmune interactions play a key role in its pathogenesis and in the development of co-morbid behavioural disturbances. Complex regional pain syndrome (CRPS) is a debilitating neuropathic disorder where trauma to a limb results in chronic pain. Mass cytometry (CyTOF) was used to systematically analyse circulating immune cells with a panel of 38 phenotypic and activation markers in the blood of CRPS patients and healthy controls. CyTOF revealed an expansion and increased activation of signalling pathways in several distinct populations of central memory CD8+ and CD4+ T lymphocytes. Regarding emotional state, CD8+ T lymphocytes were correlated with clinical scores for stress and CD4+ Th1 lymphocytes were correlated with clinical scores for anxiety. There was also a reduction in circulating dendritic cells (DC), indicative of DC tissue trafficking and potential involvement in lymphocyte activation. These data highlight a pathogenic role for T lymphocyte mediated chronic inflammation in CRPS and co-morbid behavioural disabilities. To further explore the role of neuroimmune interactions in the development of neuropathic pain and co-morbid behavioural changes, a rodent nerve injury model was used to evaluate whether individual differences in radial maze behaviour and neuroimmune interactions in the hippocampus (HP) and medial prefrontal cortex (mPFC) occurred in rats after sciatic nerve chronic constriction injury (CCI). CCI reduced mechanical withdrawal thresholds in all rats, whilst pellet-seeking behaviours were altered in some but not all rats. One group, termed ‘No effect’, had no behavioural changes compared to sham rats. 
Another group, termed ‘Acute effect’, had a temporary alteration to their exploration pattern, displaying more risk-assessment behaviour in the early phase post-injury. In a third group, termed ‘Lasting effect’, exploratory behaviours were remarkably different for the entire post-injury period, showing a withdrawal from pellet-seeking. Immunohistochemical analysis throughout the dorso-ventral axis of the HP revealed that the withdrawal from pellet-seeking observed in Lasting effect rats was concomitant with distinct glial-cytokine-neuronal adaptations within the contralateral ventral HP, including: increased expression of IL-1β and MCP-1; astrocyte atrophy and decreased area in the dentate gyrus (DG); and reactive microglia and increased FosB/ΔFosB expression in the cornu ammonis (CA) subfield. These data highlight that glial-cytokine-neuronal adaptations in the ventral HP may mediate individual differences in radial maze behaviour following CCI. A follow-up experiment explored whether pre-injury learning on the maze altered the effects of nerve injury on exploratory behaviour and spatial memory function. Whilst CCI again produced three distinct patterns of behaviour on the radial maze, Acute effect rats had improved working spatial memory outcomes after CCI. This indicates that the increased risk-assessment behaviours employed by Acute effect rats after injury may be considered advantageous when pellet-seeking, as they reduce unnecessary exploration during reward-seeking. The behavioural disruptions observed in Lasting effect rats were accompanied by neuroimmune activation within the contralateral ventral HP and mPFC. Multiplex immunoassay analysis revealed an increase in IL-1β, IL-6 and MCP-1 within the contralateral mPFC and ventral HP. 
Detailed immunohistochemical analysis of the mPFC and HP revealed an increased expression of IL-6, increased phospho-p38 MAPK expression in neurons and microglia, and a shift to a reactive microglial morphology in the caudal prelimbic and infralimbic cortex, ventral CA1 and DG. There was also a reduction in astrocyte cell size and BDNF expression in the contralateral ventral DG. These data provide further evidence that neuroinflammation in the mPFC and ventral HP may influence individual differences in radial maze behaviour following CCI. Collectively, these data provide evidence that individual differences in circulating immune cell activation and neuroimmune signature in the interconnected ventral HP-mPFC circuitry may play a significant role in the divergent behavioural trajectories in the neuropathic pain state, contributing to co-morbid behavioural changes in susceptible individuals.

    Pareto optimal-based feature selection framework for biomarker identification

    Numerous computational techniques have been applied to identify the vital features of gene expression datasets with the aim of increasing the efficiency of biomedical applications. The classification of microarray data samples is an important task for correctly recognising diseases by identifying a small but clinically meaningful set of genes. However, identification of disease-representative genes or biomarkers in high-dimensional microarray gene-expression datasets remains a challenging task. This thesis investigates the viability of Pareto optimisation in identifying relevant subsets of biomarkers in high-dimensional microarray datasets. A robust Pareto Optimal based feature selection framework for biomarker discovery is then proposed. First, a two-stage feature selection approach using ensemble filter methods and Pareto Optimality is proposed. The integration of the multi-objective approach employing Pareto Optimality starts with well-known filter methods applied to various microarray gene-expression datasets. Although filter methods provide ranked lists of features, they do not give information about optimum subsets of features, which in this study are genes. To address this limitation, Pareto Optimality is incorporated along with filter methods. The robustness of the proposed framework is demonstrated on several well-known microarray gene expression datasets, and it is shown to achieve comparable accuracy, or up to 100% predictive accuracy, with comparatively fewer features. Better performance results are obtained in comparison with other approaches, which are single-objective approaches. Furthermore, cross-validation and k-fold approaches are integrated into the framework, which can mitigate the over-fitting problem, so that the gene selection process is more accurate under various conditions. The proposed framework is then developed in several phases. 
The Sequential Forward Selection method (SFS) is first used to represent wrapper techniques, and the developed Pareto Optimality based framework is applied multiple times and tested on different data types. Given the nature of most real-life data, imbalanced classes are examined using the proposed framework. The classifier achieves similarly high performance across the different cases using the proposed Pareto Optimal based feature selection framework, which has a novel structure for imbalanced classes. Comparable or better gene subset sizes are obtained using the proposed framework. Finally, handling missing data within the proposed framework is investigated, and it is demonstrated that different data imputation methods can also help in the effective integration of various feature selection methods.
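
To illustrate the idea of combining several filter rankings through Pareto optimality, the following is a minimal sketch and not the thesis's actual framework. It assumes each gene has one score per filter method and that higher is better on every score; the gene names, the scores, and the function names `dominates` and `pareto_front` are hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (higher scores are better)."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of all features not dominated by any other feature."""
    front = []
    for name, own in scores.items():
        dominated = any(
            dominates(other, own)
            for other_name, other in scores.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical scores from two filter methods (illustrative values only).
gene_scores = {
    "gene_a": (0.9, 0.2),
    "gene_b": (0.5, 0.8),
    "gene_c": (0.4, 0.7),  # dominated by gene_b
    "gene_d": (0.9, 0.1),  # dominated by gene_a
}

front = pareto_front(gene_scores)
print(front)
```

In this toy example only `gene_a` and `gene_b` survive, since each is best on one of the two scores and neither is beaten on both. The pairwise comparison is quadratic in the number of features, so a real microarray dataset would likely need a pre-filtering step or a faster non-dominated sorting routine.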

    Computational methods for breath metabolomics in clinical diagnostics

    Human odors and vapors have long been known for their diagnostic power. The analysis of the metabolic composition of human breath and odors therefore creates the opportunity for a non-invasive tool for clinical diagnostics. Innovative analytical technologies to capture the metabolic profile of a patient’s breath are available, such as ion mobility spectrometry coupled to a multi-capillary column. However, we lack automated systems to process, analyse and evaluate large clinical studies of human exhaled air. To fill this gap, a number of computational challenges need to be addressed. For instance, breath studies generate large amounts of heterogeneous data that require automated preprocessing, peak detection and identification as a basis for a sophisticated follow-up analysis. In addition, generalizable statistical evaluation frameworks for the detection of breath biomarker profiles that are robust enough to be employed in routine clinical practice are necessary, in particular because breath metabolomics, like other clinical diagnostic technologies, is susceptible to specific confounding factors and background noise. Moreover, specific manifestations of disease stage and progression may strongly influence the breathomics profiles. This thesis addresses these challenges to move towards more automation and generalization in clinical breath research. In particular, I present methods to support the search for biomarker profiles that enable non-invasive detection of diseases, treatment optimization and prognosis, to provide a new powerful tool for precision medicine.
It has long been known that body odor and breath can provide clues to a person's state of health. An analysis of exhaled air at the molecular level therefore promises new approaches to the diagnosis of specific diseases. Innovative technologies such as ion mobility spectrometry combined with a multi-capillary column make it possible, for the first time, to generate high-resolution metabolic profiles of exhaled air within a very short time. At present, however, the computer-based applications needed to organise and evaluate the generated data automatically are lacking. A particular challenge is posed by the large volumes of heterogeneous clinical and analytical data and their processing. Like other high-throughput methods, breath analysis is subject to background signals such as ambient air and to other factors that distort the results, for example diet, lifestyle or medication. This calls for modern methods from statistics and machine learning to identify robust and generalizable disease markers. Particular attention is also given to diseases whose metabolic fingerprint can change drastically over the course of the illness. The aim of my work is to find solutions to the problems described and thereby to support the search for practically usable disease markers with bioinformatics methods. Across several studies and software projects, fundamental methods were presented, evaluated and established, in particular with regard to the development of computer-based systems for the automatic analysis of breath data. The methods presented lay the foundation for the non-invasive detection of diseases, for the optimization and prognosis of treatments and, beyond that, for a further tool of personalised medicine.