40 research outputs found

    Identification of a novel clinical phenotype of severe malaria using a network-based clustering approach

    The parasite Plasmodium falciparum is the main cause of severe malaria (SM). Despite treatment with antimalarial drugs, more than 400,000 deaths are reported every year, mainly in African children. The diversity of clinical presentations associated with SM highlights important differences in disease pathogenesis that often require specific therapeutic options. The clinical heterogeneity of SM is largely unresolved. Here we report a network-based analysis of clinical phenotypes associated with SM in 2,915 Gambian children admitted to hospital with Plasmodium falciparum malaria. We used a network-based clustering method which revealed a strong correlation between disease heterogeneity and mortality. The analysis identified four distinct clusters of SM and respiratory distress that departed from the WHO definition. Patients in these clusters characteristically presented with liver enlargement and high concentrations of brain natriuretic peptide (BNP), giving support to the potential role of circulatory overload and/or right-sided heart failure as a mechanism of disease. The role of heart failure is controversial in SM and our work suggests that standard clinical management may not be appropriate. We find that our clustering can be a powerful data exploration tool to identify novel disease phenotypes and therapeutic options to reduce malaria-associated mortality

    Evaluation of Lymphocyte Response to the Induced Oxidative Stress in a Cohort of Ageing Subjects, including Semisupercentenarians and Their Offspring

    The production of reactive oxygen species (ROS) may promote immunosenescence if not counterbalanced by the antioxidant systems. Cell membranes, proteins, and nucleic acids become the target of ROS and progressively lose their structure and functions. This process could lead to an impairment of the immune response. However, little is known about the capability of the immune cells of elderly individuals to dynamically counteract the oxidative stress. Here, the response of the main lymphocyte subsets to the induced oxidative stress in semisupercentenarians (CENT), their offspring (OFF), elderly controls (CTRL), and young individuals (YO) was analyzed using flow cytometry. The results showed that the ratio of the ROS levels between the induced and noninduced (I/NI) oxidative stress conditions was higher in CTRL and OFF than in CENT and YO, in almost all T, B, and NK subsets. Moreover, the ratio of reduced glutathione levels between I/NI conditions was higher in OFF and CENT compared to the other groups in almost all the subsets. Finally, we observed significant correlations between the response to the induced oxidative stress and the degree of methylation in specific genes on the oxidative stress pathway. Globally, these data suggest that the capability to buffer dynamic changes in the oxidative environment could be a hallmark of longevity in humans

    Consensus Clustering of temporal profiles for the identification of metabolic markers of pre-diabetes in childhood (EarlyBird 73)

    In longitudinal clinical studies, methodologies available for the analysis of multivariate data with multivariate methods are relatively limited. Here, we present Consensus Clustering (CClust), a new computational method based on clustering of time profiles and posterior identification of correlation between clusters and predictors. Subjects are first clustered in groups according to a response variable temporal profile, using a robust consensus-based strategy. To discover which of the remaining variables are associated with the resulting groups, a non-parametric hypothesis test is performed between groups at every time point, and then the results are aggregated according to the Fisher method. Our approach is tested through its application to the EarlyBird cohort database, which contains temporal variations of clinical, metabolic, and anthropometric profiles in a population of 150 children followed up annually from age 5 to age 16. Our results show that our consensus-based method is able to overcome the problem of the approach-dependent results produced by current clustering algorithms, producing groups defined according to Insulin Resistance (IR) and biological age (Tanner Score). Moreover, it provides meaningful biological results confirmed by hypothesis testing with most of the main clinical variables. These results position CClust as a valid alternative for the analysis of multivariate longitudinal data.
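
    The aggregation step named in this abstract (Fisher's method for combining per-time-point p-values) can be illustrated with a minimal sketch. This is not the CClust implementation: the function name `fisher_combined_pvalue` and the example p-values are hypothetical, and the sketch assumes the combined tests are independent, which may not hold for repeated measurements on the same subjects.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine p-values with Fisher's method (assumes independent tests).

    Hypothetical helper for illustration; not taken from CClust.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher statistic X = -2 * sum(ln p_i); under the null hypothesis it
    # follows a chi-square distribution with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For even degrees of freedom 2k the chi-square survival function is
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!, so no external library is needed.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


if __name__ == "__main__":
    # Made-up p-values for one variable tested at four time points.
    per_time_point = [0.04, 0.20, 0.03, 0.11]
    print(fisher_combined_pvalue(per_time_point))
```

    With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed-form survival function.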

    Precision identification of high-risk phenotypes and progression pathways in severe malaria without requiring longitudinal data

    More than 400,000 deaths from severe malaria (SM) are reported every year, mainly in African children. The diversity of clinical presentations associated with SM indicates important differences in disease pathogenesis that require specific treatment, and this clinical heterogeneity of SM remains poorly understood. Here, we apply tools from machine learning and model-based inference to harness large-scale data and dissect the heterogeneity in patterns of clinical features associated with SM in 2904 Gambian children admitted to hospital with malaria. This quantitative analysis reveals features predicting the severity of individual patient outcomes, and the dynamic pathways of SM progression, notably inferred without requiring longitudinal observations. Bayesian inference of these pathways allows us to assign quantitative mortality risks to individual patients. By independently surveying expert practitioners, we show that this data-driven approach agrees with and expands the current state of knowledge on malaria progression, while simultaneously providing a data-supported framework for predicting clinical risk.

    DifFUZZY: a novel clustering algorithm for systems biology

    Current studies of the highly complex pathobiology and molecular signatures of human disease require the analysis of large sets of high-throughput data, from clinical to genetic expression experiments, containing a wide range of information types. A number of computational techniques are used to analyse such high-dimensional bioinformatics data. In this thesis we focus on the development of a novel soft clustering technique, DifFUZZY, a fuzzy clustering algorithm applicable to a larger class of problems than other soft clustering approaches. This method is better at handling datasets that contain clusters that are curved, elongated or are of different dispersion. We show how DifFUZZY outperforms a number of frequently used clustering algorithms using a number of examples of synthetic and real datasets. Furthermore, a quality measure based on the diffusion distance developed for DifFUZZY is presented, which is employed to automate the choice of its main parameter. We later apply DifFUZZY and other techniques to data from a clinical study of children from The Gambia with different types of severe malaria. The first step was to identify the most informative features in the dataset which allowed us to separate the different groups of patients. This led to us reproducing the World Health Organisation classification for severe malaria syndromes and obtaining a reduced dataset for further analysis. In order to validate these features as relevant for malaria across the continent and not only in The Gambia, we used a larger dataset for children from different sites in Sub-Saharan Africa. With the use of a novel network visualisation algorithm, we identified pathobiological clusters from which we made and subsequently verified clinical hypotheses. We finish by presenting conclusions and future directions, including image segmentation and clustering time-series data. We also suggest how we could bridge data modelling with bioinformatics by embedding microarray data into cell models. Towards this end we take as a case study a multiscale model of the intestinal crypt using a cell-vertex model.

    Comprehensive and Scalable Highly Automated MS-Based Proteomic Workflow for Clinical Biomarker Discovery in Human Plasma

    Over the past decade, mass spectrometric performance has greatly improved in terms of sensitivity, dynamic range, and speed. By contrast, only limited progress has been accomplished with regard to automation, throughput, and robustness of the proteomic sample preparation process upstream of mass spectrometry. The present work delivers an optimized analysis of human plasma samples in both small preclinical and large clinical studies, enabled by the development of a highly automated quantitative proteomic workflow. Several iterative evaluation and validation steps were performed before process "design freeze" and development completion. A robotic liquid handling workflow and platform (including reduction, alkylation, digestion, TMT labeling, pooling, and purification) were shown to provide better quantitative trueness and precision than manual operation at the bench. Depletion of the most abundant human plasma proteins and subsequent buffer exchange were also developed and integrated. Finally, 96 identical pooled human plasma samples were prepared in a 96-well plate format, and each sample was individually subjected to our developed workflow. This test revealed increased throughput and robustness compared with to-date published manual or less automated workflows. Our workflow is ready-to-use for future (pre-) clinical studies. We expect our work to facilitate, accelerate, and improve clinical proteomic discovery in human blood plasma

    DifFUZZY: A fuzzy spectral clustering algorithm for complex data sets

    Motivation: Soft (fuzzy) clustering techniques are often used in the study of high-dimensional data sets, such as microarray and other high-throughput bioinformatics data. The most widely used method is the Fuzzy C-means algorithm (FCM), but it can present difficulties when dealing with some data sets. Results: A spectral fuzzy clustering algorithm, DifFUZZY, applicable to a larger class of clustering problems than other fuzzy clustering algorithms is developed. Examples of data sets (synthetic and real) for which this method outperforms other frequently used algorithms are presented, including two benchmark biological data sets, a genetic expression data set and a data set that contains taxonomic measurements. This method is better than traditional fuzzy clustering algorithms at handling data sets that are "curved", elongated or those which contain clusters of different dispersion.