69 research outputs found

    Correlates of the molecular vaginal microbiota composition of African women.

    BACKGROUND: Sociodemographic, behavioral and clinical correlates of the vaginal microbiome (VMB) as characterized by molecular methods have not been adequately studied. A VMB dominated by bacteria other than lactobacilli may cause inflammation, which may facilitate HIV acquisition and other adverse reproductive health outcomes. METHODS: We characterized the VMB of women in Kenya, Rwanda, South Africa and Tanzania (KRST) using a 16S rDNA phylogenetic microarray. Cytokines were quantified in cervicovaginal lavages. Potential sociodemographic, behavioral, and clinical correlates were also evaluated. RESULTS: Three hundred thirteen samples from 230 women were available for analysis. Five VMB clusters were identified: one cluster each dominated by Lactobacillus crispatus (KRST-I) and L. iners (KRST-II), and three clusters not dominated by a single species but containing multiple (facultative) anaerobes (KRST-III/IV/V). Women in clusters KRST-I and II had lower mean concentrations of interleukin (IL)-1α (p < 0.001) and granulocyte colony-stimulating factor (G-CSF) (p = 0.01), but higher concentrations of interferon-γ-induced protein 10 (IP-10) (p < 0.01), than women in clusters KRST-III/IV/V. A lower proportion of women in cluster KRST-I tested positive for bacterial sexually transmitted infections (STIs; p-trend = 0.07) and urinary tract infection (UTI; p = 0.06), and a higher proportion of women in clusters KRST-I and II had vaginal candidiasis (p-trend = 0.09), but these associations did not reach statistical significance. Women who reported unusual vaginal discharge were more likely to belong to clusters KRST-III/IV/V (p = 0.05). CONCLUSION: Vaginal dysbiosis in African women was significantly associated with vaginal inflammation; the associations with increased prevalence of STIs and UTI, and decreased prevalence of vaginal candidiasis, should be confirmed in larger studies.

    Improvement of Insulin Sensitivity after Lean Donor Feces in Metabolic Syndrome Is Driven by Baseline Intestinal Microbiota Composition

    The intestinal microbiota has been implicated in insulin resistance, although evidence regarding causality in humans is scarce. We therefore studied the effect of lean donor (allogenic) versus own (autologous) fecal microbiota transplantation (FMT) to male recipients with the metabolic syndrome. Whereas we did not observe metabolic changes at 18 weeks after FMT, insulin sensitivity at 6 weeks after allogenic FMT was significantly improved, accompanied by altered microbiota composition. We also observed changes in plasma metabolites such as gamma-aminobutyric acid and show that metabolic response upon allogenic FMT (defined as improved insulin sensitivity 6 weeks after FMT) is dependent on decreased fecal microbial diversity at baseline. In conclusion, the beneficial effects of lean donor FMT on glucose metabolism are associated with changes in intestinal microbiota and plasma metabolites and can be predicted based on baseline fecal microbiota composition.

    A Statistical Framework For Nutriomics Data Analysis

    Nutriomics is a new discipline that investigates the relationship between nutrition and health through the use of high-throughput omics technologies. However, the inherent complexity of nutriomics data poses several challenges for data analysis. This thesis introduces nutriomics and the statistical challenges associated with its analysis, and proposes statistical modelling and machine learning methods to tackle three main challenges: non-linearity, high dimensionality, and data heterogeneity. To deal with these challenges, we first propose a statistical framework, which we coin LC-N2G, to test whether nutrition intake and omics features of interest are significantly associated. We use public data as an example to show LC-N2G's ability to discover non-linear associations between nutrition and gene expression. We then propose a statistical method, coined eNODAL, to cluster high-dimensional omics features based on how they respond to nutrition intake. The application of eNODAL to a mouse proteomics nutrition study shows that it can identify interpretable clusters of proteins with similar responses to diet and drug treatment. Finally, a statistical model, which we call NEMoE, is proposed to uncover the heterogeneous interplay among diet, omics, and health outcomes. We use a microbiome Parkinson's disease (PD) study to illustrate the method and show that NEMoE is able to identify diet-specific microbial signatures of PD. Overall, this thesis proposes statistical methods to analyze nutriomics data and outlines possible future extensions of the research. The methods proposed here could help researchers better understand the complex relationships between nutrition and health, ultimately leading to improved health outcomes.
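The abstract does not spell out how LC-N2G tests for non-linear association, so the following is only a generic sketch of the underlying idea: score a flexible (here, cubic) fit of an omics feature against a nutrient, and compare the score to a permutation null of "no association". The function and variable names are hypothetical.

```python
# Hypothetical sketch of a permutation test for (possibly non-linear)
# association between a nutrient intake variable and an omics feature.
# This is NOT the LC-N2G algorithm itself, only the general principle.
import numpy as np

def nonlinear_association_pvalue(x, y, n_perm=2000, seed=0):
    """P-value for association between x and y, scored by the R^2 of a
    cubic polynomial fit, with the null generated by permuting y."""
    rng = np.random.default_rng(seed)

    def r2(a, b):
        coef = np.polyfit(a, b, 3)
        resid = b - np.polyval(coef, a)
        return 1.0 - resid.var() / b.var()

    observed = r2(x, y)
    null = np.array([r2(x, rng.permutation(y)) for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

# Simulated data: a non-monotonic response invisible to linear correlation.
rng = np.random.default_rng(1)
intake = rng.uniform(-2, 2, 200)
expression = np.sin(2 * intake) + 0.3 * rng.normal(size=200)
print(nonlinear_association_pvalue(intake, expression))
```

With a genuine non-linear signal the observed R² far exceeds the permutation null, so the p-value is near its minimum of 1/(n_perm + 1).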

    Unsupervised approaches for time-evolving graph embeddings with application to human microbiome

    More and more diseases have been found to be strongly correlated with disturbances in the microbiome constitution, e.g., obesity, diabetes, and even some types of cancer. Advances in high-throughput omics technologies have made it possible to directly analyze the human microbiome and its impact on human health and physiology. Microbial composition is usually observed over long periods of time and the interactions between its members are explored. Numerous studies have used microbiome data to accurately differentiate disease states and understand the differences in microbiome profiles between healthy and ill individuals. However, most of them focus on various statistical approaches, omitting the microbe-microbe interactions among a large number of microbiome species that, in principle, drive microbiome dynamics. Constructing and analyzing time-evolving graphs is needed to understand how microbial ecosystems respond to a range of distinct perturbations, such as antibiotic exposure, diseases, or other general dynamic properties. This becomes especially challenging due to the many complex interactions among microbes and their metastable dynamics. The key to addressing this challenge lies in representing time-evolving graphs constructed from microbiome data as fixed-length, low-dimensional feature vectors that preserve the original dynamics. We therefore propose two unsupervised approaches that map the time-evolving graph constructed from microbiome data into a low-dimensional space in which the original dynamics, such as the number of metastable states and their locations, are preserved. The first method relies on the spectral analysis of transfer operators, such as the Perron-Frobenius or Koopman operator, combined with graph kernels. These components enable us to extract topological information, such as complex interactions among species, from the time-evolving graph and to take into account the dynamic changes in the human microbiome composition.
Further, we study how deep learning techniques can contribute to the study of a complex network of microbial species. The method consists of two key components: 1) the Transformer, a state-of-the-art architecture for sequential data, which learns both structural patterns of the time-evolving graph and temporal changes of the microbiome system, and 2) contrastive learning, which allows the model to learn a low-dimensional representation while maintaining metastability in the low-dimensional space. Finally, this thesis addresses an important challenge in microbiome data: identifying which species, or interactions of species, are responsible for or affected by the changes the microbiome undergoes from one state (healthy) to another (diseased or antibiotic exposure). Using interpretability techniques for deep learning models, originally developed to assess the trustworthiness of such models, we can extract structural information about the time-evolving graph pertaining to particular metastable states.
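The spectral connection between transfer operators and metastability can be illustrated on a toy system. This is a minimal sketch, not the thesis's method: estimate a row-stochastic transition matrix from a discretized trajectory and count eigenvalues close to 1, which correspond to slowly mixing, metastable sets.

```python
# Illustrative sketch: a finite-dimensional transfer-operator approximation
# (a transition matrix) estimated from a state sequence; eigenvalues near 1
# indicate metastable sets. Toy data, not real microbiome dynamics.
import numpy as np

def transition_matrix(traj, n_states):
    """Row-stochastic transition matrix estimated from a state sequence."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(traj[:-1], traj[1:]):
        counts[a, b] += 1
    counts += 1e-9  # guard against division by zero for unvisited states
    return counts / counts.sum(axis=1, keepdims=True)

def n_metastable(P, gap=0.9):
    """Count eigenvalues with modulus above `gap` (1 is always included)."""
    return int(np.sum(np.abs(np.linalg.eigvals(P)) > gap))

# Toy system: two metastable blocks {0,1} and {2,3} with rare switches.
rng = np.random.default_rng(0)
traj, state = [], 0
for _ in range(5000):
    if rng.random() < 0.02:                     # rare jump between blocks
        state = (state + 2) % 4
    else:                                       # fast mixing inside a block
        state = (state // 2) * 2 + rng.integers(2)
    traj.append(state)

P = transition_matrix(np.array(traj), 4)
print(n_metastable(P))  # two eigenvalues above 0.9: two metastable states
```

The two within-block eigenvalues are close to 0, while the block-switching mode sits near 1 - 2(0.02) = 0.96, so the spectrum cleanly separates the two metastable states.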

    Adapting Community Detection Approaches to Large, Multilayer, and Attributed Networks

    Networks have become a common data mining tool to encode relational definitions between a set of entities. Whether studying biological correlations or communication between individuals in a social network, network analysis tools enable interpretation, prediction, and visualization of patterns in the data. Community detection is a well-developed subfield of network analysis, where the objective is to cluster nodes into 'communities' based on their connectivity patterns. There are many useful and robust approaches for identifying communities in a single, moderately sized network, but more complicated networks, containing additional or simply larger amounts of information, pose challenges. In this thesis, we address three types of challenging network data and show how to adapt standard community detection approaches to handle them. In particular, we focus on networks that are large, attributed, and multilayer. First, we present a method for identifying communities in multilayer networks, where there exist multiple relational definitions between a set of nodes. Next, we provide a pre-processing technique for reducing the size of large networks, where standard community detection approaches might give inconsistent results or be prohibitively slow. We then introduce an extension to a probabilistic model for community structure that takes node attribute information into account, and develop a test to quantify the extent to which connectivity and attribute information align. Finally, we demonstrate example applications of these methods in biological and social networks. This work helps to advance the understanding of network clustering, network compression, and the joint modeling of node attributes and network connectivity.
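As a small, self-contained illustration of connectivity-based clustering (a classical building block behind many community detection methods, not the thesis's specific algorithms): spectral bipartitioning splits nodes by the sign of the Fiedler vector, the eigenvector of the second-smallest eigenvalue of the graph Laplacian.

```python
# Minimal spectral-bipartitioning sketch on a toy graph with two dense
# cliques joined by a single bridge edge. Pure numpy; toy data only.
import numpy as np

def spectral_communities(adj):
    """Two-way split of an undirected graph from its adjacency matrix."""
    degree = np.diag(adj.sum(axis=1))
    laplacian = degree - adj
    _, eigvecs = np.linalg.eigh(laplacian)   # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                  # second-smallest eigenvalue's vector
    return fiedler < 0                       # boolean community labels

# Two triangles {0,1,2} and {3,4,5} connected only by the edge 2-3.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1

labels = spectral_communities(adj)
print(labels)  # nodes 0-2 land in one community, nodes 3-5 in the other
```

The sign of an eigenvector is arbitrary, so which clique gets `True` can vary; what is stable is that the two cliques receive opposite labels.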

    Rigid Transformations for Stabilized Lower Dimensional Space to Support Subsurface Uncertainty Quantification and Interpretation

    Subsurface datasets inherently possess big data characteristics such as vast volume, diverse features, and high sampling speeds, further compounded by the curse of dimensionality from various physical, engineering, and geological inputs. Among existing dimensionality reduction (DR) methods, nonlinear dimensionality reduction (NDR) methods, especially metric multidimensional scaling (MDS), are preferred for subsurface datasets due to their inherent complexity. While MDS retains intrinsic data structure and quantifies uncertainty, its limitations include solutions that are unique only up to Euclidean transformations (rotation, reflection, and translation) and the absence of an out-of-sample point (OOSP) extension. To enhance subsurface inferential and machine learning workflows, datasets must be transformed into stable, reduced-dimension representations that accommodate OOSP. Our solution employs rigid transformations to obtain a stabilized, Euclidean-invariant representation of the lower-dimensional space (LDS). By computing an MDS input dissimilarity matrix and applying rigid transformations across multiple realizations, we ensure transformation invariance and integrate OOSP. This process leverages a convex hull algorithm and incorporates a loss function and normalized stress for distortion quantification. We validate our approach with synthetic data, varying distance metrics, and real-world wells from the Duvernay Formation. Results confirm our method's efficacy in achieving consistent LDS representations. Furthermore, our proposed "stress ratio" (SR) metric provides insight into uncertainty, beneficial for model adjustments and inferential analysis. Consequently, our workflow promises enhanced repeatability and comparability in NDR for subsurface energy resource engineering and associated big data workflows.
    Comment: 30 pages, 17 figures; submitted to Computational Geosciences Journal
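The core alignment idea can be sketched compactly. An MDS embedding is only defined up to rotation, reflection, and translation, so two realizations of the same data disagree in pose; a rigid (orthogonal Procrustes) transformation aligns one realization onto a reference. This is only an illustration of that principle with synthetic point sets, not the authors' full workflow (no convex hull step, OOSP handling, or stress ratio).

```python
# Sketch: align one low-dimensional realization onto a reference using the
# orthogonal Procrustes solution (SVD), recovering a pose-stabilized LDS.
import numpy as np

def rigid_align(source, target):
    """Least-squares rotate/reflect/translate `source` to match `target`."""
    src_c = source - source.mean(axis=0)
    tgt_c = target - target.mean(axis=0)
    u, _, vt = np.linalg.svd(src_c.T @ tgt_c)
    rotation = u @ vt                          # optimal orthogonal transform
    return src_c @ rotation + target.mean(axis=0)

rng = np.random.default_rng(0)
reference = rng.normal(size=(50, 2))           # stand-in for one MDS realization
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
other = reference @ rot + np.array([3.0, -1.0])  # same shape, different pose

aligned = rigid_align(other, reference)
print(np.abs(aligned - reference).max())       # ~0: poses now coincide
```

Because `other` differs from `reference` only by a rigid motion, the Procrustes rotation recovers it exactly up to floating-point error; with noisy realizations the residual instead measures genuine shape distortion.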

    A primer on machine learning techniques for genomic applications

    High-throughput sequencing technologies have enabled the study of complex biological aspects at single-nucleotide resolution, opening the big data era. The analysis of large volumes of heterogeneous "omic" data, however, requires novel and efficient computational algorithms based on the paradigm of Artificial Intelligence. In the present review, we introduce and describe the most common machine learning methodologies, and more recently deep learning, applied to a variety of genomics tasks, trying to emphasize their capabilities, strengths and limitations in simple and intuitive language. We highlight the power of the machine learning approach in handling big data by means of a real-life example, and underline how the described methods could be relevant in all cases in which large amounts of multimodal genomic data are available.

    Predicting Urban Heat Island Mitigation with Random Forest Regression in Belgian Cities

    An abundance of impervious surfaces, such as building roofs, in densely populated cities makes green roofs a suitable solution for urban heat island (UHI) mitigation. We therefore employ random forest (RF) regression to predict the impact of green roofs on the surface UHI (SUHI) in Liege, Belgium. While several studies have identified the impact of green roofs on UHI, fewer utilize a remote-sensing-based approach to measure the impact on Land Surface Temperatures (LST), which are used to estimate SUHI. Moreover, the RF algorithm can provide useful insights. In this study, we use LST obtained from Landsat-8 imagery and relate it to 2D and 3D morphological parameters that influence LST and UHI effects. Additionally, we utilise parameters that influence wind (e.g., frontal area index). We simulate the green roofs by assigning suitable values of the normalised difference vegetation index and built-up index to buildings with flat roofs. Results suggest that green roofs decrease the average LST.
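The modeling setup described above can be sketched in a few lines. This is a hedged illustration on simulated data with hypothetical feature names; the actual study uses Landsat-8 LST and real 2D/3D morphology rasters, and the coefficients below are invented for the simulation only.

```python
# Sketch (synthetic data): random forest regression of LST on
# urban-morphology predictors, plus a simulated "green roof" scenario
# implemented by raising NDVI and comparing model predictions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
ndvi = rng.uniform(0.0, 0.8, n)           # vegetation index
built_up = rng.uniform(0.0, 1.0, n)       # built-up index
frontal_area = rng.uniform(0.0, 0.6, n)   # wind-related morphology parameter
# Invented ground truth: greener pixels are cooler, built-up pixels warmer.
lst = 30 - 8 * ndvi + 5 * built_up - 2 * frontal_area + rng.normal(0, 0.5, n)

X = np.column_stack([ndvi, built_up, frontal_area])
X_tr, X_te, y_tr, y_te = train_test_split(X, lst, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("test R^2:", model.score(X_te, y_te))

# Green-roof scenario: raise NDVI (capped at the observed maximum) and
# measure the model-predicted change in LST.
X_green = X_te.copy()
X_green[:, 0] = np.minimum(X_green[:, 0] + 0.3, 0.8)
delta = (model.predict(X_green) - model.predict(X_te)).mean()
print("mean predicted LST change:", delta)  # negative: greening cools
```

Keeping the perturbed NDVI within the training range matters: tree ensembles cannot extrapolate beyond the feature values they were fit on.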