
    Software Tools and Approaches for Compound Identification of LC-MS/MS Data in Metabolomics.

    The annotation of small molecules remains a major challenge in untargeted mass spectrometry-based metabolomics. Here we critically discuss structured elucidation approaches and software designed to help annotate unknown compounds. Only by elucidating unknown metabolites first is it possible to biologically interpret complex systems, to map compounds to pathways and to create reliable predictive metabolic models for translational and clinical research. These strategies include the construction and quality of tandem mass spectral databases, such as the coalition of MassBank repositories, and investigations of MS/MS matching confidence. We present in silico fragmentation tools such as MS-FINDER, CFM-ID, MetFrag, ChemDistiller and CSI:FingerID that can annotate compounds from existing structure databases and that have been used in the CASMI (critical assessment of small molecule identification) contests. Furthermore, the use of retention time models from liquid chromatography and the utility of collision cross-section modelling from ion mobility experiments are covered. Workflows and published examples of successfully annotated unknown compounds are included.
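
    To make the idea of MS/MS matching against a spectral database concrete, here is a minimal sketch of a greedy cosine score between two centroided spectra. It is an illustration only, not the scoring used by MassBank or any of the tools named above; the function name `cosine_match`, the peak-list layout and the 0.01 Da tolerance are assumptions.

```python
from math import sqrt


def cosine_match(query, reference, tol=0.01):
    """Greedy cosine score between two centroided MS/MS spectra.

    Each spectrum is a list of (mz, intensity) tuples. Two peaks may be
    paired when their m/z values differ by at most `tol` (an assumed
    absolute tolerance in Da). Returns a value between 0 and 1.
    """
    # Collect every candidate peak pair that falls within the tolerance.
    candidates = []
    for i, (mz_q, int_q) in enumerate(query):
        for j, (mz_r, int_r) in enumerate(reference):
            if abs(mz_q - mz_r) <= tol:
                candidates.append((int_q * int_r, i, j))

    # Take the strongest pairs first; each peak is used at most once.
    candidates.sort(reverse=True)
    used_q, used_r = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_q and j not in used_r:
            used_q.add(i)
            used_r.add(j)
            dot += product

    norm_q = sqrt(sum(inten * inten for _, inten in query))
    norm_r = sqrt(sum(inten * inten for _, inten in reference))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0  # an empty or all-zero spectrum cannot match anything
    return dot / (norm_q * norm_r)
```

    A score near 1 means the matched peaks carry almost all of the intensity in both spectra. Real workflows typically also weight intensities and account for precursor mass, which this sketch leaves out.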

    Exploring Machine Learning for Untargeted Metabolomics Using Molecular Fingerprints

    Background: Metabolomics, the study of substrates and products of cellular metabolism, offers valuable insights into an organism's state under specific conditions and has the potential to revolutionise preventive healthcare and pharmaceutical research. However, analysing large metabolomics datasets remains challenging, with available methods relying on limited and incompletely annotated metabolic pathways. Methods: This study, inspired by well-established methods in drug discovery, applies machine learning to metabolite fingerprints to explore how metabolite structure relates to responses under experimental conditions, beyond known pathways. It evaluates how effectively fingerprints represent metabolites, addressing challenges such as class imbalance, data sparsity, high dimensionality, duplicate structural encoding, and feature interpretability. Feature importance analysis is then applied to reveal the key chemical configurations affecting classification and to identify related metabolite groups. Results: The approach is tested on two datasets: one on Ataxia Telangiectasia and another on endothelial cells under low oxygen. Machine learning on molecular fingerprints predicts metabolite responses effectively, and feature importance analysis aligns with known metabolic pathways while revealing new affected metabolite groups for further study. Conclusion: The presented approach leverages the strengths of drug discovery to address critical issues in metabolomics research and aims to bridge the gap between these two disciplines. This work lays the foundation for future research in this direction, possibly exploring alternative structural encodings and machine learning models.
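
    As a small illustration of two of the fingerprint issues mentioned above, the sketch below computes Tanimoto similarity between binary fingerprints and groups metabolites whose fingerprints are identical. Reading "duplicate structural encoding" as distinct metabolites sharing one fingerprint is an assumption, as are the function names and the representation of a fingerprint as a set of on-bit indices. It is not the study's pipeline.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two binary fingerprints.

    Each fingerprint is an iterable of the indices of its "on" bits.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def find_duplicate_encodings(fingerprints):
    """Group metabolite names that share exactly the same fingerprint.

    `fingerprints` maps a metabolite name to its on-bit indices.
    Returns a list of sorted name groups, one per duplicated encoding.
    """
    groups = {}
    for name, bits in fingerprints.items():
        groups.setdefault(frozenset(bits), []).append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

    Metabolites in the same group are indistinguishable to any model trained on that fingerprint, so they would need to be merged or encoded differently before training.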

    A global role for KLF1 in erythropoiesis revealed by ChIP-seq in primary erythroid cells

    KLF1 regulates a diverse suite of genes to direct erythroid cell differentiation from bipotent progenitors. To determine the local cis-regulatory contexts and transcription factor networks in which KLF1 operates, we performed KLF1 ChIP-seq in the mouse. We found at least 945 sites in the genome of E14.5 fetal liver erythroid cells that are occupied by endogenous KLF1. Many of these recovered sites reside in erythroid gene promoters such as Hbb-b1, but the majority are distant from any known gene. Our data suggest that KLF1 directly regulates most aspects of terminal erythroid differentiation, including production of alpha- and beta-globin protein chains, heme biosynthesis, coordination of proliferation and anti-apoptotic pathways, and construction of the red cell membrane and cytoskeleton, by functioning primarily as a transcriptional activator. Additionally, we suggest new mechanisms for KLF1 cooperation with other transcription factors, in particular the erythroid transcription factor GATA1, to maintain homeostasis in the erythroid compartment.

    Blueprint: describing the complexity of metabolic regulation through the reconstruction of integrated metabolic and regulatory models

    Doctoral thesis in Biomedical Engineering. Genome-Scale Metabolic (GEM) models can predict the phenotypic behavior of organisms. However, these models can lead to incorrect predictions, as certain metabolic processes are controlled by regulatory mechanisms. Accordingly, many methodologies have been developed to extend the reconstruction and analysis of GEM models via the integration of Transcriptional Regulatory Networks (TRNs). Nevertheless, reconstructing integrated genome-scale regulatory and metabolic models for diverse prokaryotes is still an open challenge. In this work, we propose several tools to assist the reconstruction and analysis of regulatory and metabolic models. We start by describing Biological networks constraint-based In Silico Optimization (BioISO), a novel tool to assist the manual curation of GEM models. BioISO uses a recursive relation-like algorithm and Flux Balance Analysis (FBA) to evaluate and guide debugging of in silico phenotype predictions. Hence, this tool can reduce the number of artifacts in GEM models, decreasing the burden of model refinement and curation. In the second part of this work, a state-of-the-art repository of TRNs for prokaryotes was implemented to support the reconstruction and integration of TRNs into GEM models. The Prokaryotic Transcriptional Regulatory Network Database (ProTReND) comprises several tools to extract and process regulatory information available in external resources. More importantly, this repository contains a data integration system that unifies scattered regulatory data into standardized TRNs at the genome scale. In addition, ProTReND contains a web application with full access to the regulatory data. Finally, we have developed a new modeling framework to define, simulate and analyze GEnome-scale Regulatory and Metabolic (GERM) models in MEWpy. The GERM model framework can read a GEM model and/or a TRN from different file formats. This framework assembles a GERM model using the regulatory interactions and Genes-Proteins-Reactions (GPR) rules encoded in the GEM model and TRN. In addition, this modeling framework supports several methods of phenotype prediction designed specifically for regulatory-metabolic models.
    I would like to thank Fundação para a Ciência e Tecnologia for the Ph.D. studentship I was awarded (SFRH/BD/139198/2018).


    Thesis (Ph.D.) - Indiana University, Informatics and Computing, 2016. Prediction of unknown drug-target interactions from bioassay data is critical not only for understanding these interactions but also for developing new drugs and repurposing old ones. Conventional methods for predicting such interactions can be divided into 2D-based and 3D-based methods. 3D methods are more CPU-intensive and require more manual interpretation, whereas 2D methods, such as machine learning and similarity search on chemical fingerprints, are fast. One problem with using traditional machine learning to predict drug-target pairs is that it requires labeled examples of true and false interactions. A major difficulty for such supervised methods is the selection of negative samples: unknown drug-target interactions are regarded as false interactions, which may reduce the predictive accuracy of the model. Network-based methods have become an effective tool for predicting drug-target interactions because they avoid this negative sampling problem. In this dissertation, I will describe traditional machine learning methods and 3D pharmacophore modeling methods for drug-target prediction and will show how these methods work in a drug discovery scenario. I will then introduce a new framework for drug-target prediction, based on bipartite networks of drug-target relations, known as Random Walk with Restart (RWR). RWR integrates various networks, including drug-drug similarity networks, protein-protein similarity networks and drug-target interaction networks, into a heterogeneous network that is capable of predicting novel drug-target relations. I will describe how the chemical features used for measuring drug-drug similarity do not affect performance in predicting interactions, and further show the performance of RWR using an external dataset from the ChEMBL database. I will also describe further implementations of the RWR approach on multilayered networks consisting of biological data such as diseases, tissue-based gene expression data, protein complexes and metabolic pathways, used to predict associations between human diseases and metabolic pathways, which are crucial in drug discovery. Finally, I have developed netpredictor, a software package in R (standalone and web-based) for unipartite and bipartite networks, which implements network-based predictive algorithms and network properties for drug-target prediction. This package will be described.
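
    The general Random Walk with Restart iteration can be sketched as follows. This is a minimal illustration on a single unweighted, undirected graph, assuming a restart probability of 0.7 and an adjacency-list input; it is not the netpredictor implementation and does not reproduce the heterogeneous network described in the abstract. The function name and parameters are hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.7,
                             tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walk that restarts at `seeds`.

    `adjacency` maps every node to a list of its neighbours; each
    neighbour must itself be a key. At every step the walker jumps back
    to a seed with probability `restart`, otherwise it moves to a
    uniformly chosen neighbour of its current node.
    """
    nodes = list(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: 0.0 for n in nodes}
    for s in seeds:
        p0[s] = 1.0 / len(seeds)

    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its probability mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p
```

    Seeding the walk at a drug and ranking target nodes by their final probability gives a candidate list of drug-target relations; nodes closer to the seed in the network receive higher scores.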

    Gene Expression Response to Stony Coral Tissue Loss Disease Transmission in M. cavernosa and O. faveolata From Florida

    Since 2014, corals within Florida’s Coral Reef have been dying at an unprecedented rate due to stony coral tissue loss disease (SCTLD). Here we describe the transcriptomic outcomes of three different SCTLD transmission experiments performed at the Smithsonian Marine Station and Mote Marine Laboratory between 2019 and 2020 on the corals Orbicella faveolata and Montastraea cavernosa. Overall, diseased O. faveolata had 2194 differentially expressed genes (DEGs) compared with healthy colonies, whereas diseased M. cavernosa had 582 DEGs compared with healthy colonies. Many significant DEGs were implicated in immunity, extracellular matrix rearrangement, and apoptosis. These included, but were not limited to, peroxidases, collagens, Bax-like, fibrinogen-like, protein tyrosine kinase, and transforming growth factor beta. A gene module was identified that was significantly correlated with disease transmission. This module contained many apoptosis and immune genes with high module membership, indicating that a complex apoptosis and immune response occurs in corals during SCTLD transmission. Overall, we found that O. faveolata and M. cavernosa exhibit an immune, apoptosis, and tissue rearrangement response to SCTLD. We propose that future studies should focus on examining early time points of infection, before the presence of lesions, to understand the activating mechanisms involved in SCTLD.