67 research outputs found

    Development of Database Assisted Structure Identification (DASI) Methods for Nontargeted Metabolomics

    Metabolite structure identification remains a significant challenge in nontargeted metabolomics research. One commonly used strategy relies on searching biochemical databases using exact mass. However, this approach fails when the database does not contain the unknown metabolite (i.e., for unknown-unknowns). For these cases, constrained structure generation with combinatorial structure generators provides a potential option. Here we evaluated structure generation constraints based on the specification of: (1) substructures required (i.e., seed structures); (2) substructures not allowed; and (3) filters to remove incorrect structures. Our approach (database assisted structure identification, DASI) used predictive models in MolFind to find candidate structures with chemical and physical properties similar to the unknown. These candidates were then used for seed structure generation using eight different structure generation algorithms. One algorithm was able to generate correct seed structures for 21/39 test compounds. Eleven of these seed structures were large enough to constrain the combinatorial structure generator to fewer than 100,000 structures. In 35/39 cases, at least one algorithm was able to generate a correct seed structure. The DASI method has several limitations and will require further experimental validation and optimization. At present, it seems most useful for identifying the structure of unknown-unknowns with molecular weights <200 Da.
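The exact-mass database search mentioned in this abstract can be illustrated with a minimal sketch. This is not the DASI or MolFind implementation; the toy database, the function names, and the 5 ppm tolerance are assumptions chosen for illustration. The masses are standard monoisotopic values for the neutral molecules.

```python
# Illustrative only: a toy exact-mass lookup, not the authors' software.
# TOY_DATABASE, ppm_error, and search_by_exact_mass are hypothetical names.

TOY_DATABASE = {
    "glucose": 180.06339,   # C6H12O6
    "fructose": 180.06339,  # C6H12O6, an isomer with the same exact mass
    "caffeine": 194.08038,  # C8H10N4O2
    "alanine": 89.04768,    # C3H7NO2
}


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def search_by_exact_mass(measured_mass, database, tol_ppm=5.0):
    """Return (name, mass) pairs whose mass lies within tol_ppm of the measurement."""
    hits = [
        (name, mass)
        for name, mass in database.items()
        if abs(ppm_error(measured_mass, mass)) <= tol_ppm
    ]
    # Closest match first.
    return sorted(hits, key=lambda hit: abs(ppm_error(measured_mass, hit[1])))


if __name__ == "__main__":
    print(search_by_exact_mass(180.0634, TOY_DATABASE))
    print(search_by_exact_mass(250.0000, TOY_DATABASE))
```

The second query returns an empty list, which is the "unknown-unknown" situation the abstract describes: a compound absent from the database cannot be found by mass lookup. The first query returns two isomers, so exact mass alone does not settle the structure even when the compound is present.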

    Evaluating Patients' Perspective on Metoclopramide with Text Mining

    Pharmacovigilance attempts to detect, assess, understand, and prevent adverse effects or any other possible drug-related problems. All pharmacovigilance systems in existence today rely on voluntary reporting. The effectiveness of these systems is limited due to underreporting and reporting bias. A vast amount of patient-generated data on possible adverse effects can be found on health-related web forums and social media. It is possible to use these user-generated data to augment traditional pharmacovigilance systems. The purpose of this study is to examine the usefulness of such data found on health-related web forums by evaluating patients’ perspective on metoclopramide. Data were obtained from two popular health-related forums, Drugs.com and WebMD.com. Web scraping was used to obtain the necessary data in tabulated form. According to patients’ reports on Drugs.com, the most common uses of metoclopramide were for "migraine" and "nausea," while the least frequently reported use was for "GERD". The most frequently reported side effects included "anxiety" and "headache". Fatigue and akathisia were the adverse effects least frequently mentioned. According to patient perspectives from WebMD.com, the most reported indications for metoclopramide were "nausea" and "vomit", while the least reported indication was "migraine". According to WebMD data, the most frequently reported adverse effects of metoclopramide were "spasm," "cough," "bloat" and "drowsiness", while the least reported adverse effect was "Parkinson’s".

    Evaluating Patients' Perspective on Metoclopramide with Text Mining

    Pharmacovigilance attempts to detect, assess, understand, and prevent adverse effects or any other possible drug-related problems. All pharmacovigilance systems in existence today rely on voluntary reporting. The effectiveness of these systems is limited due to underreporting and reporting bias. A vast amount of patient-generated data on possible adverse effects can be found on health-related web forums and social media. It is possible to use these user-generated data to augment traditional pharmacovigilance systems. The purpose of this study is to examine the usefulness of such data found on health-related web forums by evaluating patients’ perspective on metoclopramide. Data were obtained from two popular health-related forums, Drugs.com and WebMD.com. Web scraping was used to obtain the necessary data in tabulated form. This dataset is archived at DANS/EASY.

    Chemical Structure Identification in Metabolomics: Computational Modeling of Experimental Features

    The identification of compounds in complex mixtures remains challenging despite recent advances in analytical techniques. At present, no single method can detect and quantify the vast array of compounds that might be of potential interest in metabolomics studies. High performance liquid chromatography/mass spectrometry (HPLC/MS) is often considered the analytical method of choice for analysis of biofluids. The positive identification of an unknown involves matching at least two orthogonal HPLC/MS measurements (exact mass, retention index, drift time, etc.) against an authentic standard. However, due to the limited availability of authentic standards, an alternative approach involves matching known and measured features of the unknown compound with computationally predicted features for a set of candidate compounds downloaded from a chemical database. Computationally predicted features include retention index, ECOM50 (energy required to decompose 50% of a selected precursor ion in a collision induced dissociation cell), drift time, whether the unknown compound is biological or synthetic, and a collision induced dissociation (CID) spectrum. Computational predictions are used to filter the initial “bin” of candidate compounds. The final output is a ranked list of candidates that best match the known and measured features. In this mini review, we discuss cheminformatics methods underlying this database search-filter identification approach.
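The search-filter idea in this abstract (discard candidates whose predicted features disagree with the measurement, then rank the survivors) can be sketched as follows. This is a simplified illustration, not the method reviewed in the paper: the `Candidate` class, the tolerance windows, the additive score, and all numeric values are hypothetical, and only two of the listed features (retention index and drift time) are used.

```python
# Illustrative only: filter-then-rank over predicted features.
# All names, windows, and values here are hypothetical assumptions.
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    retention_index: float  # computationally predicted value
    drift_time: float       # computationally predicted value


def filter_and_rank(candidates, measured_ri, measured_dt, ri_window, dt_window):
    """Keep candidates inside both tolerance windows, best match first."""
    scored = []
    for c in candidates:
        # Deviation expressed as a fraction of the allowed window.
        ri_dev = abs(c.retention_index - measured_ri) / ri_window
        dt_dev = abs(c.drift_time - measured_dt) / dt_window
        if ri_dev <= 1.0 and dt_dev <= 1.0:
            scored.append((ri_dev + dt_dev, c.name))
    return [name for _, name in sorted(scored)]


if __name__ == "__main__":
    candidate_bin = [
        Candidate("A", 500.0, 20.0),
        Candidate("B", 520.0, 21.0),
        Candidate("C", 700.0, 20.0),
    ]
    print(filter_and_rank(candidate_bin, 505.0, 20.2, ri_window=50.0, dt_window=2.0))
```

Candidate C is removed by the retention index filter, and A ranks ahead of B because its predicted features sit closer to the measured ones. A real workflow would presumably weight each feature by the accuracy of its predictive model rather than summing them equally.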

    In Silico Enzymatic Synthesis of a 400 000 Compound Biochemical Database for Nontargeted Metabolomics

    Current methods of structure identification in mass-spectrometry-based nontargeted metabolomics rely on matching experimentally determined features of an unknown compound to those of candidate compounds contained in biochemical databases. A major limitation of this approach is the relatively small number of compounds currently included in these databases. If the correct structure is not present in a database, it cannot be identified, and if it cannot be identified, it cannot be included in a database. Thus, there is an urgent need to augment metabolomics databases with rationally designed biochemical structures using alternative means. Here we present the In Vivo/In Silico Metabolites Database (IIMDB), a database of in silico enzymatically synthesized metabolites, to partially address this problem. The database, which is available at http://metabolomics.pharm.uconn.edu/iimdb/, includes ∼23 000 known compounds (mammalian metabolites, drugs, secondary plant metabolites, and glycerophospholipids) collected from existing biochemical databases plus more than 400 000 computationally generated human phase-I and phase-II metabolites of these known compounds. IIMDB features a user-friendly web interface and a programmer-friendly RESTful web service. Ninety-five percent of the computationally generated metabolites in IIMDB were not found in any existing database. However, 21 640 were identical to compounds already listed in PubChem, HMDB, KEGG, or HumanCyc. Furthermore, the vast majority of these in silico metabolites were scored as biological using BioSM, a software program that identifies biochemical structures in chemical structure space. These results suggest that in silico biochemical synthesis represents a viable approach for significantly augmenting biochemical databases for nontargeted metabolomics applications.