25 research outputs found

    Computational methods for breath metabolomics in clinical diagnostics

    Human odors and vapors have long been known for their diagnostic power. The analysis of the metabolic composition of human breath and odors therefore creates the opportunity for a non-invasive tool for clinical diagnostics. Innovative analytical technologies to capture the metabolic profile of a patient's breath are available, for instance ion mobility spectrometry coupled to a multi-capillary column. However, we lack automated systems to process, analyse and evaluate large clinical studies of human exhaled air. To fill this gap, a number of computational challenges need to be addressed. For instance, breath studies generate large amounts of heterogeneous data that require automated preprocessing, peak detection and identification as a basis for sophisticated follow-up analysis. In addition, generalizable statistical evaluation frameworks for the detection of breath biomarker profiles that are robust enough to be employed in routine clinical practice are necessary, in particular since breath metabolomics, like other clinical diagnostics technologies, is susceptible to specific confounding factors and background noise. Moreover, specific manifestations of disease stages and progression may strongly influence the breathomics profiles. This thesis addresses these challenges to move towards more automation and generalization in clinical breath research. In particular, I present methods to support the search for biomarker profiles that enable non-invasive detection of diseases, treatment optimization and prognosis, providing a powerful new tool for precision medicine.
    It has long been known that body odor and breath can provide clues to a person's state of health. An analysis of exhaled air at the molecular level therefore promises new approaches to diagnosing specific diseases. Innovative technologies such as ion mobility spectrometry combined with a multi-capillary column make it possible, for the first time, to generate high-resolution metabolic profiles of exhaled air within a very short time. At present, however, the computational applications needed to organize and evaluate the generated data automatically are lacking. A particular challenge is the large volume of heterogeneous clinical and analytical data and its processing. Like other high-throughput methods, breath analysis is affected by background signals such as ambient air and by other factors that distort the results, for example diet, lifestyle or medication. This calls for modern statistical and machine learning methods to identify robust and generalizable disease markers. Particular attention is also paid to diseases whose metabolic fingerprint can change drastically over the course of the disease. The aim of my work is to find solutions to the problems described and thereby to support the search for practicable disease markers with bioinformatics methods. In the course of several studies and software projects, fundamental methodologies were presented, evaluated and established, in particular with regard to the development of computational systems for the automatic analysis of breath data. The methods presented lay the foundation for the non-invasive detection of diseases, for the optimization and prognosis of treatments and, beyond that, for a further tool for personalized medicine.

    An integrative clinical database and diagnostics platform for biomarker identification and analysis in ion mobility spectra of human exhaled air

    Over the last decade the evaluation of odors and vapors in human breath has gained more and more attention, particularly in the diagnostics of pulmonary diseases. Ion mobility spectrometry coupled with multi-capillary columns (MCC/IMS) is a well-known technology for detecting volatile organic compounds (VOCs) in air. It is a comparatively inexpensive, non-invasive, high-throughput method, which is able to handle the moisture that comes with human exhaled air and allows VOCs to be characterized at very low concentrations. To identify discriminating compounds as biomarkers, it is necessary to have a clear understanding of the detailed composition of human breath. Therefore, in addition to the clinical studies, there is a need for a flexible and comprehensive centralized data repository that is capable of gathering all kinds of related information. Moreover, there is a demand for automated data integration and semi-automated data analysis, in particular with regard to the rapid data accumulation emerging from the high-throughput nature of the MCC/IMS technology. Here, we present a comprehensive database application and analysis platform, which combines metabolic maps with heterogeneous biomedical data in a well-structured manner. The design of the database is based on a hybrid of the entity-attribute-value (EAV) model and the EAV-CR, which incorporates the concepts of classes and relationships. Additionally, it offers an intuitive user interface that provides easy and quick access to the platform's functionality: automated data integration and integrity validation, versioning and roll-back strategy, data retrieval, as well as semi-automatic data mining and machine learning capabilities. The platform will support MCC/IMS-based biomarker identification and validation. The software, schemata, data sets and further information are publicly available at http://imsdb.mpi-inf.mpg.de
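As an illustration of the entity-attribute-value (EAV) idea mentioned in this abstract, the following is a minimal sketch using Python's built-in `sqlite3` module. It is not the platform's schema: the table names (`entity`, `attribute`, `eav_value`), the column names and the helper functions are all hypothetical, and the `entity_class` column only gestures at the class concept of EAV-CR, while relationships are omitted entirely.

```python
import sqlite3


def build_eav_store():
    """Create an in-memory database with a hypothetical three-table EAV layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entity (
            id INTEGER PRIMARY KEY,
            entity_class TEXT NOT NULL          -- e.g. 'patient' or 'measurement'
        );
        CREATE TABLE attribute (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        CREATE TABLE eav_value (
            entity_id INTEGER NOT NULL REFERENCES entity(id),
            attribute_id INTEGER NOT NULL REFERENCES attribute(id),
            val TEXT NOT NULL,                  -- all values stored as text in this sketch
            PRIMARY KEY (entity_id, attribute_id)
        );
        """
    )
    return conn


def add_entity(conn, entity_class):
    """Insert a new entity of the given class and return its id."""
    cur = conn.execute("INSERT INTO entity(entity_class) VALUES (?)", (entity_class,))
    return cur.lastrowid


def set_value(conn, entity_id, name, value):
    """Attach (or overwrite) one attribute value; new attributes need no schema change."""
    conn.execute("INSERT OR IGNORE INTO attribute(name) VALUES (?)", (name,))
    attr_id = conn.execute(
        "SELECT id FROM attribute WHERE name = ?", (name,)
    ).fetchone()[0]
    conn.execute(
        "INSERT OR REPLACE INTO eav_value(entity_id, attribute_id, val) VALUES (?, ?, ?)",
        (entity_id, attr_id, str(value)),
    )


def get_record(conn, entity_id):
    """Pivot the stored rows of one entity back into a dictionary."""
    rows = conn.execute(
        "SELECT a.name, v.val FROM eav_value v "
        "JOIN attribute a ON a.id = v.attribute_id WHERE v.entity_id = ?",
        (entity_id,),
    ).fetchall()
    return dict(rows)
```

The point of the layout is that heterogeneous clinical attributes can be added as rows rather than as new columns, at the cost of having to pivot the rows back into records when querying.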

    Computational methods for metabolomic data analysis of ion mobility spectrometry data-reviewing the state of the art

    Ion mobility spectrometry combined with multi-capillary columns (MCC/IMS) is a well-known technology for detecting volatile organic compounds (VOCs). We may utilize MCC/IMS for scanning human exhaled air, bacterial colonies or cell lines, for example. We thereby gain information about human health status or infection threats. We may further study the metabolic response of living cells to external perturbations. The instrument is comparatively inexpensive, robust and easy to use in everyday practice. However, the potential of the MCC/IMS methodology depends on the successful application of computational approaches for analyzing the large number of emerging data sets. Here, we review the state of the art and highlight existing challenges. First, we address methods for raw data handling, data storage and visualization. Afterwards, we introduce de-noising, peak picking and other pre-processing approaches. We discuss statistical methods for analyzing correlations between peaks and diseases or medical treatment. Finally, we study up-to-date machine learning techniques for identifying robust biomarker molecules that allow classifying patients into healthy and diseased groups. We conclude that MCC/IMS coupled with sophisticated computational methods has the potential to successfully address a broad range of biomedical questions. While most of the data pre-processing steps can be solved satisfactorily, some computational challenges with statistical learning and model validation remain.

    Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research

    SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2.
    Affiliation: Hufsky, Franziska. Friedrich Schiller University Jena; Germany. Affiliation: Lamkiewicz, Kevin. Friedrich Schiller University Jena; Germany. Affiliation: Almeida, Alexandre. the Wellcome Sanger Institute; United Kingdom. Affiliation: Aouacheria, Abdel. Centre National de la Recherche Scientifique; France. Affiliation: Arighi, Cecilia. Biocuration and Literature Access at PIR; United States. Affiliation: Bateman, Alex. European Bioinformatics Institute. Head of Protein Sequence Resources; United Kingdom. Affiliation: Baumbach, Jan. Universitat Technical Zu Munich; Germany. Affiliation: Beerenwinkel, Niko. Universitat Technical Zu Munich; Germany. Affiliation: Brandt, Christian. Jena University Hospital; Germany. Affiliation: Cacciabue, Marco Polo Domingo. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación En Ciencias Veterinarias y Agronómicas. Instituto de Agrobiotecnología y Biología Molecular. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Agrobiotecnología y Biología Molecular; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Affiliation: Chuguransky, Sara Rocío. European Bioinformatics Institute; United Kingdom. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Affiliation: Drechsel, Oliver. Robert Koch-Institute; Germany. Affiliation: Finn, Robert D. Biocurator for Pfam and InterPro databases; United Kingdom. Affiliation: Fritz, Adrian. Helmholtz Centre for Infection Research; Germany. Affiliation: Fuchs, Stephan. Robert Koch-Institute; Germany. Affiliation: Hattab, Georges. University Marburg; Germany. Affiliation: Hauschild, Anne Christin. University Marburg; Germany. Affiliation: Heider, Dominik. University Marburg; Germany. Affiliation: Hoffmann, Marie. Freie Universität Berlin; Germany. Affiliation: Hölzer, Martin. Friedrich Schiller University Jena; Germany. Affiliation: Hoops, Stefan. University of Virginia; United States. Affiliation: Kaderali, Lars. University Medicine Greifswald; Germany. Affiliation: Kalvari, Ioanna. European Bioinformatics Institute; United Kingdom. Affiliation: von Kleist, Max. Robert Koch-Institute; Germany. Affiliation: Kmiecinski, Renó. Robert Koch-Institute; Germany. Affiliation: Kühnert, Denise. Max Planck Institute for the Science of Human History; Germany. Affiliation: Lasso, Gorka. Albert Einstein College of Medicine; United States. Affiliation: Libin, Pieter. Hasselt University; Belgium. Affiliation: List, Markus. Universitat Technical Zu Munich; Germany. Affiliation: Löchel, Hannah F. University Marburg; Alemani

    Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research

    SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories. Peer Reviewe

    Carotta: revealing hidden confounder markers in metabolic breath profiles

    Computational breath analysis is a growing research area that aims to identify volatile organic compounds (VOCs) in human breath to assist the next generation of medical diagnostics. Inexpensive and non-invasive bioanalytical technologies for metabolite detection in exhaled air and bacterial/fungal vapor exist, and the first studies on the power of supervised machine learning methods for profiling the resulting data have been conducted, but we lack methods to extract hidden data features emerging from confounding factors. Here, we present Carotta, a new cluster analysis framework dedicated to uncovering such hidden substructures with unsupervised statistical learning methods. We study the power of transitivity clustering and hierarchical clustering to identify groups of VOCs with similar expression behavior over most patient breath samples and/or groups of patients with a similar VOC intensity pattern. This enables the discovery of dependencies between metabolites. On the one hand, this allows us to eliminate the effect of potential confounding factors hindering disease classification, such as smoking. On the other hand, we may also identify VOCs associated with disease subtypes or concomitant diseases. Carotta is open-source software with an intuitive graphical user interface supporting data handling, analysis and visualization. The back-end is designed to be modular, allowing for easy extension with plugins in the future, such as new clustering methods and statistics. It does not require much prior knowledge or technical skill to operate. We demonstrate its power and applicability by means of one artificial dataset. We also apply Carotta to a real-world example dataset on chronic obstructive pulmonary disease (COPD). While the artificial data serve as a proof of concept, we demonstrate how Carotta finds candidate markers in our real dataset associated with confounders rather than the primary disease (COPD) and bronchial carcinoma (BC). Carotta is publicly available at http://carotta.compbio.sdu.dk
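To make the idea of grouping VOCs with similar intensity behavior across samples more concrete, here is a minimal sketch of agglomerative (hierarchical) clustering in plain Python. It is not Carotta's implementation and does not cover transitivity clustering. The distance measure (1 minus the Pearson correlation), the average-linkage rule, the `max_distance` cut-off and the function names are all assumptions chosen for illustration, and the sketch assumes no peak has a constant intensity across samples.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long intensity vectors (non-constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cluster_vocs(profiles, max_distance=0.3):
    """Group peaks whose intensities rise and fall together across samples.

    profiles: dict mapping a peak name to its intensities, one per breath sample.
    Uses average-linkage agglomerative clustering on the distance 1 - r and stops
    merging once the closest pair of clusters is further apart than max_distance.
    """
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return mean(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[i], clusters[j]) > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return [sorted(c) for c in clusters]
```

Peaks that end up in the same group vary together over the samples, which is the kind of dependency that could point to a shared confounder such as smoking rather than to the disease itself.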

    On the importance of statistics in breath analysis--hope or curse?

    No full text
    As we saw at the 2013 Breath Analysis Summit, breath analysis is a rapidly evolving field. Increasingly sophisticated technology is producing huge amounts of complex data. A major barrier now faced by the breath research community is the analysis of these data. Emerging breath data require sophisticated, modern statistical methods to allow for a careful and robust deduction of real-world conclusions.

    Two different approaches for pharmacokinetic modeling of exhaled drug concentrations

    Online measurement of drug concentrations in a patient's breath is a promising approach for individualized dosage. A direct conversion from breath concentrations to blood concentrations is not possible. In non-steady-state situations, measured exhaled concentrations follow the blood concentration with a delay. Therefore, it is necessary to integrate the breath concentration into a pharmacological model. Two different approaches for pharmacokinetic modelling are presented. Usually a 3-compartment model is used for pharmacokinetic calculations of blood concentrations. In the first approach, this 3-compartment model is extended with a 2-compartment model based on the first compartment of the 3-compartment model and a new lung compartment. The second approach is to calculate a time delay of changes in the concentration of the first compartment to describe the lung concentration. Both approaches are applied, as an example, to the modelling of exhaled propofol. Based on a time series of exhaled propofol measurements taken with an ion mobility spectrometer every minute for 346 min, a correlation of the calculated plasma concentration and the breath concentration was used for modelling and yielded interdependencies with R² = 0.99. With the time delay modelling approach, the new compartment coefficient ke0lung was calculated as ke0lung = 0.27 min⁻¹ with R² = 0.96. The described models are not limited to propofol. They could be used for any drug that is measurable in a patient's breath.
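The time delay approach described in this abstract can be sketched as a first-order lag between the first (central) compartment and a lung compartment. The abstract does not give the equation, so the form used here, dC_lung/dt = ke0lung · (C1 − C_lung), is an assumption borrowed from the standard effect-site convention in pharmacokinetics; only the value ke0lung = 0.27 min⁻¹ comes from the abstract. The function names and the choice to hold C1 constant within each sampling step are likewise illustrative.

```python
import math

KE0_LUNG = 0.27  # 1/min, the coefficient reported in the abstract


def lung_concentration(c1_series, dt_min, ke0=KE0_LUNG, c_lung0=0.0):
    """Lag a central-compartment concentration series with a first-order delay.

    Assumes dC_lung/dt = ke0 * (C1 - C_lung) and that C1 is constant over each
    step of length dt_min, so each step can be solved exactly. Element i of the
    result is the lung concentration at the end of step i.
    """
    decay = math.exp(-ke0 * dt_min)
    c_lung = c_lung0
    lagged = []
    for c1 in c1_series:
        # Exact solution of the linear ODE over one step with constant input c1.
        c_lung = c1 + (c_lung - c1) * decay
        lagged.append(c_lung)
    return lagged


def half_equilibration_time(ke0=KE0_LUNG):
    """Time in minutes for the lung compartment to close half the gap to C1."""
    return math.log(2) / ke0
```

Under this assumed form, a coefficient of 0.27 min⁻¹ corresponds to a half-equilibration time of roughly 2.6 minutes, which gives a feel for how far the breath signal trails a step change in the central compartment.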

    An Integrative Clinical Database and Diagnostics Platform for Biomarker Identification and Analysis in Ion Mobility Spectra of Human Exhaled Air

    No full text
    Over the last decade the evaluation of odors and vapors in human breath has gained more and more attention, particularly in the diagnostics of pulmonary diseases. Ion mobility spectrometry coupled with multi-capillary columns (MCC/IMS) is a well-known technology for detecting volatile organic compounds (VOCs) in air. It is a comparatively inexpensive, non-invasive, high-throughput method, which is able to handle the moisture that comes with human exhaled air and allows VOCs to be characterized at very low concentrations. To identify discriminating compounds as biomarkers, it is necessary to have a clear understanding of the detailed composition of human breath. Therefore, in addition to the clinical studies, there is a need for a flexible and comprehensive centralized data repository that is capable of gathering all kinds of related information. Moreover, there is a demand for automated data integration and semi-automated data analysis, in particular with regard to the rapid data accumulation emerging from the high-throughput nature of the MCC/IMS technology. Here, we present a comprehensive database application and analysis platform, which combines metabolic maps with heterogeneous biomedical data in a well-structured manner. The design of the database is based on a hybrid of the entity-attribute-value (EAV) model and the EAV-CR, which incorporates the concepts of classes and relationships. Additionally, it offers an intuitive user interface that provides easy and quick access to the platform's functionality: automated data integration and integrity validation, versioning and roll-back strategy, data retrieval, as well as semi-automatic data mining and machine learning capabilities. The platform will support MCC/IMS-based biomarker identification and validation. The software, schemata, data sets and further information are publicly available at http://imsdb.mpi-inf.mpg.de